# Task 11 Summary: Vérificateur Implementation

## Overview

Successfully implemented the **Vérificateur** class, which provides independent verification of coding proposals from the Codeur. The Vérificateur acts as a second opinion to detect DIM-sensitive errors and ensure coding quality.

## Tasks Completed

### Task 11.1: Create the Verificateur class with a different prompt ✅

- Created `src/pipeline_mco_pmsi/verifiers/verificateur.py`
- Implemented the `Verificateur` class with a **different prompt version** from the Codeur
  - Codeur uses: `"codeur-1.0.0"` (implied from tests)
  - Vérificateur uses: `"verificateur-1.0.0"` (different, as required)
- Added validation to **reject proposals** if the same prompt version is used
- **Exigence 4.1**: ✅ Verified - the Vérificateur uses a different prompt

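
The runtime check described in Task 11.1 can be pictured with a minimal sketch. The function name `ensure_independent_prompt` and the constant `VERIFICATEUR_PROMPT_VERSION` are hypothetical and only illustrate the idea; the actual check lives inside `Verificateur` and may be structured differently.

```python
# Prompt version reported for the Vérificateur in this summary.
VERIFICATEUR_PROMPT_VERSION = "verificateur-1.0.0"


def ensure_independent_prompt(
    codeur_prompt_version: str,
    verificateur_prompt_version: str = VERIFICATEUR_PROMPT_VERSION,
) -> None:
    """Raise ValueError if both agents would use the same prompt version.

    Hypothetical helper: the name and signature are assumptions.
    """
    if codeur_prompt_version == verificateur_prompt_version:
        raise ValueError(
            "Vérificateur must use a different prompt version than the Codeur "
            f"(both are {codeur_prompt_version!r})"
        )


# A proposal produced with the Codeur's prompt version passes the check.
ensure_independent_prompt("codeur-1.0.0")
```

Comparing version strings is the simplest way to enforce independence; it catches accidental reuse of a prompt but does not compare prompt contents.
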
### Task 11.2: Implement `detect_dim_errors()` ✅

Implemented DIM error detection for five error types:

1. **Negated diagnoses coded as affirmed** (Exigence 4.2, 19.1)
   - Detects when negated facts are coded as affirmed
   - Severity: `bloquant`
   - Error type: `negated_as_affirmed`

2. **CCAM acts without explicit evidence** (Exigence 4.3, 19.3)
   - Detects CCAM codes without explicit act evidence
   - Checks that the evidence comes from a fact of type "acte"
   - Severity: `bloquant`
   - Error type: `act_without_evidence`

3. **Medical history coded as the current episode** (Exigence 4.4, 19.4)
   - Detects medical history coded as the current episode (DP)
   - Checks the temporality field
   - Severity: `bloquant`
   - Error type: `history_as_current`

4. **DP/DAS inversions** (Exigence 19.5)
   - Detects when a DAS has higher confidence than the DP
   - Applies a margin of 0.1
   - Severity: `a_revoir` (non-blocking)
   - Error type: `dp_das_inversion`

5. **Suspicion turned into certainty** (Exigence 19.2)
   - Detects suspected diagnoses coded as a certain DP
   - Checks the certainty qualifier
   - Severity: `bloquant`
   - Error type: `suspected_as_certain`

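
As an illustration of the four blocking checks listed above, here is a minimal sketch. The `Fact` and `ProposedCode` classes, their field names, and the values `"suspected"` and `"history"` are assumptions made for this example; the real `ClinicalFact` and code models are not shown in this summary and may differ.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

EvidenceKey = Tuple[str, int, int]  # (document_id, start, end)


@dataclass
class Fact:
    """Hypothetical stand-in for ClinicalFact."""
    fact_type: str                 # e.g. "diagnostic" or "acte"
    negated: bool = False
    certainty: str = "certain"     # assumed values: "certain" / "suspected"
    temporality: str = "current"   # assumed values: "current" / "history"


@dataclass
class ProposedCode:
    """Hypothetical stand-in for a code in the Codeur's proposal."""
    code: str
    role: str                      # "DP", "DAS" or "CCAM"
    evidence: EvidenceKey


def detect_blocking_errors(codes: List[ProposedCode],
                           facts: Dict[EvidenceKey, Fact]) -> List[dict]:
    """Return DIMError-shaped dicts for the blocking error types."""
    errors: List[dict] = []

    def blocking(error_type: str, code: ProposedCode) -> None:
        errors.append({
            "error_type": error_type,
            "message": f"{code.role} {code.code}: {error_type}",
            "affected_codes": [code.code],
            "severity": "bloquant",
        })

    for code in codes:
        fact = facts.get(code.evidence)
        if code.role == "CCAM":
            # A CCAM code needs evidence backed by a fact of type "acte".
            if fact is None or fact.fact_type != "acte":
                blocking("act_without_evidence", code)
            continue
        if fact is None:
            continue  # unmatched diagnosis evidence is out of scope here
        if fact.negated:
            blocking("negated_as_affirmed", code)
        if code.role == "DP" and fact.temporality == "history":
            blocking("history_as_current", code)
        if code.role == "DP" and fact.certainty == "suspected":
            blocking("suspected_as_certain", code)
    return errors
```

The non-blocking `dp_das_inversion` check is left out of this sketch because it compares confidences across codes rather than inspecting a single fact.
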
### Task 11.3: Implement the veto and flagging logic ✅

Implemented decision logic with three outcomes:

1. **VETO** (Exigence 4.5)
   - Generated when blocking errors are detected
   - Requires TIM arbitration
   - Blocks automatic validation

2. **REVIEW** (Exigence 4.6)
   - Generated for non-blocking errors or contradictions
   - Marks codes as "à_revoir" (to be reviewed)
   - Recommends TIM verification

3. **ACCEPT** (Exigence 4.8)
   - Generated when no errors are detected
   - Confirms the Codeur's proposal
   - Allows validation

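
The three outcomes above amount to a precedence rule, sketched here under stated assumptions. The function name `decide` is hypothetical, and treating `info`-level errors as compatible with an accept decision is an assumption, since the summary only describes `info` as "informational only".

```python
from typing import Iterable, List


def decide(severities: Iterable[str], contradictions: List[str]) -> str:
    """Map error severities and contradictions to accept / veto / review.

    Hypothetical helper illustrating the precedence veto > review > accept.
    """
    severities = list(severities)
    if "bloquant" in severities:
        # Any blocking error forces a veto and TIM arbitration.
        return "veto"
    if "a_revoir" in severities or contradictions:
        # Non-blocking errors or contradictions request a TIM review.
        return "review"
    # Assumption: info-only findings do not prevent acceptance.
    return "accept"
```

Checking for blocking errors first means a single `bloquant` error wins over any number of `a_revoir` findings.
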
**Alternatives Generation** (Exigence 4.6):

- Suggests alternative codes when the DP has errors
- Uses the highest-confidence DAS as the alternative DP
- Provides reasoning for alternatives

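
A minimal sketch of this alternative-generation rule follows. The `CodeCandidate` and `Alternative` classes and the function `suggest_alternative_dp` are hypothetical names introduced for illustration; the wording of the generated reasoning is also invented.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CodeCandidate:
    """Hypothetical stand-in for a DAS code with its confidence."""
    code: str
    confidence: float


@dataclass
class Alternative:
    """Hypothetical container for a suggested alternative DP."""
    code: str
    reasoning: str


def suggest_alternative_dp(dp_has_error: bool,
                           das_codes: List[CodeCandidate]) -> Optional[Alternative]:
    """Suggest the highest-confidence DAS as DP when the DP has errors."""
    if not dp_has_error or not das_codes:
        return None
    best = max(das_codes, key=lambda c: c.confidence)
    return Alternative(
        code=best.code,
        reasoning=(
            f"DP has errors; {best.code} is the DAS with the highest "
            f"confidence ({best.confidence:.2f})"
        ),
    )
```

Returning `None` when the DP is clean or no DAS exists keeps the caller from receiving an alternative that has nothing to justify it.
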
## Implementation Details

### Key Features

1. **Independent Analysis**
   - Uses a different prompt version than the Codeur
   - Validates the prompt difference at runtime
   - Raises `ValueError` if the same prompt version is used

2. **Evidence Matching**
   - Creates an index of facts keyed by evidence (document_id, span)
   - Matches code evidence to source facts
   - Detects mismatches and contradictions

3. **Comprehensive Reasoning**
   - Generates detailed reasoning for all decisions
   - Includes error details and affected codes
   - Provides actionable recommendations

4. **Error Severity Levels**
   - `bloquant`: Blocks validation, requires veto
   - `a_revoir`: Non-blocking, requires review
   - `info`: Informational only

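
The evidence-matching feature can be sketched as a dictionary keyed by evidence location. The `Evidence` and `Fact` types and both function names are hypothetical; they assume a span is represented as a `(start, end)` pair of offsets.

```python
from typing import Dict, Iterable, List, NamedTuple


class Evidence(NamedTuple):
    """Location of a supporting passage (assumed representation)."""
    document_id: str
    start: int
    end: int


class Fact(NamedTuple):
    """Hypothetical stand-in for ClinicalFact."""
    evidence: Evidence
    text: str


def build_fact_index(facts: Iterable[Fact]) -> Dict[Evidence, List[Fact]]:
    """Group facts by evidence so each lookup is a single dict access."""
    index: Dict[Evidence, List[Fact]] = {}
    for fact in facts:
        index.setdefault(fact.evidence, []).append(fact)
    return index


def find_orphaned_evidence(code_evidence: Iterable[Evidence],
                           index: Dict[Evidence, List[Fact]]) -> List[Evidence]:
    """Return evidence cited by codes that matches no extracted fact."""
    return [ev for ev in code_evidence if ev not in index]
```

Because the key is the exact `(document_id, start, end)` triple, evidence whose offsets differ even slightly from a fact's span will not match; overlap-based matching would need a different structure.
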
### Data Structures

**DIMError**:

```python
{
    "error_type": str,            # Type of error
    "message": str,               # Detailed message
    "affected_codes": List[str],  # Codes with errors
    "severity": str               # bloquant/a_revoir/info
}
```

**VerificationResult**:

```python
{
    "stay_id": str,
    "decision": str,              # accept/veto/review
    "dim_errors": List[DIMError],
    "contradictions": List[str],
    "alternatives": List[Code],
    "reasoning": str,
    "model_version": ModelVersion,
    "prompt_version": str         # MUST differ from Codeur
}
```

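
For readers who prefer typed definitions, the two schemas could be expressed as dataclasses along the following lines. This is only a sketch: the summary does not say whether the project uses dataclasses, Pydantic models or plain dictionaries, and `alternatives` and `model_version` are simplified to strings here because `Code` and `ModelVersion` are not defined in this document.

```python
from dataclasses import dataclass, field
from typing import List

SEVERITIES = {"bloquant", "a_revoir", "info"}
DECISIONS = {"accept", "veto", "review"}


@dataclass
class DIMError:
    error_type: str
    message: str
    affected_codes: List[str]
    severity: str

    def __post_init__(self) -> None:
        # Reject severities outside the three documented levels.
        if self.severity not in SEVERITIES:
            raise ValueError(f"unknown severity: {self.severity!r}")


@dataclass
class VerificationResult:
    stay_id: str
    decision: str
    reasoning: str
    prompt_version: str                      # must differ from the Codeur's
    model_version: str = ""                  # simplification of ModelVersion
    dim_errors: List[DIMError] = field(default_factory=list)
    contradictions: List[str] = field(default_factory=list)
    alternatives: List[str] = field(default_factory=list)  # simplification of List[Code]

    def __post_init__(self) -> None:
        # Reject decisions outside accept / veto / review.
        if self.decision not in DECISIONS:
            raise ValueError(f"unknown decision: {self.decision!r}")
```

Validating `severity` and `decision` at construction time surfaces typos such as `"à_revoir"` early instead of letting them reach the decision logic.
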
## Test Coverage

Created a comprehensive test suite with **16 tests**:

### Initialization Tests (2)

- ✅ `test_verificateur_initialization`
- ✅ `test_verificateur_prompt_version_different_from_codeur`

### Verification Tests (2)

- ✅ `test_verify_proposal_accepts_valid_proposal`
- ✅ `test_verify_proposal_rejects_same_prompt_version`

### DIM Error Detection Tests (6)

- ✅ `test_detect_dim_errors_negated_diagnostic`
- ✅ `test_detect_dim_errors_suspected_as_dp`
- ✅ `test_detect_dim_errors_history_as_dp`
- ✅ `test_detect_dim_errors_ccam_without_evidence`
- ✅ `test_detect_dim_errors_ccam_with_valid_evidence`
- ✅ `test_detect_dim_errors_dp_das_inversion`

### Decision Tests (3)

- ✅ `test_verify_proposal_veto_on_blocking_error`
- ✅ `test_verify_proposal_review_on_non_blocking_error`
- ✅ `test_verify_proposal_provides_alternatives`

### Complex Scenario Tests (3)

- ✅ `test_verify_proposal_generates_reasoning`
- ✅ `test_verify_proposal_multiple_errors`
- ✅ `test_verify_proposal_no_errors_with_das_only`

### Test Results

```
16 passed in 7.29s
Code coverage: 91% for verificateur.py
```

## Files Created

1. **Implementation**:
   - `src/pipeline_mco_pmsi/verifiers/__init__.py`
   - `src/pipeline_mco_pmsi/verifiers/verificateur.py` (138 lines)

2. **Tests**:
   - `tests/test_verificateur.py` (16 tests, comprehensive coverage)

## Requirements Validated

### Exigence 4: Independent Verification

- ✅ **4.1**: Vérificateur uses a different prompt (validated at runtime)
- ✅ **4.2**: Detects negated diagnoses coded as affirmed
- ✅ **4.3**: Detects CCAM acts without explicit evidence
- ✅ **4.4**: Detects medical history coded as the current episode
- ✅ **4.5**: Generates a veto for blocking contradictions
- ✅ **4.6**: Marks "à_revoir" (to be reviewed) for non-blocking contradictions
- ✅ **4.6**: Provides alternatives
- ✅ **4.8**: Accepts valid proposals

### Exigence 19: Zero-Tolerance Error Prevention

- ✅ **19.1**: Prevents negated diagnoses from being coded
- ✅ **19.2**: Prevents suspected diagnoses from becoming certain
- ✅ **19.3**: Prevents CCAM acts without explicit evidence
- ✅ **19.4**: Prevents medical history from being coded as current
- ✅ **19.5**: Detects gross DP/DAS inversions

## Usage Example

```python
from pipeline_mco_pmsi.verifiers.verificateur import Verificateur
from pipeline_mco_pmsi.rag.rag_engine import RAGEngine

# Initialize
rag_engine = RAGEngine(...)
verificateur = Verificateur(
    rag_engine=rag_engine,
    model_name="mock-llm",
    model_version="1.0.0"
)

# Verify a proposal
result = verificateur.verify_proposal(
    proposal=coding_proposal,
    facts=clinical_facts,
    cim10_version="2026",
    ccam_version="2025"
)

# Check decision
if result.decision == "veto":
    print("REJECTED:", result.dim_errors)
elif result.decision == "review":
    print("NEEDS REVIEW:", result.contradictions)
else:
    print("ACCEPTED")
```

## Key Design Decisions

1. **Prompt Version Enforcement**
   - Hard-coded different prompt version
   - Runtime validation to prevent accidental same-prompt usage
   - Raises `ValueError` if the same prompt is detected

2. **Evidence Matching Strategy**
   - Creates an index by (document_id, start, end) for O(1) lookup
   - Matches code evidence to source facts
   - Detects orphaned evidence

3. **Confidence-Based Inversion Detection**
   - Uses a 0.1 margin to avoid false positives
   - Only flags significant confidence differences
   - Marked as `a_revoir`, not `bloquant`

4. **Simple Alternative Generation**
   - POC implementation suggests the highest-confidence DAS as the alternative DP
   - A full implementation would use RAG for more sophisticated alternatives
   - Provides reasoning for each alternative

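
To make the 0.1 margin concrete, here is a small sketch of the inversion check. The function name and the dictionary-based input are hypothetical, and using a strict "greater than" comparison against the margin is an assumption; the summary does not state whether the boundary is inclusive.

```python
from typing import Dict, List

INVERSION_MARGIN = 0.1  # margin reported in this summary


def find_dp_das_inversions(dp_confidence: float,
                           das_confidences: Dict[str, float],
                           margin: float = INVERSION_MARGIN) -> List[str]:
    """Return DAS codes whose confidence exceeds the DP's by more than margin.

    Hypothetical helper; each returned code would yield a `dp_das_inversion`
    error with severity `a_revoir`.
    """
    return [
        code
        for code, confidence in das_confidences.items()
        if confidence - dp_confidence > margin
    ]


# DP at 0.60: a DAS at 0.85 is flagged, a DAS at 0.65 is within the margin.
flagged = find_dp_das_inversions(0.60, {"E11.9": 0.85, "I10": 0.65})
```

Without the margin, two codes with nearly equal confidence would trigger a review on every stay, which is the false-positive case the design decision aims to avoid.
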
## Integration Points

The Vérificateur integrates with:

1. **Codeur**: Receives `CodingProposal` to verify
2. **Clinical Facts Extractor**: Uses `ClinicalFact` list for validation
3. **RAG Engine**: Can search for alternative codes (future enhancement)
4. **Pipeline**: Returns `VerificationResult` for downstream processing

## Next Steps

The Vérificateur is now ready for integration into the main pipeline. Next tasks:

1. **Task 12**: Implement Groupage Validator
2. **Task 14**: Implement PMSI Validator and Question Generator
3. **Task 16**: Integrate all components into main Pipeline

## Notes

- All 16 tests pass
- 91% code coverage achieved for `verificateur.py`
- Implementation follows a conservative approach: any blocking error results in a veto
- Ready for integration testing
- Meets the requirements validated above (Exigences 4.1-4.6, 4.8, 19.1-19.5)