Task 14: PMSI Validator and Question Generator - Implementation Summary
Overview
Task 14 implements the PMSI Validator and Question Generator components for the medical coding pipeline. These components validate coding proposals, detect errors, and generate questions about missing information.
Components Implemented
1. PMSIValidator (src/pipeline_mco_pmsi/validators/pmsi_validator.py)
Responsibilities:
- Generate categorized validation problems (bloquant/à_revoir/info)
- Detect missing mandatory information
- Validate conformity to eligibility criteria from Guide Méthodologique
- Detect zero-tolerance errors
- Block automatic validation when critical issues are found
Key Features:
- Mandatory Information Detection: Checks for missing DP, documents, facts, and evidence
- Eligibility Criteria Validation: Integrates with RAG Engine to retrieve and validate eligibility criteria for DP and DAS codes
- Code Consistency Checks: Verifies codes match clinical facts and detects uncoded diagnostics
- Zero-Tolerance Error Detection: Identifies 8 types of critical errors:
  - Negated diagnoses coded as affirmed
  - Suspected diagnoses coded as certain (especially for DP)
  - CCAM acts without explicit evidence
  - Medical history coded as current episode
  - Unknown referentiel versions
  - High confidence on ambiguous cases
  - Gross DP/DAS inversions
  - PII leaks in logs/exports
Methods:
- validate_proposal(): Main validation entry point
- check_zero_tolerance_errors(): Detects critical errors
- has_blocking_issues(): Checks for blocking problems
- should_block_automatic_validation(): Determines if validation should be blocked
Requirements Satisfied: 9.1, 9.2, 26.5, 19.1-19.9
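The zero-tolerance rules listed above can be illustrated with a minimal sketch covering three of the eight checks. It is not the code in pmsi_validator.py: the `ClinicalFact` and `FactStatus` types, the rule names, and the `find_zero_tolerance_errors` function are hypothetical stand-ins, and the sketch assumes each clinical fact already maps to a single diagnosis code.

```python
from dataclasses import dataclass
from enum import Enum


class FactStatus(Enum):
    # Hypothetical status values; the real data model may differ.
    AFFIRMED = "affirmed"
    NEGATED = "negated"
    SUSPECTED = "suspected"
    HISTORY = "history"


@dataclass(frozen=True)
class ClinicalFact:
    code: str            # assumption: the diagnosis code this fact maps to
    status: FactStatus


def find_zero_tolerance_errors(dp, das, facts):
    """Return (rule, code) pairs for proposed codes lacking an affirmed fact."""
    statuses_by_code = {}
    for fact in facts:
        statuses_by_code.setdefault(fact.code, set()).add(fact.status)

    errors = []
    for code in [dp, *das]:
        statuses = statuses_by_code.get(code, set())
        # A code supported by at least one affirmed fact is accepted here.
        if FactStatus.AFFIRMED in statuses:
            continue
        if FactStatus.NEGATED in statuses:
            errors.append(("negated_coded_as_affirmed", code))
        elif FactStatus.SUSPECTED in statuses:
            errors.append(("suspected_coded_as_certain", code))
        elif FactStatus.HISTORY in statuses:
            errors.append(("history_coded_as_current", code))
    return errors
```

Codes with no matching fact at all are left to the separate code consistency checks in this sketch, which keeps each rule tied to one failure mode.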
2. QuestionGenerator (src/pipeline_mco_pmsi/validators/question_generator.py)
Responsibilities:
- Generate prioritized questions (maximum 5)
- Detect inconsistencies between codes and clinical facts
- Prioritize questions by impact on coding accuracy
Key Features:
- Question Sources:
  - Validation issues (blocking and review)
  - Suspected clinical facts
  - Code/fact inconsistencies
  - Low confidence codes
  - Document contradictions
- Prioritization System:
  - Priority levels: 1 (high) to 5 (low)
  - Category ordering: contradiction > missing_info > clarification > confirmation
  - Automatic limiting to MAX_QUESTIONS (5)
- Inconsistency Detection:
  - Negated facts with proposed codes
  - Contradictions between documents
  - Suspected diagnoses requiring confirmation
  - Low confidence codes requiring validation
Methods:
- generate_questions(): Main question generation entry point
- _detect_inconsistencies(): Finds code/fact inconsistencies
- _detect_document_contradictions(): Identifies multi-document contradictions
- _prioritize_and_limit(): Sorts and limits questions to top 5
Requirements Satisfied: 9.3, 9.4
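The prioritization step can be sketched as a sort followed by a slice. This is an illustration, not the body of _prioritize_and_limit(): the `Question` fields shown here are assumed, and so is the choice to sort by priority level first and use the category ordering only as a tie-breaker, since the summary does not say how the two criteria combine.

```python
from dataclasses import dataclass

MAX_QUESTIONS = 5

# Lower rank sorts first: contradiction > missing_info > clarification > confirmation.
CATEGORY_RANK = {
    "contradiction": 0,
    "missing_info": 1,
    "clarification": 2,
    "confirmation": 3,
}


@dataclass(frozen=True)
class Question:
    text: str
    priority: int    # 1 (high) to 5 (low)
    category: str


def prioritize_and_limit(questions, limit=MAX_QUESTIONS):
    """Sort by (priority, category rank) and keep at most `limit` questions."""
    def sort_key(question):
        # Unknown categories sort after all known ones.
        rank = CATEGORY_RANK.get(question.category, len(CATEGORY_RANK))
        return (question.priority, rank)

    # sorted() is stable, so questions with equal keys keep their input order.
    return sorted(questions, key=sort_key)[:limit]
```

Keeping the ranking in a single key function means a new category only needs an entry in `CATEGORY_RANK`.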
3. Blocking Logic (Integrated in PMSIValidator)
Responsibilities:
- Block automatic validation when blocking issues detected
- Block automatic validation when zero-tolerance errors detected
Key Features:
- Comprehensive zero-tolerance error checking
- Clear blocking decision logic
- Detailed logging of blocking reasons
Requirements Satisfied: 9.6, 19.9
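The blocking decision described in this section reduces to a small predicate. The sketch below is a standalone approximation under stated assumptions: `ValidationIssue` is reduced to two hypothetical fields, the severity label "bloquant" is taken from the categories named earlier, and the logger name is invented for the example.

```python
import logging
from dataclasses import dataclass

logger = logging.getLogger("pmsi_validator_sketch")  # hypothetical logger name


@dataclass(frozen=True)
class ValidationIssue:
    severity: str    # "bloquant", "à_revoir", or "info"
    message: str


def should_block(validation_issues, zero_tolerance_issues):
    """Block when any issue is blocking or any zero-tolerance error exists."""
    blocking = [i for i in validation_issues if i.severity == "bloquant"]

    # Log every reason so a reviewer can see why validation was blocked.
    for issue in blocking:
        logger.warning("Blocking issue: %s", issue.message)
    for issue in zero_tolerance_issues:
        logger.warning("Zero-tolerance error: %s", issue.message)

    return bool(blocking) or bool(zero_tolerance_issues)
```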
Test Coverage
PMSIValidator Tests (tests/test_pmsi_validator.py)
20 tests covering:
- Basic initialization and validation
- Missing mandatory information detection (DP, documents, facts, evidence)
- Eligibility criteria validation (retrieval, no criteria, exclusion rules)
- Zero-tolerance error detection (all 8 types)
- Blocking logic (blocking issues, zero-tolerance, no issues)
Test Results: ✅ 20/20 passing (100%)
Coverage: 88% of pmsi_validator.py
QuestionGenerator Tests (tests/test_question_generator.py)
13 tests covering:
- Basic initialization and question generation
- Question generation from various sources
- Inconsistency detection (negated facts, document contradictions)
- Question prioritization and limiting
Test Results: ✅ 13/13 passing (100%)
Coverage: 86% of question_generator.py
Integration Points
RAG Engine Integration
- PMSIValidator uses rag_engine.retrieve_eligibility_criteria() to fetch eligibility criteria from the Guide Méthodologique
- Validates codes against retrieved criteria
- Generates warnings for exclusion and hierarchization rules
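A rough sketch of how a validator might consume retrieved criteria is shown below. Only the method name retrieve_eligibility_criteria() comes from this summary; its signature, the `excluded_with` field, the `InMemoryCriteria` stand-in for the RAG Engine, and the placeholder codes are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass(frozen=True)
class EligibilityCriteria:
    code: str
    excluded_with: frozenset = frozenset()   # hypothetical exclusion field


class CriteriaSource(Protocol):
    # Assumed signature: one code in, criteria or None out.
    def retrieve_eligibility_criteria(self, code: str) -> Optional[EligibilityCriteria]: ...


class InMemoryCriteria:
    """Dict-backed stand-in for the RAG Engine, for demonstration only."""

    def __init__(self, table):
        self._table = table

    def retrieve_eligibility_criteria(self, code):
        return self._table.get(code)


def check_exclusions(codes, source):
    """Return warnings for missing criteria and for excluded code pairs."""
    present = set(codes)
    warnings = []
    for code in codes:
        criteria = source.retrieve_eligibility_criteria(code)
        if criteria is None:
            warnings.append(f"{code}: no eligibility criteria found")
            continue
        # Sort so that warnings come out in a deterministic order.
        for other in sorted(criteria.excluded_with & present):
            warnings.append(f"{code}: excluded in combination with {other}")
    return warnings
```

Depending on a protocol instead of the concrete engine lets unit tests swap in the in-memory source without running retrieval.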
Data Models Used
- ValidationIssue: Represents validation problems with severity and category
- Question: Represents generated questions with priority and context
- EligibilityCriteria: Contains eligibility rules from Guide Méthodologique
- CodingProposal: Input containing proposed codes
- StructuredStay: Input containing clinical facts and documents
Key Design Decisions
- Conservative Approach: The validator is designed to be conservative, preferring to flag potential issues rather than miss critical errors
- Separation of Concerns:
  - PMSIValidator focuses on validation and error detection
  - QuestionGenerator focuses on question generation and prioritization
  - Clear separation makes testing and maintenance easier
- Extensibility: Both classes are designed to be easily extended with new validation rules or question types
- Integration with RAG: Eligibility criteria validation leverages the RAG Engine for dynamic rule retrieval
- Pydantic Validation: Leverages Pydantic models for data validation, ensuring type safety and data integrity
Files Created/Modified
Created:
- src/pipeline_mco_pmsi/validators/pmsi_validator.py (222 lines)
- src/pipeline_mco_pmsi/validators/question_generator.py (122 lines)
- tests/test_pmsi_validator.py (745 lines)
- tests/test_question_generator.py (485 lines)
Modified:
- src/pipeline_mco_pmsi/validators/__init__.py: Added exports for new classes
Requirements Traceability
| Requirement | Component | Status |
|---|---|---|
| 9.1 - Categorized validation problems | PMSIValidator | ✅ Implemented |
| 9.2 - Missing mandatory info detection | PMSIValidator | ✅ Implemented |
| 9.3 - Prioritized questions (max 5) | QuestionGenerator | ✅ Implemented |
| 9.4 - Code/fact inconsistency detection | QuestionGenerator | ✅ Implemented |
| 9.6 - Block validation on blocking issues | PMSIValidator | ✅ Implemented |
| 19.1 - Prevent negated coded as affirmed | PMSIValidator | ✅ Implemented |
| 19.2 - Prevent suspected as certain | PMSIValidator | ✅ Implemented |
| 19.3 - Prevent CCAM without evidence | PMSIValidator | ✅ Implemented |
| 19.4 - Prevent history as current | PMSIValidator | ✅ Implemented |
| 19.5 - Prevent DP/DAS inversions | PMSIValidator | ✅ Implemented |
| 19.6 - Prevent unknown referentiel | PMSIValidator | ✅ Implemented |
| 19.7 - Prevent PII leaks | PMSIValidator | ✅ Implemented |
| 19.8 - Prevent high confidence ambiguous | PMSIValidator | ✅ Implemented |
| 19.9 - Block on zero-tolerance errors | PMSIValidator | ✅ Implemented |
| 26.5 - Validate eligibility criteria | PMSIValidator | ✅ Implemented |
Usage Example
from pipeline_mco_pmsi.validators import PMSIValidator, QuestionGenerator
from pipeline_mco_pmsi.rag.rag_engine import RAGEngine

# referentiels_manager, coding_proposal and structured_stay are assumed
# to have been created upstream in the pipeline.

# Initialize components
rag_engine = RAGEngine(referentiels_manager)
pmsi_validator = PMSIValidator(rag_engine=rag_engine)
question_generator = QuestionGenerator()

# Validate a coding proposal
validation_issues = pmsi_validator.validate_proposal(
    proposal=coding_proposal,
    structured_stay=structured_stay,
)

# Check for zero-tolerance errors
zero_tolerance_issues = pmsi_validator.check_zero_tolerance_errors(
    proposal=coding_proposal,
    structured_stay=structured_stay,
)

# Determine if validation should be blocked
should_block = pmsi_validator.should_block_automatic_validation(
    validation_issues=validation_issues,
    zero_tolerance_issues=zero_tolerance_issues,
)

# Generate questions for missing information
questions = question_generator.generate_questions(
    proposal=coding_proposal,
    structured_stay=structured_stay,
    validation_issues=validation_issues,
)

# Process results
if should_block:
    print(f"Validation blocked: {len(validation_issues)} issues, {len(zero_tolerance_issues)} critical errors")
    print(f"Questions to resolve: {len(questions)}")
else:
    print("Validation passed")
Next Steps
The following tasks remain in the pipeline:
- Task 15: Implement Audit Logger for complete traceability
- Task 16: Implement main Pipeline orchestration
- Tasks 17-30: Additional features (rules management, metrics, deployment, etc.)
Conclusion
Task 14 has been successfully completed with:
- ✅ All 3 subtasks implemented (14.1, 14.2, 14.3)
- ✅ 33 unit tests passing (100% pass rate)
- ✅ 87% average code coverage (88% for pmsi_validator.py, 86% for question_generator.py)
- ✅ All requirements satisfied
- ✅ Integration with RAG Engine working
- ✅ Zero-tolerance error detection comprehensive
- ✅ Question generation and prioritization functional
The PMSI Validator and Question Generator are ready for integration into the main pipeline.