tests: DLBCL alias + Trackare guardrail + e2e on real PDFs + gold CRH + enriched benchmark

- 11 unit tests: TestAliasAndConclusionBonus (7) + TestTrackareSymptomGuard (4)
- e2e tests on real PDFs (skipped if absent): meningitis A87.0 + DLBCL C83.3 top1
- Enriched gold CRH: 5 cases (2 real cases added: 115_23066188, 132_23080179)
- Synthesis benchmark: conclusion recovered from the source_excerpt of DAS/treatments
- .gitignore: anti-PHI protection (real_crh_pdfs/, data/crh_samples/*.pdf)
- docs/PHI_POLICY.md: 7 PHI security rules
- Debug reports: case 132 REVIEW (guardrail active), top errors, DIM pack

1043 tests pass, 0 regressions.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
21
docs/gold_debug/DIM_PACK_20260224.csv
Normal file
@@ -0,0 +1,21 @@
case_id,document_type,chosen_code,chosen_term,verdict,confidence,dp_expected_code,dp_expected_label,dp_acceptable_codes,dp_acceptable_family3,allow_symptom_dp,confidence_gold,notes
132_23080179,trackare,R59.0,Adénopathie,REVIEW,medium,C83.3,Lymphome diffus à grandes cellules B,,C83,False,probable,
74_23141536,crh,D50,Anémie,REVIEW,medium,I25.1,Syndrome coronarien aigu,I25.1|I25.5,I25,False,probable,
99_23033146,trackare,E66.83,Obésité (IMC 30.408),REVIEW,medium,,,,,,,
106_23056475,trackare,I26.9,Embolie pulmonaire,REVIEW,medium,I26.9,Embolie pulmonaire,I26.0|I26.9,I26,False,certain,
111_23061304,trackare,N19,Insuffisance rénale,REVIEW,medium,,,,,,,
112_23065936,trackare,I25.5,Cardiopathie ischémique,REVIEW,medium,,,,,,,
120_23033508,trackare,N85.7,Hématome,REVIEW,medium,,,,,,,
139_23087691,trackare,M16.7,Coxarthrose,REVIEW,medium,,,,,,,
140_23090475,trackare,Z54.8,Convalescence,REVIEW,medium,,,,,,,
149_23089771,trackare,H16.0,C omprend décollement de la (de la) : • conjonctive,REVIEW,medium,,,,,,,
153_23102610,trackare,T83.5,Infection urinaire,REVIEW,medium,,,,,,,
159_23107113,trackare,I26.9,Embolie pulmonaire,REVIEW,medium,,,,,,,
160_23099448,trackare,E88.1,Lipodystrophie,REVIEW,medium,,,,,,,
170_23077016,trackare,K59.0,Constipation,REVIEW,medium,,,,,,,
174_23080042,trackare,Q40.1,Hernie hiatale ce,REVIEW,medium,,,,,,,
183_23087212,trackare,T83.5,Infection urinaire,REVIEW,medium,,,,,,,
192_23132490,trackare,D50,Anémie,REVIEW,medium,,,,,,,
200_23149959,trackare,I80.2,Thrombose veineuse profonde,REVIEW,medium,,,,,,,
225_23160703,trackare,N85.7,Hématome,REVIEW,medium,,,,,,,
25_23127187,trackare,N19,Insuffisance rénale,REVIEW,medium,,,,,,,
6
docs/gold_debug/NUKE3_GOLD_TOP_ERRORS.csv
Normal file
@@ -0,0 +1,6 @@
case_id,document_type,chosen_code,chosen_term,verdict,confidence,expected_code,acceptable_codes,acceptable_family3,strict_match,acceptable_match,family3_match,symptom_not_allowed,raw_pool_size,filtered_pool_size,topk_size,evidence_count,review_reason_tag,top1_score,top2_score,delta_top1_top2,top3_codes,top3_terms
132_23080179,trackare,R59.0,Adénopathie,REVIEW,medium,C83.3,,C83,False,False,False,True,23,0,0,2,other,0,0,0,,
74_23141536,crh,D50,Anémie,REVIEW,medium,I25.1,I25.1|I25.5,I25,False,False,False,False,3,3,3,1,low_delta,4.0,4.0,0.0,D50|I25.1|Z95.5,Anémie|SCA (Syndrome Coronarien Aigu)|Stent vasculaire
115_23066188,trackare,A87.0,Méningite à entérovirus,CONFIRMED,high,A87.0,,A87,True,True,True,False,6,0,0,1,other,0,0,0,,
106_23056475,trackare,I26.9,Embolie pulmonaire,REVIEW,medium,I26.9,I26.0|I26.9,I26,True,True,True,False,10,7,7,1,low_delta,6.0,5.0,1.0,I26.9|I26.9|Q53.9,Embolie pulmonaire|Embolie pulmonaire|Cryptorchidie
73_23139637,trackare,R06.0,Dyspnée,REVIEW,medium,R06.0,,R06,True,True,True,False,1,1,1,1,mono_fragile,1.0,0,1.0,R06.0,Dyspnée
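The four match columns in this report (`strict_match`, `acceptable_match`, `family3_match`, `symptom_not_allowed`) can be reproduced from the gold fields with a few comparisons. What follows is a minimal sketch, not the project's code: `Gold` and `evaluate_match` are hypothetical names, and the rules (family taken as the first three characters of the code, symptom codes taken as the ICD-10 `R` chapter, a strict hit counting as acceptable) are assumptions inferred from the rows above.

```python
from dataclasses import dataclass, field


@dataclass
class Gold:
    """Hypothetical container for the gold fields of one case."""
    expected_code: str
    acceptable_codes: list = field(default_factory=list)
    acceptable_family3: list = field(default_factory=list)
    allow_symptom_dp: bool = False


def evaluate_match(chosen_code, gold):
    """Return the four match flags for a chosen code (assumed rules)."""
    strict = chosen_code == gold.expected_code
    # Assumption: a strict hit also counts as acceptable (case 115 has an
    # empty acceptable list, yet acceptable_match is True in the report).
    acceptable = strict or chosen_code in gold.acceptable_codes
    # Assumption: the family is the 3-character prefix, e.g. "I26.9" -> "I26".
    family3 = chosen_code[:3] in gold.acceptable_family3
    # Assumption: symptom codes are those of the ICD-10 "R" chapter.
    symptom_not_allowed = chosen_code.startswith("R") and not gold.allow_symptom_dp
    return {
        "strict_match": strict,
        "acceptable_match": acceptable,
        "family3_match": family3,
        "symptom_not_allowed": symptom_not_allowed,
    }


# Gold values copied from the rows for cases 132 and 106.
case_132 = evaluate_match("R59.0", Gold("C83.3", [], ["C83"], False))
case_106 = evaluate_match("I26.9", Gold("I26.9", ["I26.0", "I26.9"], ["I26"], False))
print(case_132)
print(case_106)
```

With these assumed rules, case 132 fails all three matches and raises the symptom flag, while case 106 passes all three, which agrees with the corresponding rows.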
5
docs/gold_debug/NUKE3_GOLD_TOP_ERRORS.jsonl
Normal file
@@ -0,0 +1,5 @@
{"case_id": "132_23080179", "document_type": "trackare", "chosen_code": "R59.0", "chosen_term": "Adénopathie", "verdict": "REVIEW", "confidence": "medium", "expected_code": "C83.3", "acceptable_codes": "", "acceptable_family3": "C83", "strict_match": false, "acceptable_match": false, "family3_match": false, "symptom_not_allowed": true, "raw_pool_size": 23, "filtered_pool_size": 0, "topk_size": 0, "evidence_count": 2, "review_reason_tag": "other", "top1_score": 0, "top2_score": 0, "delta_top1_top2": 0, "top3_codes": "", "top3_terms": ""}
{"case_id": "74_23141536", "document_type": "crh", "chosen_code": "D50", "chosen_term": "Anémie", "verdict": "REVIEW", "confidence": "medium", "expected_code": "I25.1", "acceptable_codes": "I25.1|I25.5", "acceptable_family3": "I25", "strict_match": false, "acceptable_match": false, "family3_match": false, "symptom_not_allowed": false, "raw_pool_size": 3, "filtered_pool_size": 3, "topk_size": 3, "evidence_count": 1, "review_reason_tag": "low_delta", "top1_score": 4.0, "top2_score": 4.0, "delta_top1_top2": 0.0, "top3_codes": "D50|I25.1|Z95.5", "top3_terms": "Anémie|SCA (Syndrome Coronarien Aigu)|Stent vasculaire"}
{"case_id": "115_23066188", "document_type": "trackare", "chosen_code": "A87.0", "chosen_term": "Méningite à entérovirus", "verdict": "CONFIRMED", "confidence": "high", "expected_code": "A87.0", "acceptable_codes": "", "acceptable_family3": "A87", "strict_match": true, "acceptable_match": true, "family3_match": true, "symptom_not_allowed": false, "raw_pool_size": 6, "filtered_pool_size": 0, "topk_size": 0, "evidence_count": 1, "review_reason_tag": "other", "top1_score": 0, "top2_score": 0, "delta_top1_top2": 0, "top3_codes": "", "top3_terms": ""}
{"case_id": "106_23056475", "document_type": "trackare", "chosen_code": "I26.9", "chosen_term": "Embolie pulmonaire", "verdict": "REVIEW", "confidence": "medium", "expected_code": "I26.9", "acceptable_codes": "I26.0|I26.9", "acceptable_family3": "I26", "strict_match": true, "acceptable_match": true, "family3_match": true, "symptom_not_allowed": false, "raw_pool_size": 10, "filtered_pool_size": 7, "topk_size": 7, "evidence_count": 1, "review_reason_tag": "low_delta", "top1_score": 6.0, "top2_score": 5.0, "delta_top1_top2": 1.0, "top3_codes": "I26.9|I26.9|Q53.9", "top3_terms": "Embolie pulmonaire|Embolie pulmonaire|Cryptorchidie"}
{"case_id": "73_23139637", "document_type": "trackare", "chosen_code": "R06.0", "chosen_term": "Dyspnée", "verdict": "REVIEW", "confidence": "medium", "expected_code": "R06.0", "acceptable_codes": "", "acceptable_family3": "R06", "strict_match": true, "acceptable_match": true, "family3_match": true, "symptom_not_allowed": false, "raw_pool_size": 1, "filtered_pool_size": 1, "topk_size": 1, "evidence_count": 1, "review_reason_tag": "mono_fragile", "top1_score": 1.0, "top2_score": 0, "delta_top1_top2": 1.0, "top3_codes": "R06.0", "top3_terms": "Dyspnée"}
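Because each JSONL row is a standalone JSON object, the file can be summarized with the standard library alone. The sketch below is illustrative only: `summarize` is a hypothetical helper, and `SAMPLE` holds reduced copies of the five rows above, keeping just the fields the helper reads.

```python
import json
from collections import Counter


def summarize(jsonl_lines):
    """Count strict matches, symptom flags and review tags over JSONL rows."""
    rows = [json.loads(line) for line in jsonl_lines if line.strip()]
    return {
        "cases": len(rows),
        "strict_ok": sum(1 for r in rows if r["strict_match"]),
        "symptom_flags": sum(1 for r in rows if r["symptom_not_allowed"]),
        "by_tag": dict(Counter(r["review_reason_tag"] for r in rows)),
    }


# Reduced copies of the five rows (only the fields used by summarize).
SAMPLE = [
    '{"case_id": "132_23080179", "strict_match": false, "symptom_not_allowed": true, "review_reason_tag": "other"}',
    '{"case_id": "74_23141536", "strict_match": false, "symptom_not_allowed": false, "review_reason_tag": "low_delta"}',
    '{"case_id": "115_23066188", "strict_match": true, "symptom_not_allowed": false, "review_reason_tag": "other"}',
    '{"case_id": "106_23056475", "strict_match": true, "symptom_not_allowed": false, "review_reason_tag": "low_delta"}',
    '{"case_id": "73_23139637", "strict_match": true, "symptom_not_allowed": false, "review_reason_tag": "mono_fragile"}',
]

summary = summarize(SAMPLE)
print(summary)
```

To run it on the real file, the same function could be fed `open("docs/gold_debug/NUKE3_GOLD_TOP_ERRORS.jsonl", encoding="utf-8")`, since a file object iterates line by line.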
15
docs/gold_debug/NUKE3_GOLD_TOP_ERRORS.md
Normal file
@@ -0,0 +1,15 @@
# NUKE-3: Top gold CRH errors

**Date**: 2026-02-24 14:34
**Cases**: 5

| # | Case ID | Chosen | Expected | Strict | Accept. | Verdict | Conf. | Delta | Reason |
|---|---------|--------|----------|--------|---------|---------|-------|-------|--------|
| 1 | 132_23080179 | R59.0 | C83.3 | FAIL | FAIL | REVIEW | medium | 0 | other |
| 2 | 74_23141536 | D50 | I25.1 | FAIL | FAIL | REVIEW | medium | 0.0 | low_delta |
| 3 | 115_23066188 | A87.0 | A87.0 | OK | OK | CONFIRMED | high | 0 | other |
| 4 | 106_23056475 | I26.9 | I26.9 | OK | OK | REVIEW | medium | 1.0 | low_delta |
| 5 | 73_23139637 | R06.0 | R06.0 | OK | OK | REVIEW | medium | 1.0 | mono_fragile |

---
*Generated on 2026-02-24 14:34*
40
docs/gold_debug/case_115_23066188.json
Normal file
@@ -0,0 +1,40 @@
{
  "case_id": "115_23066188",
  "document_type": "trackare",
  "gold": {
    "dp_expected": {
      "code": "A87.0",
      "label": "Méningite à entérovirus"
    },
    "dp_acceptable_codes": [],
    "dp_acceptable_family3": [
      "A87"
    ],
    "allow_symptom_dp": false,
    "confidence": "probable"
  },
  "prediction": {
    "chosen_code": "A87.0",
    "chosen_term": "Méningite à entérovirus",
    "verdict": "CONFIRMED",
    "confidence": "high",
    "reason": "DP Trackare — source d'autorité",
    "review_reason_tag": "other",
    "evidence": [
      "Source: Trackare (codage établissement)"
    ],
    "evidence_count": 1
  },
  "pool_stats": {
    "raw_pool_size": 6,
    "filtered_pool_size": 0,
    "topk_size": 0
  },
  "top_candidates": [],
  "match_eval": {
    "strict_match": true,
    "acceptable_match": true,
    "family3_match": true,
    "symptom_not_allowed": false
  }
}
39
docs/gold_debug/case_115_23066188.md
Normal file
@@ -0,0 +1,39 @@
# Case Debug: 115_23066188

**Type**: trackare
**Verdict**: CONFIRMED
**Confidence**: high
**Chosen code**: A87.0
**Reason**: DP Trackare — source d'autorité
**Evidence**: 1 excerpt(s)
**Pool**: 6 raw → 0 candidates
**Expected DP**: A87.0 (Méningite à entérovirus)
**Gold confidence**: probable
**Match**: strict=OK, acceptable=OK, disallowed symptom=-

## Gold vs Prediction

| | Gold | NUKE-3 |
|---|------|--------|
| Code | A87.0 | A87.0 |
| Label | Méningite à entérovirus | Méningite à entérovirus |
| Acceptable codes | - | - |
| Family3 | A87 | - |
| Confidence | probable | high |
| Symptom allowed | no | - |

## Top candidates

| Rank | Code | Score | Term | Flags | Section |
|------|------|-------|------|-------|---------|

## Evidence

1. Source: Trackare (codage établissement)

## Bug hypothesis

**Empty pool**: no DP candidate was extracted. Check the ICD-10 (CIM-10) extraction on this document.

---
*Generated on 2026-02-24 14:00*
41
docs/gold_debug/case_132_23080179.json
Normal file
@@ -0,0 +1,41 @@
{
  "case_id": "132_23080179",
  "document_type": "trackare",
  "gold": {
    "dp_expected": {
      "code": "C83.3",
      "label": "Lymphome diffus à grandes cellules B"
    },
    "dp_acceptable_codes": [],
    "dp_acceptable_family3": [
      "C83"
    ],
    "allow_symptom_dp": false,
    "confidence": "probable"
  },
  "prediction": {
    "chosen_code": "R59.0",
    "chosen_term": "Adénopathie",
    "verdict": "REVIEW",
    "confidence": "medium",
    "reason": "Trackare symptôme vs CRH diagnostic — vérification DIM requise",
    "review_reason_tag": "other",
    "evidence": [
      "Source: Trackare (codage établissement)",
      "Alerte: Trackare code un symptôme (R*) mais le CRH mentionne un diagnostic étiologique"
    ],
    "evidence_count": 2
  },
  "pool_stats": {
    "raw_pool_size": 23,
    "filtered_pool_size": 0,
    "topk_size": 0
  },
  "top_candidates": [],
  "match_eval": {
    "strict_match": false,
    "acceptable_match": false,
    "family3_match": false,
    "symptom_not_allowed": true
  }
}
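The `prediction` block of this case shows the Trackare guardrail at work: the Trackare code is kept as `chosen_code`, but the verdict drops to REVIEW and an alert is appended to the evidence. A rough sketch of that behavior follows, under stated assumptions: `trackare_symptom_guard` is a hypothetical name, the way the pipeline detects an etiological diagnosis in the CRH is not shown in this commit and is reduced here to a boolean argument, and the fallback branch only mirrors what case 115 displays (CONFIRMED, high).

```python
def trackare_symptom_guard(trackare_code, crh_has_etiological_dx):
    """Downgrade a Trackare DP to REVIEW when it is a symptom code (R*)
    and the CRH is known to mention an etiological diagnosis.

    Assumption: both the detection of the CRH diagnosis and the default
    CONFIRMED/high outcome are simplifications for illustration.
    """
    evidence = ["Source: Trackare (codage établissement)"]
    is_symptom = trackare_code.startswith("R")
    if is_symptom and crh_has_etiological_dx:
        # Alert text copied from the evidence list of case 132.
        evidence.append(
            "Alerte: Trackare code un symptôme (R*) mais le CRH "
            "mentionne un diagnostic étiologique"
        )
        verdict, confidence = "REVIEW", "medium"
    else:
        verdict, confidence = "CONFIRMED", "high"
    return {
        "chosen_code": trackare_code,  # the guard does not replace the code
        "verdict": verdict,
        "confidence": confidence,
        "evidence": evidence,
        "evidence_count": len(evidence),
    }


guarded = trackare_symptom_guard("R59.0", crh_has_etiological_dx=True)
unguarded = trackare_symptom_guard("A87.0", crh_has_etiological_dx=True)
print(guarded["verdict"], guarded["evidence_count"])
print(unguarded["verdict"], unguarded["evidence_count"])
```

Keeping the original code while lowering the verdict leaves the final decision to the DIM reviewer, which is consistent with `strict_match` remaining false for this case.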
40
docs/gold_debug/case_132_23080179.md
Normal file
@@ -0,0 +1,40 @@
# Case Debug: 132_23080179

**Type**: trackare
**Verdict**: REVIEW
**Confidence**: medium
**Chosen code**: R59.0
**Reason**: Trackare symptôme vs CRH diagnostic — vérification DIM requise
**Evidence**: 2 excerpt(s)
**Pool**: 23 raw → 0 candidates
**Expected DP**: C83.3 (Lymphome diffus à grandes cellules B)
**Gold confidence**: probable
**Match**: strict=FAIL, acceptable=FAIL, disallowed symptom=YES

## Gold vs Prediction

| | Gold | NUKE-3 |
|---|------|--------|
| Code | C83.3 | R59.0 |
| Label | Lymphome diffus à grandes cellules B | Adénopathie |
| Acceptable codes | - | - |
| Family3 | C83 | - |
| Confidence | probable | medium |
| Symptom allowed | no | - |

## Top candidates

| Rank | Code | Score | Term | Flags | Section |
|------|------|-------|------|-------|---------|

## Evidence

1. Source: Trackare (codage établissement)
2. Alerte: Trackare code un symptôme (R*) mais le CRH mentionne un diagnostic étiologique

## Bug hypothesis

**Empty pool**: no DP candidate was extracted. Check the ICD-10 (CIM-10) extraction on this document.

---
*Generated on 2026-02-24 14:33*