feat(phase2): Multi-signal NER (BDPM gazetteers, EDS confidence, safe patterns, GLiNER)
Workstream 1: BDPM integration (5737 official medications) into the medication whitelist
Workstream 2: Contextual safe patterns (dosages mg/mL/cpr, pharmaceutical forms, same line)
Workstream 3: Real NER confidence scores (edsnlp 0.20 ner_confidence_score)
Workstream 4: GLiNER zero-shot (urchade/gliner_multi_pii-v1) as a cross-vote
Workstream 5: Scripts for silver annotation export + CamemBERT-bio fine-tuning

0 leaks, 0 regressions, 18 additional false positives eliminated.
Safety: GLiNER can only reject an entity when the NER confidence is < 0.70.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
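The safety rule stated in the commit message (GLiNER may only reject an entity whose NER confidence is below 0.70) can be illustrated with a minimal sketch. The function name `keep_entity`, the constant `NER_CONFIDENCE_FLOOR`, and the boolean `gliner_agrees` are hypothetical and do not come from the commit; only the 0.70 threshold does.

```python
# Hypothetical names; only the 0.70 threshold comes from the commit message.
NER_CONFIDENCE_FLOOR = 0.70


def keep_entity(ner_confidence: float, gliner_agrees: bool,
                floor: float = NER_CONFIDENCE_FLOOR) -> bool:
    """Decide whether an entity detected by the NER model is kept.

    A confident NER detection (score >= floor) is always kept, so the
    second model can never cause a leak by vetoing it. Below the floor,
    the second model's vote decides.
    """
    if ner_confidence >= floor:
        return True
    return gliner_agrees


if __name__ == "__main__":
    print(keep_entity(0.95, gliner_agrees=False))  # confident NER: veto ignored
    print(keep_entity(0.50, gliner_agrees=False))  # uncertain NER: veto applies
```

With this shape, the cross-vote can only remove low-confidence detections, which is consistent with the "0 leaks" claim but does not by itself prove it.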
@@ -64,6 +64,12 @@ class EdsPseudoManager:
             self._nlp = edsnlp.load(path)
         else:
             self._nlp = edsnlp.load(model_id_or_path)
+        # Enable NER confidence scores (edsnlp >= 0.16)
+        try:
+            ner_pipe = self._nlp.get_pipe('ner')
+            ner_pipe.compute_confidence_score = True
+        except Exception:
+            pass  # older versions without confidence support
         self._loaded = True
 
     def unload(self) -> None:
@@ -100,12 +106,15 @@ class EdsPseudoManager:
                 mapped = EDS_LABEL_MAP.get(label, None)
                 if mapped is None:
                     continue
+                # Real confidence score when available (edsnlp >= 0.16)
+                raw_score = getattr(ent._, 'ner_confidence_score', None)
+                conf = raw_score if isinstance(raw_score, float) else 1.0
                 ents.append({
                     "entity_group": label,
                     "word": ent.text,
                     "start": ent.start_char,
                     "end": ent.end_char,
-                    "score": 1.0,  # edsnlp does not provide a confidence score
+                    "score": conf,
                     "eds_mapped_key": mapped,
                 })
             out.append(ents)
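The fallback in this hunk accepts only built-in `float` values; any other numeric type falls back to 1.0, which would then look like a fully confident detection. A slightly more tolerant coercion could look like the sketch below. The helper name `coerce_confidence`, the clamping to [0, 1], and the NaN handling are assumptions added for illustration, not behaviour taken from the commit.

```python
import math
from numbers import Real


def coerce_confidence(raw_score, default: float = 1.0) -> float:
    """Turn a raw score into a float in [0, 1], or return the default.

    Accepts any real number (not only built-in float); rejects None,
    strings, booleans and NaN by returning the default.
    """
    if isinstance(raw_score, bool) or not isinstance(raw_score, Real):
        return default
    value = float(raw_score)
    if math.isnan(value):
        return default
    # Clamping is an assumption: scores are treated as probabilities.
    return min(1.0, max(0.0, value))


if __name__ == "__main__":
    for raw in (0.42, None, 1, float("nan"), "0.9"):
        print(repr(raw), "->", coerce_confidence(raw))
```

Whether the default should remain 1.0 is a design choice: it preserves the previous behaviour, but it also means a missing score can never trigger the low-confidence path described in the commit message.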