- Selection and copy of 27 representative documents (10 simple, 12 medium, 5 complex)
- Complete CLI annotation tool (tools/annotation_tool.py)
- Detailed annotation guide (docs/annotation_guide.md)
- Quality evaluator (evaluation/quality_evaluator.py)
  * Computes precision, recall and F1 score
  * Identifies false positives and false negatives
  * Per-PII-type metrics
  * JSON export and text reports
- Leak scanner (evaluation/leak_scanner.py)
  * Detection of residual PII (CRITICAL)
  * Detection of new PII (HIGH)
  * PDF metadata scan (MEDIUM)
- Performance benchmark (evaluation/benchmark.py)
  * Measures processing time
  * Measures CPU/RAM usage
  * JSON/CSV export
- Complete unit tests for all components
- Complete documentation of the evaluation module

Completed tasks:
- 1.1.1 Selection of 27 documents (instead of 30)
- 1.1.2 CLI annotation tool
- 1.2.1 Quality evaluator
- 1.2.2 Leak scanner
- 1.2.3 Performance benchmark

Next steps:
- 1.1.3 Annotation of the 27 documents (manual)
- 1.1.4 Enrichment of medical stopwords
- 1.3 Baseline measurement
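The precision, recall and F1 computation listed for the quality evaluator can be sketched as follows. This is a minimal illustration, not the code in evaluation/quality_evaluator.py: the function name `score_by_type`, the representation of annotations as (kind, text) pairs, and the exact-match rule are all assumptions made for the example.

```python
from collections import Counter


def score_by_type(gold, predicted):
    """Return {kind: (precision, recall, f1)} for two lists of (kind, text) pairs.

    Hypothetical helper: a prediction counts as a true positive only when
    both its kind and its text match a gold annotation exactly.
    """
    gold_counts, pred_counts = Counter(gold), Counter(predicted)
    true_pos = gold_counts & pred_counts  # multiset intersection
    scores = {}
    for kind in {k for k, _ in gold} | {k for k, _ in predicted}:
        tp = sum(n for (k, _), n in true_pos.items() if k == kind)
        n_pred = sum(n for (k, _), n in pred_counts.items() if k == kind)
        n_gold = sum(n for (k, _), n in gold_counts.items() if k == kind)
        precision = tp / n_pred if n_pred else 0.0
        recall = tp / n_gold if n_gold else 0.0
        f1 = 2 * precision * recall / (precision + recall) if tp else 0.0
        scores[kind] = (precision, recall, f1)
    return scores


# Toy data: "CMA" is treated here as a false positive for illustration only.
gold = [("NOM", "Gilles DE MONREDON"), ("FINESS", "640780417")]
predicted = [("NOM", "Gilles DE MONREDON"), ("NOM", "CMA"), ("FINESS", "640780417")]
scores = score_by_type(gold, predicted)
```

With this toy input the NOM type has one true positive out of two predictions, so its precision is 0.5 while its recall stays at 1.0. A real evaluator would probably also need a rule for partial or overlapping spans, which this sketch ignores.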
33 lines
3.2 KiB
JSON
{"page": -1, "kind": "OCR_USED", "original": "docTR", "placeholder": "", "bbox_hint": null}
{"page": 0, "kind": "OGC", "original": "8", "placeholder": "[OGC]", "bbox_hint": null}
{"page": 0, "kind": "force_term", "original": "CENTRE HOSPITALIER COTE BASQUE", "placeholder": "[MASK]", "bbox_hint": null}
{"page": 0, "kind": "FINESS", "original": "640780417", "placeholder": "[FINESS]", "bbox_hint": null}
{"page": 0, "kind": "NOM", "original": "CMA", "placeholder": "[NOM]", "bbox_hint": null}
{"page": 1, "kind": "force_term", "original": "CONCERTATION", "placeholder": "[MASK]", "bbox_hint": null}
{"page": 1, "kind": "force_term", "original": "CENTRE HOSPITALIER COTE BASQUE", "placeholder": "[MASK]", "bbox_hint": null}
{"page": 1, "kind": "FINESS", "original": "640780417", "placeholder": "[FINESS]", "bbox_hint": null}
{"page": 1, "kind": "OGC", "original": "8", "placeholder": "[OGC]", "bbox_hint": null}
{"page": 1, "kind": "AGE", "original": "Patient de 70 ans", "placeholder": "[AGE]", "bbox_hint": null}
{"page": 1, "kind": "NOM", "original": "Gilles DE MONREDON", "placeholder": "[NOM]", "bbox_hint": null}
{"page": 1, "kind": "force_term", "original": "CONCERTATION", "placeholder": "[MASK]", "bbox_hint": null}
{"page": 1, "kind": "force_term", "original": "CONCERTATION", "placeholder": "[MASK]", "bbox_hint": null}
{"page": 2, "kind": "force_term", "original": "CONCERTATION", "placeholder": "[MASK]", "bbox_hint": null}
{"page": 2, "kind": "force_term", "original": "CONCERTATION", "placeholder": "[MASK]", "bbox_hint": null}
{"page": 2, "kind": "force_term", "original": "CENTRE HOSPITALIER COTE BASQUE", "placeholder": "[MASK]", "bbox_hint": null}
{"page": 2, "kind": "FINESS", "original": "640780417", "placeholder": "[FINESS]", "bbox_hint": null}
{"page": 2, "kind": "OGC", "original": "8", "placeholder": "[OGC]", "bbox_hint": null}
{"page": -1, "kind": "NOM_EXTRACTED", "original": "Atteste", "placeholder": "[NOM]", "bbox_hint": null}
{"page": -1, "kind": "NOM_EXTRACTED", "original": "FINESS", "placeholder": "[NOM]", "bbox_hint": null}
{"page": -1, "kind": "NOM_EXTRACTED", "original": "FINESS", "placeholder": "[NOM]", "bbox_hint": null}
{"page": -1, "kind": "NOM_EXTRACTED", "original": "FINESS", "placeholder": "[NOM]", "bbox_hint": null}
{"page": -1, "kind": "NOM_EXTRACTED", "original": "DAS", "placeholder": "[NOM]", "bbox_hint": null}
{"page": -1, "kind": "NOM_EXTRACTED", "original": "DAS", "placeholder": "[NOM]", "bbox_hint": null}
{"page": -1, "kind": "NOM_EXTRACTED", "original": "IGS", "placeholder": "[NOM]", "bbox_hint": null}
{"page": -1, "kind": "NOM_EXTRACTED", "original": "CMA", "placeholder": "[NOM]", "bbox_hint": null}
{"page": -1, "kind": "NOM_EXTRACTED", "original": "FSD", "placeholder": "[NOM]", "bbox_hint": null}
{"page": -1, "kind": "NOM_GLOBAL", "original": "FINESS", "placeholder": "[NOM]", "bbox_hint": null}
{"page": -1, "kind": "NOM_GLOBAL", "original": "MONREDON", "placeholder": "[NOM]", "bbox_hint": null}
{"page": -1, "kind": "NOM_GLOBAL", "original": "Gilles", "placeholder": "[NOM]", "bbox_hint": null}
{"page": -1, "kind": "NOM_GLOBAL", "original": "Atteste", "placeholder": "[NOM]", "bbox_hint": null}
{"page": -1, "kind": "NOM_GLOBAL", "original": "CMA", "placeholder": "[NOM]", "bbox_hint": null}
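Audit records of this shape could feed a residual-PII check of the kind the leak scanner is described as performing. The sketch below is an assumption about how such a check might look, not the contents of evaluation/leak_scanner.py: the function name `find_residual_pii`, the one-JSON-object-per-line layout, and the rule that records with an empty placeholder are markers rather than replacements are all hypothetical.

```python
import json


def find_residual_pii(audit_lines, anonymized_text):
    """Return the sorted `original` values that still occur in the output text.

    Assumptions: one JSON object per line, and records with an empty
    placeholder (such as OCR_USED) are markers rather than replacements.
    """
    originals = set()
    for line in audit_lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        if record["placeholder"]:
            originals.add(record["original"])
    # Plain substring test: simple, but over-reports very short originals.
    return sorted(o for o in originals if o in anonymized_text)


audit = [
    '{"page": -1, "kind": "OCR_USED", "original": "docTR", "placeholder": "", "bbox_hint": null}',
    '{"page": 0, "kind": "FINESS", "original": "640780417", "placeholder": "[FINESS]", "bbox_hint": null}',
    '{"page": 1, "kind": "AGE", "original": "Patient de 70 ans", "placeholder": "[AGE]", "bbox_hint": null}',
]
# Made-up anonymized output in which the age phrase was left unmasked.
text = "[MASK] FINESS [FINESS]. Patient de 70 ans, suivi en [MASK]."
leaks = find_residual_pii(audit, text)
```

Here `leaks` contains only the age phrase, since the FINESS number no longer appears in the text. Substring matching would flag a one-character original such as the OGC value "8" in almost any document, so a practical scanner would likely need word-boundary or position-aware matching.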