feat(extract): normalise ghs_injustifie to 0/1 (P2)

Qwen typically returns the full label `0 SE 1 2 3 4 ATU FFM FSD`
in the ghs_injustifie field, whereas a single 0/1 value is expected.
Add `pipeline.checkboxes.parse_ghs_injustifie`, which extracts the
leading 0/1 digit via regex, or returns "" if the value is unreadable.
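
For illustration, a minimal sketch of what such a parser could look like.
The real body of `parse_ghs_injustifie` is not shown in this commit view, so
the pattern below is an assumption. It is anchored at the start of the value
because the hunks below map `SE 1 2 3 4 ATU FFM FSD` (no leading digit) to
"" rather than to "1".

```python
import re

# Assumed pattern (not taken from the commit): a lone 0 or 1 at the very
# start of the value, optionally preceded by whitespace, and not followed
# by another word character (so "10" or "1A" do not count).
_LEADING_01 = re.compile(r"^\s*([01])(?!\w)")


def parse_ghs_injustifie(raw):
    """Return "0" or "1" if the value starts with that digit, else ""."""
    text = "" if raw is None else str(raw)
    match = _LEADING_01.match(text)
    return match.group(1) if match else ""


if __name__ == "__main__":
    for sample in ("0 SE 1 2 3 4 ATU FFM FSD", "SE 1 2 3 4 ATU FFM FSD", "1", ""):
        print(repr(sample), "->", repr(parse_ghs_injustifie(sample)))
```

Anchoring the match keeps the digits that belong to the `SE 1 2 3 4` label
from being mistaken for the checkbox value.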

Post-processing is applied to every recueil extraction and to the 18
existing V2 JSON files (10 files fixed in place; the other 8 already
had ghs_injustifie absent or empty).
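
A hedged sketch of how the in-place pass over existing JSON files might be
written. The layout of the V2 files is not visible here, so the code walks
the whole tree and rewrites every `ghs_injustifie` key it finds. The names
`normalise_tree` and `normalise_file` are hypothetical, and the parser is
the same assumed one as in the sketch of `parse_ghs_injustifie`.

```python
import json
import re
from pathlib import Path

# Assumed pattern: a lone leading 0 or 1 (see the hedging note above).
_LEADING_01 = re.compile(r"^\s*([01])(?!\w)")


def parse_ghs_injustifie(raw):
    """Return "0" or "1" if the value starts with that digit, else ""."""
    text = "" if raw is None else str(raw)
    match = _LEADING_01.match(text)
    return match.group(1) if match else ""


def normalise_tree(node):
    """Normalise every ghs_injustifie value in place; return True if any changed."""
    changed = False
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "ghs_injustifie":
                fixed = parse_ghs_injustifie(value)
                if fixed != value:
                    node[key] = fixed  # replacing a value is safe while iterating
                    changed = True
            else:
                changed = normalise_tree(value) or changed
    elif isinstance(node, list):
        for item in node:
            changed = normalise_tree(item) or changed
    return changed


def normalise_file(path):
    """Rewrite one JSON file only if something changed; return True if rewritten."""
    path = Path(path)
    data = json.loads(path.read_text(encoding="utf-8"))
    if not normalise_tree(data):
        return False  # field absent, empty, or already 0/1: file left untouched
    path.write_text(
        json.dumps(data, ensure_ascii=False, indent=2) + "\n", encoding="utf-8"
    )
    return True
```

Files whose field is absent, empty, or already normalised are not rewritten,
which matches the 10-of-18 split reported above.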

Note on the 7 checkboxes SE1-4/ATU/FFM/FSD: the zones are too small to
calibrate by eye, and the 2018 sample contains no positive case
(`ghs_injustifie=1`) to validate against visually. Detection is a
placeholder for now, to be recalibrated on a real positive case.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Dom
2026-04-24 15:54:16 +02:00
parent 7dc3eba1fc
commit 7d45018139
12 changed files with 152 additions and 14 deletions

@@ -282,7 +282,7 @@
"ghm_reco": "06M094",
"ghs_reco": "2161",
"recodage_impactant": "1",
"ghs_injustifie": "0 SE 1 2 3 4 ATU FFM FSD",
"ghs_injustifie": "0",
"praticien_conseil": "DR JP VIGNAU",
"accord_desaccord": "accord",
"_checkbox_debug": {

@@ -317,7 +317,7 @@
"ghm_reco": "10M183",
"ghs_reco": "3969",
"recodage_impactant": "1",
"ghs_injustifie": "0 SE 1 2 3 4 ATU FFM FSD",
"ghs_injustifie": "0",
"praticien_conseil": "DR VIGNAU",
"accord_desaccord": "désaccord",
"_checkbox_debug": {

@@ -215,7 +215,7 @@
"ghm_reco": "06C042",
"ghs_reco": "1940",
"recodage_impactant": "1",
"ghs_injustifie": "0 SE 1 2 3 4 ATU FFM FSD",
"ghs_injustifie": "0",
"praticien_conseil": "DR VIGNAÚ",
"accord_desaccord": "désaccord",
"_checkbox_debug": {

@@ -304,7 +304,7 @@
"ghm_reco": "01C061",
"ghs_reco": "34",
"recodage_impactant": "1",
"ghs_injustifie": "0 SE 1 2 3 4 ATU FFM FSD",
"ghs_injustifie": "0",
"praticien_conseil": "",
"accord_desaccord": "désaccord",
"_checkbox_debug": {

@@ -330,7 +330,7 @@
"ghm_reco": "03M112",
"ghs_reco": "861",
"recodage_impactant": "1",
"ghs_injustifie": "0 SE 1 2 3 4 ATU FFM FSD",
"ghs_injustifie": "0",
"praticien_conseil": "DR VIGNAU",
"accord_desaccord": "accord",
"_checkbox_debug": {

@@ -371,7 +371,7 @@
"ghm_reco": "04M093",
"ghs_reco": "1163",
"recodage_impactant": "1",
"ghs_injustifie": "0 SE 1 2 3 4 ATU FFM FSD",
"ghs_injustifie": "0",
"praticien_conseil": "DR VIGNAU",
"accord_desaccord": "désaccord",
"_checkbox_debug": {

@@ -298,7 +298,7 @@
"ghm_reco": "1947",
"ghs_reco": "06C071",
"recodage_impactant": "1",
"ghs_injustifie": "SE 1 2 3 4 ATU FFM FSD",
"ghs_injustifie": "",
"praticien_conseil": "DR VIGNAU",
"accord_desaccord": "accord",
"_checkbox_debug": {

@@ -324,7 +324,7 @@
"ghm_reco": "23Z02Z",
"ghs_reco": "7992",
"recodage_impactant": "1",
"ghs_injustifie": "0 SE 1 2 3 4 ATU FFM FSD",
"ghs_injustifie": "0",
"praticien_conseil": "DR VIGNAU",
"accord_desaccord": "accord",
"_checkbox_debug": {

@@ -298,7 +298,7 @@
"ghm_reco": "04M092",
"ghs_reco": "1162",
"recodage_impactant": "1",
"ghs_injustifie": "0 SE 1 2 3 4 ATU FFM FSD",
"ghs_injustifie": "0",
"praticien_conseil": "DR VIGNAU",
"accord_desaccord": "désaccord",
"_checkbox_debug": {

@@ -201,7 +201,7 @@
"ghm_reco": "23Z02Z",
"ghs_reco": "7992",
"recodage_impactant": "1",
"ghs_injustifie": "SE 1 2 3 4 ATU FFM FSD",
"ghs_injustifie": "",
"praticien_conseil": "DR VIGNAU",
"accord_desaccord": "accord",
"_checkbox_debug": {