t2a_v2

Dom/t2a_v2

Author	SHA1	Message	Date
dom	2578afb6ff	chore: add .gitignore	2026-03-05 00:37:41 +01:00
dom	63354e75bc	tests: dp_finalizer, 20 tests covering R1-R5 + pass-through + quality_flags + serialization - TestR1CrhConfirmedOverridesTrackare (2 tests: override + consistent) - TestR2TrackareCorroborated (2 tests: exact + family3) - TestR3TrackareSymptom (3 tests: override, cautious review, weak evidence) - TestR4Ambiguous (1 test) - TestR5Interdictions (4 tests: Z-code, Z-whitelist, R-code, allow_symptom) - TestPassThrough (3 tests: CRH-only, Trackare-only, no DP) - TestFinalizeDp (5 tests: flags merge, alerts append, sources set, serialization). 1063 tests pass, 0 regressions. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-24 17:50:32 +01:00
dom	cad0dd22b1	tests: DLBCL alias + Trackare guardrail + e2e on real PDFs + gold CRH + extended benchmark - 11 unit tests: TestAliasAndConclusionBonus (7) + TestTrackareSymptomGuard (4) - e2e tests on real PDFs (skipped if absent): meningitis A87.0 + DLBCL C83.3 top1 - Gold CRH extended: 5 cases (2 real cases added: 115_23066188, 132_23080179) - Synthesis benchmark: conclusion recovered from the source_excerpt of DAS/treatments - .gitignore: anti-PHI protection (real_crh_pdfs/, data/crh_samples/*.pdf) - docs/PHI_POLICY.md: 7 PHI security rules - Debug reports: case 132 REVIEW (guardrail active), top errors, DIM pack. 1043 tests pass, 0 regressions. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-24 14:35:57 +01:00
dom	909e051cc9	feat: multi-model LLM architecture + quality engine + benchmark - Multi-model: 4 LLM roles (coding=gemma3:27b-cloud, cpam=gemma3:27b-cloud, validation=deepseek-v3.2:cloud, qc=gemma3:12b) with get_model(role) - Externalized prompts: 7 templates in src/prompts/templates.py - Ollama cache: model stored per entry (automatic migration from the old format) - call_ollama(): role= parameter (priority: model > role > global) - Quality engine: veto_engine + decision_engine + rules_router (YAML) - Quality benchmark: scripts/benchmark_quality.py (A/B, ICD-10 (CIM-10) metrics) - Biology fix: qualitative values (negative troponin) no longer filtered out - CPAM fix: gemma3:27b-cloud instead of deepseek (JSON truncated by thinking) - CPAM max_tokens 4000→6000, multi-model admin viewer - Benchmark on 10 records: 100% valid DAS, 10/10 CPAM, 243s/record. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-20 00:21:09 +01:00
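
The model selection order named in commit 909e051cc9 (model > role > global) can be sketched as follows. This is only an illustration under stated assumptions, not the repository's code: the role-to-model mapping is copied from the commit message, while `GLOBAL_MODEL`, `resolve_model`, and the body of `get_model` are hypothetical, since the commit does not show the implementation or name the global default.

```python
from typing import Optional

# Role-to-model mapping as listed in the commit message.
ROLE_MODELS = {
    "coding": "gemma3:27b-cloud",
    "cpam": "gemma3:27b-cloud",
    "validation": "deepseek-v3.2:cloud",
    "qc": "gemma3:12b",
}

# Hypothetical global fallback; the commit does not state which model it is.
GLOBAL_MODEL = "gemma3:12b"


def get_model(role: Optional[str] = None) -> str:
    """Return the model configured for a role, or the global default."""
    if role is None:
        return GLOBAL_MODEL
    return ROLE_MODELS.get(role, GLOBAL_MODEL)


def resolve_model(model: Optional[str] = None, role: Optional[str] = None) -> str:
    """Apply the stated priority: explicit model, then role, then global."""
    if model:
        return model
    return get_model(role)
```

Under these assumptions, `resolve_model(model="custom", role="qc")` picks the explicit model, `resolve_model(role="validation")` picks the role's model, and `resolve_model()` falls back to the global default.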

4 Commits