f4a23a5f43e57f75cd473462b8fb9357a10bb2bd
P0-A: French stop words + 5-character threshold for subparts + conditional sweep
P0-B: 6 new PHI patterns (DDN, Par, N Ipp, Adresse, DEMANDE, venue)
P2-C: pseudonym consistency (_find_matching_entity) + bracket fix
P1-B: text_cleaner.py: OCR sidebar, footers, vitals dedup, collapse blanks
P1-A: CRH dedup via SequenceMatcher (85% threshold)
Tests: 34 new tests (996 pass, 0 fail)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
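
The P1-A item names SequenceMatcher with an 85% threshold but does not show how it is applied. The following is a minimal sketch of near-duplicate filtering with the standard library's `difflib.SequenceMatcher`, not the commit's actual code: the function name `dedup_similar`, the keep-first policy, and comparing whole strings (rather than normalized or chunked text) are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def dedup_similar(texts, threshold=0.85):
    """Drop texts that are near-duplicates of an earlier, already kept text.

    Hypothetical helper: the name, the keep-first policy, and whole-string
    comparison are illustrative assumptions, not taken from the repository.
    """
    kept = []
    for text in texts:
        # ratio() returns a similarity score in [0, 1]; 1.0 means identical.
        is_duplicate = any(
            SequenceMatcher(None, text, earlier).ratio() >= threshold
            for earlier in kept
        )
        if not is_duplicate:
            kept.append(text)
    return kept


reports = [
    "Patient admitted for chest pain on Monday.",
    "Patient admitted for chest pain on Monday!",
    "Discharge planned after normal ECG.",
]
unique_reports = dedup_similar(reports)
```

Each candidate is compared against every text already kept, so the cost grows quadratically with the number of documents. That is usually acceptable for a handful of reports per patient, but a large corpus would need a cheaper pre-filter.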
Languages
Python 87.3%
HTML 12.3%
Shell 0.4%