refactor: reorganize reference data, add extraction modules, remove obsolete code

- Reorganize data/referentiels/ into pdfs/, dicts/, user/ (unified structure)
- Fix "Source absente" badges on the referentiels admin page
- Re-index COCOA 2025 (555 → 1451 chunks, 94% coverage)
- Fix VRAM OOM: embeddings forced onto CPU via T2A_EMBED_CPU
- New modules: document_router, docx_extractor, image_extractor, ocr_engine
- Completeness module (quality/completude.py + YAML config)
- DIM template (dimensional summary)
- Gunicorn config + systemd service t2a-viewer
- Remove t2a_install_rag_cleanup/ (obsolete copy)
- Remove scripts/ and scripts_t2a_v2/ (old benchmarks)
- Remove 81 _doc.txt test files
- Ollama cache: configurable TTL, YAML loader fixes
- Dashboard: template improvements (base, index, detail, cpam, validation)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
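
For readers unfamiliar with the VRAM fix mentioned above, the following is a minimal sketch of how an environment flag such as T2A_EMBED_CPU could select the embedding device. Only the variable name comes from the commit message; the function name `embedding_device`, the accepted truthy values, and the "cuda" fallback are assumptions for illustration, not the project's actual code.

```python
import os


def embedding_device(env=None):
    """Return "cpu" when T2A_EMBED_CPU is truthy, otherwise "cuda".

    Hypothetical helper: the real project may parse the flag differently.
    """
    # Allow a plain dict to be passed in so the logic can be tested
    # without touching the real process environment.
    env = os.environ if env is None else env
    flag = env.get("T2A_EMBED_CPU", "").strip().lower()
    # Assumed set of truthy spellings; anything else keeps the GPU default.
    return "cpu" if flag in {"1", "true", "yes", "on"} else "cuda"
```

Passing the environment as a parameter keeps the decision testable; in production the call would simply be `embedding_device()`.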
@@ -192,7 +192,7 @@ class TestSplitDocuments:
 # --- process_pdf integration test ---
 
 class TestProcessPdfMulti:
-    @patch("src.main.extract_text_with_pages")
+    @patch("src.main.extract_document_with_pages")
     @patch("src.main.extract_medical_info")
     @patch("src.main._run_edsnlp", return_value=None)
     @patch("src.main._use_edsnlp", False)
@@ -203,9 +203,14 @@ class TestProcessPdfMulti:
         from src.main import process_pdf
         from src.config import DossierMedical, Diagnostic
         from src.extraction.page_tracker import PageTracker
+        from src.extraction.pdf_extractor import ExtractionStats
 
-        # Mock extract_text_with_pages returning a multi-episode Trackare text
-        mock_extract.return_value = (TRACKARE_MULTI, PageTracker([(0, len(TRACKARE_MULTI))]))
+        # Mock extract_document_with_pages returning a multi-episode Trackare text
+        mock_extract.return_value = (
+            TRACKARE_MULTI,
+            PageTracker([(0, len(TRACKARE_MULTI))]),
+            ExtractionStats(total_pages=1, chars_per_page=[len(TRACKARE_MULTI)], total_chars=len(TRACKARE_MULTI)),
+        )
 
         # Mock extract_medical_info returning a minimal DossierMedical
         mock_medical.return_value = DossierMedical(
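
The test change above follows from the extractor now returning three values instead of two. As a rough, self-contained sketch of that pattern: the `ExtractionStats` stand-in below only copies the field names visible in the diff, and `process`, the plain list used in place of `PageTracker`, and the sample text are hypothetical, not the project's code.

```python
from dataclasses import dataclass
from unittest.mock import Mock


@dataclass
class ExtractionStats:
    """Stand-in with the three fields that appear in the diff."""
    total_pages: int
    chars_per_page: list
    total_chars: int


def process(extract, path):
    """Hypothetical caller that unpacks the three-value result."""
    text, tracker, stats = extract(path)
    return {"text": text, "pages": stats.total_pages, "chars": stats.total_chars}


sample = "episode 1\nepisode 2"
# The mock must return a 3-tuple; a 2-tuple would make the unpacking
# inside process() raise ValueError.
mock_extract = Mock(
    return_value=(
        sample,
        [(0, len(sample))],  # placeholder for a PageTracker instance
        ExtractionStats(
            total_pages=1,
            chars_per_page=[len(sample)],
            total_chars=len(sample),
        ),
    )
)

result = process(mock_extract, "dummy.pdf")
```

Any test still returning the old two-element tuple would fail at the unpacking step, which is presumably why the mock's return value had to change along with the patched name.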