refactor: reorganize reference data, add extraction modules, remove obsolete code

- Reorganize data/referentiels/ into pdfs/, dicts/, user/ (unified structure)
- Fix "Source absente" badges on the reference-data admin page
- Re-index COCOA 2025 (555 → 1451 chunks, 94% coverage)
- Fix VRAM OOM: force embeddings onto the CPU via T2A_EMBED_CPU
- New modules: document_router, docx_extractor, image_extractor, ocr_engine
- Completeness module (quality/completude.py + YAML config)
- DIM template (dimensional summary)
- Gunicorn config + systemd service t2a-viewer
- Remove t2a_install_rag_cleanup/ (obsolete copy)
- Remove scripts/ and scripts_t2a_v2/ (old benchmarks)
- Remove 81 test _doc.txt files
- Ollama cache: configurable TTL, YAML loader fixes
- Dashboard: template improvements (base, index, detail, cpam, validation)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 16:48:10 +01:00
parent 2578afb6ff
commit 4e2b4bd946
210 changed files with 6939 additions and 22104 deletions
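Review note on the T2A_EMBED_CPU change: the patched line treats any non-empty value as "force CPU", so `T2A_EMBED_CPU=0` would also select the CPU. If that is not intended, a stricter parse could look like the sketch below. The helper names `env_flag` and `pick_embed_device` are hypothetical and not part of this commit; only the variable name T2A_EMBED_CPU comes from it. CUDA availability is passed in as a plain boolean so the sketch runs without torch installed.

```python
import os

# Values treated as "flag not set". This list is an assumption for the sketch,
# not behavior taken from the commit.
_FALSY = {"", "0", "false", "no", "off"}


def env_flag(name, environ=None):
    """Return True when the variable is set to something other than a falsy marker."""
    env = os.environ if environ is None else environ
    return env.get(name, "").strip().lower() not in _FALSY


def pick_embed_device(cuda_available, environ=None):
    """Choose the embedding device: the CPU override wins, then CUDA if present."""
    if env_flag("T2A_EMBED_CPU", environ):
        return "cpu"
    return "cuda" if cuda_available else "cpu"
```

In `_get_embed_model()` this would be called as `pick_embed_device(torch.cuda.is_available())`. With the code as committed, the equivalent way to re-enable the GPU is to unset the variable rather than set it to 0.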
--- a/src/medical/rag_search.py
+++ b/src/medical/rag_search.py
@@ -3,6 +3,7 @@
 from __future__ import annotations

 import logging
+import os
 import threading
 from concurrent.futures import ThreadPoolExecutor, as_completed

@@ -56,7 +57,7 @@ def _get_embed_model():
            raise RuntimeError("Modèle d'embedding indisponible (échec précédent)")
        from sentence_transformers import SentenceTransformer
        import torch
-        _device = "cuda" if torch.cuda.is_available() else "cpu"
+        _device = "cpu" if os.environ.get("T2A_EMBED_CPU") else ("cuda" if torch.cuda.is_available() else "cpu")
        _model_kwargs = {"low_cpu_mem_usage": False}
        try:
            logger.info("Chargement du modèle d'embedding (%s)...", _device)