v1.0 - Stable release: multi-PC, UI-DETR-1 detection, 3 execution modes

- Frontend v4 reachable on the local network (192.168.1.40)
- Open ports: 3002 (frontend), 5001 (backend), 5004 (dashboard)
- Ollama running on the GPU
- Interactive self-healing
- Confidence dashboard

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-29 11:23:51 +01:00
parent 21bfa3b337
commit a27b74cf22
1595 changed files with 412691 additions and 400 deletions
--- a/docs/SESSION_25JAN2026_ETAT_AVANT_DEMOS.md
+++ b/docs/SESSION_25JAN2026_ETAT_AVANT_DEMOS.md
@@ -0,0 +1,160 @@
+# Session of 25 January 2026 - State Before the Demo Period
+
+> **IMPORTANT**: Do not modify anything during the demo period!
+
+---
+
+## 1. VWB SERVICES - CURRENT STATE
+
+| Service | Port | Status |
+|---------|------|--------|
+| Backend VWB | 5001 | ✅ Healthy |
+| Frontend VWB | 3000 | ✅ Running |
+| Ollama | 11434 | ✅ Available |
+| GPU RTX 5070 | - | ✅ Working |
+
+### Startup commands (if needed after a reboot)
+
+```bash
+# VWB backend (IMPORTANT: use the venv in backend/, which contains rfdetr)
+cd /home/dom/ai/rpa_vision_v3/visual_workflow_builder/backend
+source venv/bin/activate   # <-- venv in backend/, not the parent's!
+PORT=5001 FLASK_ENV=development python app.py
+
+# Preload the UI-DETR-1 model (optional but recommended)
+curl -X POST http://localhost:5001/api/ui-detection/preload
+
+# VWB frontend v4 (latest version)
+cd /home/dom/ai/rpa_vision_v3/visual_workflow_builder/frontend_v4
+npm run dev   # Port 3002, or 3003 if 3002 is in use
+
+# Check services
+cd /home/dom/ai/rpa_vision_v3/visual_workflow_builder
+./status.sh
+```
+
+---
+
+## 2. WARNING - MULTIPLE VENVS
+
+**IMPORTANT**: The VWB project contains several venvs:
+
+| Venv | Contents | Usage |
+|------|---------|-------|
+| `visual_workflow_builder/venv/` | Base dependencies | ❌ Does NOT contain rfdetr |
+| `visual_workflow_builder/backend/venv/` | rfdetr, torch, etc. | ✅ **Use this one for the backend** |
+
+If UI detection does not work, check that the backend is using the correct venv!
+
+---
+
+## 3. OLLAMA - GPU CHECK
+
+**Test run on 25/01/2026:**
+
+```
+NAME                ID              SIZE      PROCESSOR    CONTEXT
+moondream:latest    55fc3abd3867    1.9 GB    100% GPU     4096
+```
+
+**VRAM usage:**
+- Before loading: 594 MiB
+- After loading moondream: 3335 MiB / 12227 MiB
+
+**Conclusion**: Ollama is indeed using the RTX 5070 GPU.
+
+---
+
+## 4. WHAT WORKS (Documentation of 24 January)
+
+### Vision Matching Pipeline
+- ✅ UI-DETR-1 (UI element detection)
+- ✅ CLIP (semantic similarity)
+- ✅ Template Matching (OpenCV fallback)
+- ✅ Static Fallback (original coordinates)
+
+### Optimized Thresholds
+```python
+MAX_DISTANCE_PX = 120      # Reject elements farther than 120 px
+MIN_CLIP_SCORE = 0.55      # Minimum CLIP score
+MIN_COMBINED_SCORE = 0.5   # Minimum combined score
+MAX_TEMPLATE_DISTANCE = 150
+```
+
+### VWB Features
+- ✅ Basic / Intelligent / Debug modes
+- ✅ SeeClick integration (grounding)
+- ✅ Interactive self-healing
+- ✅ Real-time confidence dashboard
+- ✅ 12-step "OnlyOffice" workflow validated
+
+---
+
+## 5. PROPOSED IMPROVEMENTS (TO DO AFTER THE DEMOS)
+
+### High Priority
+1. **Model cache**: Load UI-DETR-1 and CLIP only once, at startup
+   - File concerned: `services/intelligent_executor.py`
+   - Impact: Significant reduction in execution time
+
+### Medium Priority
+2. **Intelligent Hybrid Mode**: Basic by default, Vision only on failure
+   - Performance gain for stable workflows
+
+3. **Post-Action Verification**: Compare screenshots before/after
+   - Automatic detection of failed actions
+
+### Low Priority
+4. **Legacy Code Cleanup**: Review the temporary hacks (RESUME_DEBUG_22JAN2026.md)
+   - Visual Search may have been temporarily disabled
+
+5. **Dockerization**: Containerize the system to ease deployment and maintenance
+   - See the detailed document: `docs/ROADMAP_DOCKERISATION.md`
+   - Phase 1: Web services (easy)
+   - Phase 2: GPU backend (medium)
+   - Phase 3: RPA agent (complex)
+
+---
+
+## 6. ROADMAP (From VISION_RPA_INTELLIGENT.md)
+
+### Done ✅
+- [x] VWB frontend v4 with React Flow
+- [x] Basic/Intelligent/Debug mode toggle
+- [x] UI-DETR-1 integration for detection
+- [x] Debug overlay (bounding-box display)
+- [x] Intelligent execution (template matching)
+- [x] Detection-zone selection on a fixed capture
+- [x] SeeClick integration as fallback (24 January)
+- [x] Interactive self-healing (24 January)
+- [x] Confidence dashboard (24 January)
+
+### To do (after the demos)
+- [ ] Training-data export (JSON format)
+- [ ] Learning from corrections (feedback loop)
+- [ ] Connection to the main engine (autonomous agents)
+
+---
+
+## 7. KEY FILES
+
+| File | Role |
+|---------|------|
+| `backend/app.py` | VWB backend entry point |
+| `backend/catalog_routes.py` | Catalog routes + execution |
+| `backend/catalog_routes_v2_vlm.py` | Ollama VLM integration |
+| `backend/services/intelligent_executor.py` | CLIP + Template vision pipeline |
+| `frontend/src/components/VisualSelector/index.tsx` | Visual anchor selection |
+
+---
+
+## 8. BACKUP
+
+Backup created on 25/01/2026:
+```
+/home/dom/ai/backups/rpa_vision_v3_backup_25jan2026.tar.gz
+```
+
+---
+
+*Document generated on 25 January 2026 - DO NOT MODIFY DURING THE DEMOS*
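
The thresholds listed in section 4 of the added document suggest a gating step that accepts or rejects a candidate match. Below is a minimal sketch of how such a gate might look. The `Candidate` type, the `accept` and `combined_score` helpers, and in particular the equal-weight combination formula are assumptions made for illustration; the document does not say how `intelligent_executor.py` actually combines the scores.

```python
import math
from dataclasses import dataclass

# Values copied from the "Optimized Thresholds" block of the document.
MAX_DISTANCE_PX = 120
MIN_CLIP_SCORE = 0.55
MIN_COMBINED_SCORE = 0.5


@dataclass
class Candidate:
    """Hypothetical detected UI element: center position and CLIP similarity."""
    x: float
    y: float
    clip_score: float


def combined_score(distance_px: float, clip_score: float) -> float:
    """Assumed formula: equal-weight mean of CLIP score and linear proximity."""
    proximity = max(0.0, 1.0 - distance_px / MAX_DISTANCE_PX)
    return 0.5 * clip_score + 0.5 * proximity


def accept(candidate: Candidate, anchor_xy: tuple[float, float]) -> bool:
    """Return True only if the candidate passes all three thresholds."""
    distance = math.hypot(candidate.x - anchor_xy[0], candidate.y - anchor_xy[1])
    if distance > MAX_DISTANCE_PX:
        return False  # too far from the recorded anchor position
    if candidate.clip_score < MIN_CLIP_SCORE:
        return False  # not semantically similar enough
    return combined_score(distance, candidate.clip_score) >= MIN_COMBINED_SCORE
```

Under these assumptions, a candidate 10 px from the anchor with a CLIP score of 0.9 is accepted, while one 200 px away is rejected regardless of its score. A candidate rejected here would presumably fall through to the next stage of the pipeline (template matching, then static coordinates).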