v1.0 - Stable release: multi-PC, UI-DETR-1 detection, 3 execution modes

- Frontend v4 accessible on the local network (192.168.1.40)
- Open ports: 3002 (frontend), 5001 (backend), 5004 (dashboard)
- Ollama GPU working
- Interactive self-healing
- Confidence dashboard

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-29 11:23:51 +01:00
parent 21bfa3b337
commit a27b74cf22
1595 changed files with 412691 additions and 400 deletions
--- a/README.md
+++ b/README.md
@@ -0,0 +1,207 @@
+# RPA Vision V3 - 100% Vision-Based Workflow Automation
+
+## 📊 Status
+
+🚀 **PRODUCTION-READY** - Phase 12 Complete (77% System Completion) ✅
+
+**Latest Update**: 14 December 2024
+- ✅ **10/13 Phases Completed** - Mature, functional system
+- ✅ **Exceptional Performance** - 500-6250x faster than required
+- ✅ **Enterprise Architecture** - 148k+ lines, 19 modules, 6 complete specs
+- ✅ **Technical Innovations** - Self-healing, multi-modal, GPU management
+- 📊 **Full Audit** - [Detailed report](AUDIT_COMPLET_SYSTEME_RPA_VISION_V3.md)
+
+**Quick Test**: `bash test_clip.sh`
+
+## 🎯 Vision
+
+RPA based on **semantic understanding** of interfaces, not on click coordinates.
+
+The system learns workflows by observing the user and automates them robustly through a 5-layer architecture.
+
+## 🏗️ 5-Layer Architecture
+
+```
+RawSession (Layer 0)
+    ↓
+ScreenState (Layer 1) - 4 abstraction levels
+    ↓
+UIElement Detection (Layer 2) - Types + semantic roles
+    ↓
+State Embedding (Layer 3) - Multi-modal fusion
+    ↓
+Workflow Graph (Layer 4) - Nodes + Edges + Learning States
+```
+
+## 📁 Structure
+
+```
+rpa_vision_v3/
+├── core/
+│   ├── models/          # Layers 0-4: data structures
+│   ├── capture/         # Layer 0: event capture + screenshots
+│   ├── detection/       # Layer 2: semantic UI detection
+│   ├── embedding/       # Layer 3: multi-modal fusion + FAISS
+│   ├── graph/           # Layer 4: construction + matching + execution
+│   └── persistence/     # Save/load
+├── data/
+│   ├── sessions/        # RawSessions
+│   ├── screen_states/   # ScreenStates
+│   ├── embeddings/      # .npy vectors
+│   ├── faiss_index/     # FAISS index
+│   └── workflows/       # Workflow Graphs
+└── tests/               # Unit + integration tests
+```
+
+## 🚀 Quick Start
+
+### Installation
+
+```bash
+# 1. Install Ollama
+curl -fsSL https://ollama.ai/install.sh | sh  # Linux
+# or
+brew install ollama  # macOS
+
+# 2. Start Ollama (runs in the foreground; use another terminal for the next steps)
+ollama serve
+
+# 3. Download the VLM model
+ollama pull qwen3-vl:8b
+
+# 4. Install Python dependencies
+pip install -r requirements.txt
+```
+
+### Quick Test
+
+```bash
+# System diagnostic
+python3 rpa_vision_v3/examples/diagnostic_vlm.py
+
+# Detection test
+./rpa_vision_v3/test_quick.sh
+```
+
+### Usage - UI Detection
+
+```python
+from rpa_vision_v3.core.detection import create_detector
+
+# Create the detector
+detector = create_detector()
+
+# Detect UI elements
+elements = detector.detect("screenshot.png")
+
+# Use the results
+for elem in elements:
+    print(f"{elem.type:15s} | {elem.role:20s} | {elem.label}")
+```
+
+### Usage - Workflow (Phase 4 - Coming Soon)
+
+```python
+from rpa_vision_v3.core.models import RawSession, ScreenState, Workflow
+from rpa_vision_v3.core.graph import GraphBuilder, NodeMatcher
+
+# 1. Capture a session
+session = RawSession(...)
+# ... capture events and screenshots
+
+# 2. Build the workflow automatically
+builder = GraphBuilder(...)
+workflow = builder.build_from_session(session)
+
+# 3. Match the current state
+matcher = NodeMatcher(...)
+current_state = ScreenState(...)
+match = matcher.match(current_state, workflow)
+
+# 4. Execute the action
+# NOTE: `executor` is not created in this snippet; it is assumed to be an
+# action executor instantiated elsewhere.
+if match:
+    edges = workflow.get_outgoing_edges(match.node.node_id)
+    if edges:  # guard against a node with no outgoing edge
+        executor.execute_edge(edges[0], current_state)
+```
+
+## 📚 Documentation
+
+### Main Guides
+- **Quick Start**: `QUICK_START.md` - Getting started
+- **Next Steps**: `NEXT_STEPS.md` - Roadmap and Phase 4
+- **Phase 3 Complete**: `PHASE3_COMPLETE.md` - Phase 3 summary
+
+### Technical Documentation
+- **Full spec**: `.kiro/specs/workflow-graph-implementation/`
+- **Architecture**: `docs/reference/ARCHITECTURE_VISION_COMPLETE.md`
+- **Hybrid Detection**: `HYBRID_DETECTION_SUMMARY.md`
+- **Ollama Integration**: `docs/OLLAMA_INTEGRATION.md`
+
+## 🎓 Key Concepts
+
+### 100% Vision-Based RPA
+
+- ❌ No fixed (x, y) coordinates
+- ✅ Semantic roles (primary_action, form_input, etc.)
+- ✅ Matching by visual and textual similarity
+- ✅ Robust to UI changes
+
+### Progressive Learning
+
+```
+OBSERVATION (5+ runs)
+    ↓
+COACHING (10+ assisted runs, success >90%)
+    ↓
+AUTO_CANDIDATE (20+ runs, success >95%)
+    ↓
+AUTO_CONFIRMÉ (user validation)
+```
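+
+The progression above can be read as a small state machine. The sketch below is illustrative only: `LearningState`, `Stats` and `next_state` are hypothetical names, not the project's API, and it assumes that the thresholds listed next to each state are the conditions required to leave that state.
+
+```python
+from dataclasses import dataclass
+from enum import Enum
+
+
+class LearningState(Enum):
+    OBSERVATION = "OBSERVATION"
+    COACHING = "COACHING"
+    AUTO_CANDIDATE = "AUTO_CANDIDATE"
+    AUTO_CONFIRMED = "AUTO_CONFIRMÉ"
+
+
+@dataclass
+class Stats:
+    runs: int = 0                 # runs counted in the current state
+    success_rate: float = 0.0     # fraction of successful runs, 0.0 to 1.0
+    user_validated: bool = False  # explicit approval from the user
+
+
+def next_state(state: LearningState, stats: Stats) -> LearningState:
+    """Return the next state, treating each threshold as an exit condition (assumption)."""
+    if state is LearningState.OBSERVATION and stats.runs >= 5:
+        return LearningState.COACHING
+    if (state is LearningState.COACHING
+            and stats.runs >= 10 and stats.success_rate > 0.90):
+        return LearningState.AUTO_CANDIDATE
+    if (state is LearningState.AUTO_CANDIDATE
+            and stats.runs >= 20 and stats.success_rate > 0.95
+            and stats.user_validated):
+        return LearningState.AUTO_CONFIRMED
+    return state  # thresholds not met: stay in the current state
+```
+
+With these assumptions, `next_state(LearningState.COACHING, Stats(runs=12, success_rate=0.93))` moves the workflow to `AUTO_CANDIDATE`, while a candidate without user validation stays where it is.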
+
+### State Embedding
+
+Multi-modal fusion, with the following weights:
+- 50% Image (full screenshot)
+- 30% Text (detected text)
+- 10% Title (window title)
+- 10% UI (detected elements)
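+
+A minimal sketch of how such a weighted fusion could be computed, assuming each modality has already been encoded into a vector of the same dimension. The names `WEIGHTS` and `fuse_embeddings`, the dictionary keys and the L2 normalization steps are illustrative assumptions, not the project's actual implementation.
+
+```python
+import numpy as np
+
+# Weights taken from the list above; the keys are hypothetical names.
+WEIGHTS = {"image": 0.5, "text": 0.3, "title": 0.1, "ui": 0.1}
+
+
+def _unit(v):
+    """Scale v to unit L2 norm; a zero vector is returned unchanged."""
+    v = np.asarray(v, dtype=np.float32)
+    n = np.linalg.norm(v)
+    return v / n if n > 0 else v
+
+
+def fuse_embeddings(parts):
+    """Weighted sum of per-modality unit vectors, re-normalized to unit length."""
+    missing = set(WEIGHTS) - set(parts)
+    if missing:
+        raise KeyError(f"missing modalities: {sorted(missing)}")
+    fused = sum(w * _unit(parts[name]) for name, w in WEIGHTS.items())
+    return _unit(fused)
+
+
+# Usage with random stand-in vectors (real ones would come from the encoders).
+rng = np.random.default_rng(0)
+state_vector = fuse_embeddings({name: rng.normal(size=8) for name in WEIGHTS})
+```
+
+Normalizing each part before weighting keeps a modality with a large raw norm from dominating regardless of its weight, and a unit-length result is convenient for cosine-similarity search.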
+
+## 🧪 Tests
+
+```bash
+# Unit tests
+pytest tests/unit/
+
+# Integration tests
+pytest tests/integration/
+
+# Performance tests (--benchmark-only requires the pytest-benchmark plugin)
+pytest tests/performance/ --benchmark-only
+```
+
+## 📈 Roadmap - 77% Complete (10/13 Phases)
+
+### ✅ **Completed Phases**
+- [x] **Phase 1-2**: Foundations + FAISS Embeddings ✅
+- [x] **Phase 4-6**: UI Detection + Workflow Graphs + Action Execution ✅
+- [x] **Phase 7-8**: Learning System + Training System ✅
+- [x] **Phase 10-12**: GPU Management + Performance + Monitoring ✅
+
+### 🎯 **Remaining Phases**
+- [ ] **Phase 3**: Final Checkpoint (storage tests)
+- [ ] **Phase 9**: Visual Workflow Builder (90% → 100%)
+- [ ] **Phase 13**: End-to-End Tests + final documentation
+
+### 🚀 **Composants Production-Ready**
+- **Agent V0** : Capture cross-platform + Encryption ✅
+- **Server API** : Processing pipeline + Web dashboard ✅
+- **Analytics System** : Monitoring + Insights + Reporting ✅
+- **Self-Healing** : Automatic adaptation + Recovery ✅
+
+## 🤝 Contribution
+
+See `.kiro/specs/workflow-graph-implementation/tasks.md` for the tasks in progress.
+
+## 📄 License
+
+Proprietary - All rights reserved