v1.0 - Stable release: multi-PC, UI-DETR-1 detection, 3 execution modes

- Frontend v4 reachable on the local network (192.168.1.40)
- Open ports: 3002 (frontend), 5001 (backend), 5004 (dashboard)
- Ollama working on GPU
- Interactive self-healing
- Confidence dashboard

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-29 11:23:51 +01:00
parent 21bfa3b337
commit a27b74cf22
1595 changed files with 412691 additions and 400 deletions
--- a/QUICK_START.md
+++ b/QUICK_START.md
@@ -0,0 +1,163 @@
+# Quick Start - Hybrid UI Detection
+
+## Installation
+
+### 1. Install Ollama
+
+```bash
+# Linux
+curl -fsSL https://ollama.ai/install.sh | sh
+
+# macOS
+brew install ollama
+```
+
+### 2. Start Ollama
+
+```bash
+ollama serve
+```
+
+### 3. Download the VLM model
+
+```bash
+ollama pull qwen3-vl:8b
+```
+
+## Usage
+
+### Quick Test
+
+```bash
+./rpa_vision_v3/test_quick.sh
+```
+
+### Programmatic Usage
+
+```python
+from rpa_vision_v3.core.detection import create_detector
+
+# Create the detector
+detector = create_detector()
+
+# Detect the elements
+elements = detector.detect("screenshot.png")
+
+# Use the results
+for elem in elements:
+    print(f"{elem.type:15s} | {elem.role:20s} | {elem.label}")
+```
+
+### Full Example
+
+```python
+from rpa_vision_v3.core.detection import UIDetector, DetectionConfig
+
+# Custom configuration
+config = DetectionConfig(
+    vlm_model="qwen3-vl:8b",
+    confidence_threshold=0.7,
+    min_region_size=10,
+    max_region_size=600,
+    use_vlm_classification=True
+)
+
+# Create the detector
+detector = UIDetector(config)
+
+# Detect, passing optional context about the active window
+elements = detector.detect("screenshot.png", window_context={
+    "title": "My Application",
+    "process": "myapp"
+})
+
+# Filter by type
+buttons = [e for e in elements if e.type == "button"]
+text_inputs = [e for e in elements if e.type == "text_input"]
+
+print(f"Found {len(buttons)} buttons and {len(text_inputs)} text inputs")
+```
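
The filtering step above generalizes to grouping by any attribute. The following is a minimal sketch, assuming each detected element exposes `type`, `role`, and `label` as in the examples above. The `Element` dataclass and `group_by_type` helper are hypothetical stand-ins so the snippet runs without the library; they are not the library's own classes.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Element:
    """Hypothetical stand-in for a detected element (not the library's class)."""
    type: str
    role: str
    label: str


def group_by_type(elements):
    """Return a dict mapping each element type to the list of its elements."""
    groups = defaultdict(list)
    for elem in elements:
        groups[elem.type].append(elem)
    return dict(groups)


# Sample data standing in for the output of detector.detect(...)
sample = [
    Element("button", "submit", "Save"),
    Element("button", "cancel", "Cancel"),
    Element("text_input", "search_field", "Search"),
]

groups = group_by_type(sample)
for elem_type, elems in groups.items():
    print(f"{elem_type:15s} | {len(elems)} element(s)")
```

Grouping once and reading from the dictionary avoids re-scanning the full list for every element type.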
+
+## Available Tests
+
+```bash
+# Full test with validation
+python3 rpa_vision_v3/examples/test_complete_real.py
+
+# Basic hybrid test
+python3 rpa_vision_v3/examples/test_hybrid_detection.py screenshot.png
+
+# Simple VLM test
+python3 rpa_vision_v3/examples/test_real_vlm_detection.py
+```
+
+## Performance
+
+- **OpenCV detection:** ~10 ms
+- **VLM classification:** ~1-2 s per element
+- **Total:** ~30-60 s for 20-50 elements
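
As a rough back-of-the-envelope check on the figures above, here is a small sketch that assumes a single OpenCV pass followed by strictly serial VLM calls. The function name and the serial model are assumptions made for illustration; the actual pipeline may batch, cache, or skip elements.

```python
def estimate_total_seconds(n_elements, vlm_seconds_per_element, opencv_seconds=0.010):
    """Estimate run time, assuming one OpenCV pass and serial VLM calls."""
    return opencv_seconds + n_elements * vlm_seconds_per_element


# Best case: 20 elements at 1 s each; worst case: 50 elements at 2 s each.
low = estimate_total_seconds(20, 1.0)
high = estimate_total_seconds(50, 2.0)
print(f"Estimated range: {low:.0f}-{high:.0f} s")
```

Under this simple model the range is roughly 20 s to 100 s, which brackets the ~30-60 s quoted above. VLM classification dominates the total, so the number of elements sent to the model is the main lever on run time.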
+
+## Detected Element Types
+
+- `button` - Buttons
+- `text_input` - Text inputs
+- `checkbox` - Checkboxes
+- `radio` - Radio buttons
+- `dropdown` - Dropdown lists
+- `tab` - Tabs
+- `link` - Links
+- `icon` - Icons
+- `menu_item` - Menu items
+
+## Semantic Roles
+
+- `primary_action` - Primary action
+- `cancel` - Cancel
+- `submit` - Submit
+- `form_input` - Form input
+- `search_field` - Search field
+- `navigation` - Navigation
+- `settings` - Settings
+- `close` - Close
+
+## Troubleshooting
+
+### Ollama unavailable
+
+```bash
+# Check the service
+systemctl status ollama  # Linux
+brew services list  # macOS
+
+# Restart
+ollama serve
+```
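
The same check can be done from Python before creating a detector. This is a minimal sketch, assuming Ollama listens on its default address `http://localhost:11434` and answers `GET /api/tags`; the helper name `ollama_is_up` is hypothetical and not part of `rpa_vision_v3`.

```python
import urllib.error
import urllib.request


def ollama_is_up(base_url="http://localhost:11434", timeout=2.0):
    """Return True if an HTTP GET on /api/tags succeeds, False otherwise."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status
        return False


if __name__ == "__main__":
    print("Ollama reachable:", ollama_is_up())
```

Failing fast with a clear message here is usually easier to diagnose than a timeout in the middle of a detection run.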
+
+### Model not found
+
+```bash
+ollama list
+ollama pull qwen3-vl:8b
+```
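
To check for the model programmatically, the decoded response of `GET /api/tags` can be inspected. The sketch below assumes the payload has a top-level `models` list whose entries carry a `name` field; the helper names and the sample payload are illustrative, not taken from this project.

```python
def installed_model_names(tags_payload):
    """Extract model names from a decoded /api/tags JSON payload."""
    return [model["name"] for model in tags_payload.get("models", [])]


def has_model(tags_payload, wanted):
    """Return True if the wanted model tag appears in the payload."""
    return wanted in installed_model_names(tags_payload)


# Sample payload for illustration only; real responses carry more fields.
# A live payload could be obtained with json.load(urllib.request.urlopen(...)).
sample_payload = {
    "models": [
        {"name": "qwen3-vl:8b"},
        {"name": "granite3.2-vision:2b"},
    ]
}

print(has_model(sample_payload, "qwen3-vl:8b"))
```

If the check returns `False`, running `ollama pull` for the missing tag as shown above should resolve it.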
+
+### Slow detection
+
+- Reduce `max_elements` in the config
+- Use a faster model (granite3.2-vision:2b)
+- Raise `confidence_threshold` to filter out more candidates
+
+### Few elements detected
+
+- Lower `confidence_threshold` (e.g. 0.5)
+- Reduce `min_region_size` (e.g. 10)
+- Increase `max_region_size` (e.g. 600)
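
How these three settings interact can be pictured with a small filter. This is a sketch under stated assumptions, not the library's implementation: it assumes `confidence_threshold` is a lower bound on confidence and that `min_region_size` and `max_region_size` bound both width and height in pixels. The `Candidate` class and `keep` function are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical candidate region (illustrative only)."""
    width: int
    height: int
    confidence: float


def keep(c, confidence_threshold=0.7, min_region_size=10, max_region_size=600):
    """Assumed semantics: keep regions inside the size bounds and above the threshold."""
    size_ok = (
        min_region_size <= c.width <= max_region_size
        and min_region_size <= c.height <= max_region_size
    )
    return size_ok and c.confidence >= confidence_threshold


candidates = [
    Candidate(80, 24, 0.92),   # typical button
    Candidate(8, 8, 0.95),     # smaller than min_region_size
    Candidate(120, 30, 0.55),  # below the default threshold
]

strict = [c for c in candidates if keep(c)]
relaxed = [c for c in candidates if keep(c, confidence_threshold=0.5)]
print(len(strict), len(relaxed))
```

Under these assumptions, lowering the threshold recovers the low-confidence candidate, while the undersized region stays excluded until `min_region_size` is reduced as well.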
+
+## Documentation
+
+- [Implementation summary](HYBRID_DETECTION_SUMMARY.md)
+- [Ollama integration](docs/OLLAMA_INTEGRATION.md)
+- [Full architecture](docs/specs/design.md)
+
+## Support
+
+For more help, see the examples in `rpa_vision_v3/examples/`.