v1.0 - Stable release: multi-PC, UI-DETR-1 detection, 3 execution modes

- Frontend v4 reachable on the local network (192.168.1.40)
- Open ports: 3002 (frontend), 5001 (backend), 5004 (dashboard)
- Ollama running on GPU
- Interactive self-healing
- Confidence dashboard

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-29 11:23:51 +01:00
parent 21bfa3b337
commit a27b74cf22
1595 changed files with 412691 additions and 400 deletions
--- a/docs/archive/misc/agent/AGENT_V0_ANALYSIS.md
+++ b/docs/archive/misc/agent/AGENT_V0_ANALYSIS.md
@@ -0,0 +1,418 @@
+# Agent V0 - Analysis and Validation ✅
+
+**Date:** November 24, 2025  
+**Status:** Architecture validated, recommendations provided
+
+## 📋 Overview
+
+Agent V0 is a lightweight **cross-platform UI recorder** that captures user interactions and sends them to the RPA Vision V3 server for training.
+
+### Goal
+Let trainers (Windows/macOS/Linux) record their workflows without needing the full RPA Vision V3 system.
+
+### Architecture
+```
+Agent V0 (Trainer workstation)
+    ↓ Capture
+    ├─ Mouse clicks
+    ├─ Keyboard combos
+    ├─ Wheel scroll
+    ├─ Hover (mouse idle)
+    └─ Screenshots (full/crop)
+    ↓ Package
+    RawSession JSON + Screenshots ZIP
+    ↓ Upload
+RPA Vision V3 server (Linux)
+```
+
+## ✅ Strengths
+
+### 1. **Full Compatibility with RPA Vision V3**
+- ✅ Uses the `rawsession_v1` format, identical to `core/models/raw_session.py`
+- ✅ Compatible Event/Screenshot/WindowContext structure
+- ✅ Relative timestamps (`t` in seconds since session start)
+- ✅ Complete metadata (environment, user, context)
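As an illustration of the compatibility points above, a `rawsession_v1` payload could be modeled roughly as follows. This is a hypothetical sketch: apart from `schema_version`, `events`, `t`, and `screenshot_id`, which appear elsewhere in this document, the field names and the `relative_time` helper are assumptions and not the actual `core/models/raw_session.py` definition.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class Event:
    t: float                              # seconds since session start
    type: str                             # assumed labels: "click", "scroll", "hover", ...
    screenshot_id: Optional[str] = None   # set when a screenshot accompanies the event


@dataclass
class RawSession:
    session_id: str
    schema_version: str = "rawsession_v1"
    events: List[Event] = field(default_factory=list)

    def to_json(self) -> str:
        """Serialize the whole session, nested events included."""
        return json.dumps(asdict(self), indent=2)


def relative_time(start: float, now: float) -> float:
    """Convert an absolute clock reading into a session-relative timestamp."""
    return round(now - start, 3)


session = RawSession(session_id="demo")
session.events.append(
    Event(t=relative_time(100.0, 101.25), type="click", screenshot_id="shot_0001")
)
```

Keeping timestamps relative to the session start means the server does not depend on the trainer workstation's wall clock.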
+
+### 2. **Clean, Modular Architecture**
+```
+agent_v0/
+├── main.py              # Entry point
+├── tray_ui.py           # User interface (tray icon)
+├── raw_session.py       # Data model
+├── event_captor.py      # Mouse capture (clicks, scroll, hover)
+├── key_captor.py        # Keyboard capture (combos)
+├── screen_capturer.py   # Screenshots (mss)
+├── window_info.py       # Active window info (xdotool)
+├── storage.py           # ZIP creation
+├── uploader.py          # Server upload
+├── user_config.py       # JSON configuration
+├── config.py            # Constants
+└── logger_conf.py       # Rotating logging
+```
+
+### 3. **Advanced Features**
+- ✅ **Hover detection** - Captures mouse idle periods (tooltips)
+- ✅ **Screenshot modes** - Full screen or crop around the cursor
+- ✅ **Key combos** - Detects CTRL+C, ALT+F4, etc.
+- ✅ **Scroll tracking** - Mouse wheel with delta
+- ✅ **Network save** - Automatic copy to a network path
+- ✅ **ZIP packaging** - Whole session in a single file
+- ✅ **Rotating logs** - Log files rotated at 5 MB
+
+### 4. **Excellent UX**
+- ✅ Tray icon (notification area)
+- ✅ Simple menu: Start/Stop/Open/Quit
+- ✅ Visual indicator (green = active, gray = inactive)
+- ✅ Editable JSON configuration
+- ✅ Open the sessions/logs folders from the menu
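Hover detection, as listed above, amounts to spotting periods where the pointer stays within a small radius for long enough. A minimal sketch, assuming positions arrive as `(t, x, y)` samples; the function name and both thresholds are illustrative and not taken from `event_captor.py`.

```python
import math
from typing import Iterable, List, Tuple

Sample = Tuple[float, int, int]  # (t in seconds, x, y)


def detect_hovers(samples: Iterable[Sample],
                  min_duration: float = 1.0,
                  max_radius: float = 5.0) -> List[Sample]:
    """Return the first sample of each period where the pointer stays within
    max_radius pixels of that sample for at least min_duration seconds."""
    hovers: List[Sample] = []
    anchor = None        # sample where the current still period began
    reported = False     # True once the current still period was emitted
    for t, x, y in samples:
        moved = anchor is None or math.hypot(x - anchor[1], y - anchor[2]) > max_radius
        if moved:
            anchor, reported = (t, x, y), False   # pointer moved: restart the period
        elif not reported and t - anchor[0] >= min_duration:
            hovers.append(anchor)
            reported = True
    return hovers


trace = [(0.0, 10, 10), (0.5, 11, 10), (1.2, 10, 11), (1.5, 200, 200), (1.8, 201, 200)]
hovers = detect_hovers(trace)
```

Each still period is reported once, so a pointer resting on a tooltip for several seconds yields a single hover event.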
+
+## ⚠️ Points of Attention
+
+### 1. **Limited Cross-Platform Compatibility**
+
+**Problem:** `window_info.py` relies on `xdotool` (Linux only)
+
+```python
+# window_info.py - LINUX ONLY
+def get_active_window_info():
+    title = _run_cmd(["xdotool", "getactivewindow", "getwindowname"])
+    pid_str = _run_cmd(["xdotool", "getactivewindow", "getwindowpid"])
+```
+
+**Impact:** Will not work on Windows/macOS
+
+**Solution:** Implement OS-specific backends:
+
+```python
+# window_info.py - CROSS-PLATFORM
+import platform
+from typing import Dict
+
+def get_active_window_info() -> Dict[str, str]:
+    system = platform.system()
+    
+    if system == "Linux":
+        return _get_window_info_linux()
+    elif system == "Windows":
+        return _get_window_info_windows()
+    elif system == "Darwin":  # macOS
+        return _get_window_info_macos()
+    else:
+        return {"title": "unknown_window", "app_name": "unknown_app"}
+
+def _get_window_info_linux():
+    # Current xdotool-based code
+    ...
+
+def _get_window_info_windows():
+    # Windows: pywin32 (ctypes is an alternative)
+    import win32gui
+    import win32process
+    import psutil
+    
+    hwnd = win32gui.GetForegroundWindow()
+    title = win32gui.GetWindowText(hwnd)
+    _, pid = win32process.GetWindowThreadProcessId(hwnd)
+    app_name = psutil.Process(pid).name()
+    
+    return {"title": title, "app_name": app_name}
+
+def _get_window_info_macos():
+    # macOS: pyobjc (AppleScript is an alternative)
+    from AppKit import NSWorkspace
+    
+    active_app = NSWorkspace.sharedWorkspace().activeApplication()
+    app_name = active_app['NSApplicationName']
+    # For the window title, use AppleScript or the Accessibility API
+    
+    return {"title": "...", "app_name": app_name}
+```
+
+**Dependencies to add:**
+```txt
+# requirements.txt
+pywin32>=306 ; sys_platform == 'win32'
+pyobjc-framework-Cocoa>=10.0 ; sys_platform == 'darwin'
+psutil>=5.9.0  # Process info on Windows
+```
+
+### 2. **Security and Permissions**
+
+**Problem:** Screen capture and keyboard monitoring require permissions
+
+**Windows:**
+- No special permissions required
+- Antivirus software may block the agent (add an exception)
+
+**macOS:**
+- ⚠️ Requires the "Accessibility" permission
+- ⚠️ Requires the "Screen Recording" permission
+- Ask the user to enable them in System Preferences (System Settings on macOS 13+)
+
+**Linux:**
+- Requires X11 (does not work under Wayland by default)
+- `xdotool` must be installed
+
+**Recommendation:** Add a check at startup:
+
+```python
+# permissions_check.py
+import platform
+import shutil
+
+def check_permissions():
+    system = platform.system()
+    
+    if system == "Darwin":
+        # Check Accessibility
+        from AppKit import NSWorkspace
+        # Probe whether the active application can be read
+        try:
+            NSWorkspace.sharedWorkspace().activeApplication()
+        except Exception:
+            show_macos_permissions_dialog()  # UI helper, to be implemented
+            return False
+    
+    elif system == "Linux":
+        # Check that xdotool is installed
+        if not shutil.which("xdotool"):
+            show_linux_install_dialog()  # UI helper, to be implemented
+            return False
+    
+    return True
+```
+
+### 3. **Handling Sensitive Data**
+
+**Problem:** Screenshots may contain sensitive data
+
+**Recommendations:**
+1. **Encrypt the ZIP** before upload
+2. **Optional anonymization** of screenshots
+3. **Clear retention policy** (delete after X days)
+4. **GDPR compliance** - User consent
+
+```python
+# storage.py - Adding encryption
+import os
+import pyminizip  # or cryptography
+
+def create_session_zip_encrypted(session, password):
+    zip_path = create_session_zip(session)
+    encrypted_path = zip_path.replace('.zip', '_encrypted.zip')
+    
+    # Password-protect the archive. Note: pyminizip wraps zlib's minizip, which
+    # applies legacy ZIP encryption rather than AES-256; use an AES-capable
+    # library (e.g. cryptography) if AES-256 is required.
+    pyminizip.compress(
+        zip_path,
+        None,
+        encrypted_path,
+        password,
+        5  # compression level
+    )
+    
+    os.remove(zip_path)  # Delete the unencrypted copy
+    return encrypted_path
+```
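The retention policy in recommendation 3 could be enforced with a periodic cleanup along these lines. This is a sketch under assumptions: the function name and the `*.zip` pattern are hypothetical, and the retention period stays a required parameter because the document leaves it open ("X days").

```python
import time
from pathlib import Path
from typing import List, Optional


def purge_old_sessions(sessions_dir: str, max_age_days: float,
                       now: Optional[float] = None) -> List[Path]:
    """Delete session ZIPs whose modification time is older than max_age_days.

    Returns the list of deleted paths so the caller can log them.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds per day
    removed = []
    for zip_path in sorted(Path(sessions_dir).glob("*.zip")):
        if zip_path.stat().st_mtime < cutoff:
            zip_path.unlink()
            removed.append(zip_path)
    return removed
```

Returning the deleted paths makes it easy to log each purge, which helps when documenting retention for compliance purposes.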
+
+### 4. **Performance and Optimization**
+
+**Problem:** PNG screenshots can be large
+
+**Recommendations:**
+1. **JPEG compression** for screenshots (quality 85)
+2. **Automatic resize** when larger than 1920x1080
+3. **Throttling** - At most 1 screenshot per second
+
+```python
+# screen_capturer.py - Optimizations
+import os
+from PIL import Image
+
+def capture_optimized(self, focus_pos):
+    # Existing capture
+    screenshot_id, relative_path = self.capture(focus_pos)
+    
+    # Optimize the image
+    img_path = os.path.join(self._get_session_shots_dir(), f"{screenshot_id}.png")
+    jpg_path = img_path.replace('.png', '.jpg')
+    
+    with Image.open(img_path) as img:
+        # Downscale if too large (thumbnail keeps the aspect ratio)
+        if img.width > 1920 or img.height > 1080:
+            img.thumbnail((1920, 1080), Image.Resampling.LANCZOS)
+        
+        # Convert to JPEG (smaller)
+        img.convert('RGB').save(jpg_path, 'JPEG', quality=85, optimize=True)
+    
+    # Delete the PNG once the file is closed (Windows cannot remove an open file)
+    os.remove(img_path)
+    
+    # Update relative_path
+    relative_path = relative_path.replace('.png', '.jpg')
+    
+    return screenshot_id, relative_path
+```
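The throttling recommendation (at most one screenshot per second) is not covered by the block above. One way to sketch it, assuming a small helper that the capturer consults before each capture; `ScreenshotThrottle` and its `clock` parameter are hypothetical names, not part of `screen_capturer.py`.

```python
import time
from typing import Callable, Optional


class ScreenshotThrottle:
    """Allow at most one capture per min_interval seconds."""

    def __init__(self, min_interval: float = 1.0,
                 clock: Callable[[], float] = time.monotonic):
        self._min_interval = min_interval
        self._clock = clock                   # injectable so it can be tested
        self._last: Optional[float] = None    # time of the last allowed capture

    def allow(self) -> bool:
        now = self._clock()
        if self._last is not None and now - self._last < self._min_interval:
            return False                      # too soon: skip this capture
        self._last = now
        return True
```

A capturer would call `allow()` before taking a screenshot and skip the capture when it returns `False`. Using a monotonic clock avoids misbehavior when the system time is adjusted during a session.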
+
+### 5. **Packaging and Distribution**
+
+**Problem:** Distributing the agent to trainers
+
+**Solution:** Use PyInstaller (already configured with a `.spec` file)
+
+```bash
+#!/bin/bash
+# build.sh
+# PyInstaller does not cross-compile: run this on each target OS.
+pyinstaller agent_v0_tray.spec
+
+# Output:
+# dist/agent_v0_tray  (Linux)
+# dist/agent_v0_tray.exe  (Windows)
+# dist/agent_v0_tray.app  (macOS)
+```
+
+**Recommendations:**
+1. **Build installers** - NSIS (Windows), DMG (macOS), DEB/RPM (Linux)
+2. **Auto-update** - Check the version at startup
+3. **Code signing** - Avoid antivirus warnings
+
+## 🔧 Integration with RPA Vision V3
+
+### Server Side (To Be Implemented)
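The startup version check behind the auto-update recommendation could compare the local version with the latest published one. A sketch assuming plain dotted numeric versions such as `1.2.0`; how the latest version is obtained (endpoint, file) is not specified in this document and is deliberately left out, and both function names are hypothetical.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '1.2.0' into (1, 2) so that versions compare numerically.

    Trailing zeros are dropped so that '1.0' and '1.0.0' compare equal.
    """
    parts = [int(part) for part in version.strip().lstrip("v").split(".")]
    while len(parts) > 1 and parts[-1] == 0:
        parts.pop()
    return tuple(parts)


def update_available(current: str, latest: str) -> bool:
    """True when the published version is strictly newer than the local one."""
    return parse_version(latest) > parse_version(current)
```

Comparing integer tuples rather than raw strings matters here: as strings, `"0.10.0"` would sort before `"0.9.0"`.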
+
+```python
+# server/api/upload_handler.py
+import zipfile
+from pathlib import Path
+
+from fastapi import FastAPI, File, Form, HTTPException, UploadFile
+from core.persistence import StorageManager
+from core.models import RawSession
+
+app = FastAPI()
+storage = StorageManager(base_path="data/training")
+
+@app.post("/api/traces/upload")
+async def upload_session(
+    file: UploadFile = File(...),
+    session_id: str = Form(...)
+):
+    # NOTE: session_id comes from the client; validate it before using it in a path
+    # Save the ZIP
+    zip_path = Path(f"data/training/uploads/{session_id}.zip")
+    zip_path.parent.mkdir(parents=True, exist_ok=True)
+    zip_path.write_bytes(await file.read())
+    
+    # Extract and validate
+    extract_dir = Path(f"data/training/sessions/{session_id}")
+    with zipfile.ZipFile(zip_path, 'r') as zf:
+        zf.extractall(extract_dir)
+    
+    # Load the RawSession
+    json_path = extract_dir / session_id / f"{session_id}.json"
+    session = RawSession.load_from_file(json_path)
+    
+    # Validate the format (an assert would be stripped under python -O)
+    if session.schema_version != "rawsession_v1":
+        raise HTTPException(status_code=422, detail="Unsupported schema version")
+    
+    # Store with StorageManager
+    storage.save_raw_session(session)
+    
+    return {"status": "success", "session_id": session_id}
+```
+
+### Training Pipeline
+
+```python
+# training/process_agent_sessions.py
+from core.models import RawSession, ScreenState
+from core.embedding import StateEmbeddingBuilder
+from core.graph import GraphBuilder
+from core.persistence import StorageManager
+
+storage = StorageManager(base_path="data/training")
+
+def process_agent_session(session_id: str):
+    # 1. Load the RawSession
+    session = storage.load_raw_session(session_id)
+    
+    # 2. Build the ScreenStates
+    screen_states = []
+    for event in session.events:
+        if event.screenshot_id:
+            # Create a ScreenState from the screenshot
+            # (build_screen_state_from_event is still to be implemented)
+            state = build_screen_state_from_event(event, session)
+            screen_states.append(state)
+    
+    # 3. Generate embeddings
+    builder = StateEmbeddingBuilder()
+    for state in screen_states:
+        embedding = builder.build_embedding(state)
+        storage.save_embedding(embedding.vector, state.screen_state_id)
+    
+    # 4. Build the workflow
+    graph_builder = GraphBuilder()
+    workflow = graph_builder.build_from_session(session)
+    storage.save_workflow(workflow)
+    
+    return workflow
+```
+
+## 📊 Comparison with RPA Vision V3
+
+| Aspect | Agent V0 | RPA Vision V3 |
+|--------|----------|---------------|
+| **Platform** | Windows/macOS/Linux | Linux (server) |
+| **Role** | Data capture | Analysis + Execution |
+| **Dependencies** | Light (mss, pynput) | Heavy (CLIP, FAISS, Ollama) |
+| **UI** | Tray icon | Full GUI |
+| **Storage** | Local + Upload | Database |
+| **Processing** | None | Embeddings + Matching |
+| **Size** | ~50 MB (packaged) | ~2 GB (with models) |
+
+## 🎯 Priority Recommendations
+
+### Priority 1 - Critical
+1. ✅ **Implement cross-platform window_info** (Windows/macOS)
+2. ✅ **Add a permissions check** at startup
+3. ✅ **Encrypt ZIPs** before upload
+
+### Priority 2 - Important
+4. ✅ **Optimize screenshots** (JPEG, resize)
+5. ✅ **Build the server API** to receive uploads
+6. ✅ **Test on real Windows/macOS machines**
+
+### Priority 3 - Nice to Have
+7. ✅ **Auto-update** mechanism
+8. ✅ **Optional anonymization** of screenshots
+9. ✅ **Session statistics** (duration, number of events)
+10. ✅ **Session preview** before upload
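Item 9 (session statistics) can be derived directly from the relative timestamps. A sketch assuming each event is a dict with a `t` field (seconds since session start, as in `rawsession_v1`) and a `type` field; the `type` key, its values, and the function name are assumptions.

```python
from collections import Counter
from typing import Any, Dict, List


def session_stats(events: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Summarize a session: duration, number of events, and a count per event type."""
    if not events:
        return {"duration_s": 0.0, "event_count": 0, "by_type": {}}
    times = [event["t"] for event in events]
    return {
        "duration_s": max(times) - min(times),
        "event_count": len(events),
        "by_type": dict(Counter(event.get("type", "unknown") for event in events)),
    }


stats = session_stats([
    {"t": 0.5, "type": "click"},
    {"t": 2.0, "type": "scroll"},
    {"t": 4.5, "type": "click"},
])
```

The same summary could feed the session preview in item 10, letting the trainer sanity-check a recording before uploading it.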
+
+## 📝 Deployment Checklist
+
+### Before Distribution
+- [ ] Test on Windows 10/11
+- [ ] Test on macOS 12+
+- [ ] Test on Ubuntu 22.04+
+- [ ] Verify permissions (Accessibility, Screen Recording)
+- [ ] Build installers (NSIS, DMG, DEB)
+- [ ] Sign the code (avoid antivirus warnings)
+- [ ] Document installation (README_AGENT.md)
+- [ ] Write a user guide (PDF)
+
+### Server Side
+- [ ] Implement the `/api/traces/upload` API
+- [ ] Configure storage (data/training/)
+- [ ] Implement the processing pipeline
+- [ ] Add monitoring (Prometheus/Grafana)
+- [ ] Configure automatic backups
+- [ ] Load test (100+ concurrent trainers)
+
+## 🎉 Conclusion
+
+**Agent V0 is an excellent foundation!** The architecture is clean, modular, and well thought out. The main points to address are:
+
+1. **Cross-platform compatibility** (window_info)
+2. **Security** (encryption, permissions)
+3. **Optimization** (screenshot compression)
+
+With these improvements, the agent will be ready for production deployment with trainers.
+
+---
+
+**Suggested next steps:**
+1. Implement `window_info` for Windows/macOS
+2. Build the server upload API
+3. Test on all 3 operating systems
+4. Build the installers
+5. Deploy a beta with 2-3 pilot trainers
+
+**Need help implementing these improvements?** I can help you code the missing parts! 🚀