rpa_vision_v3/.kiro/steering/product.md
Dom a7de6a488b feat: working E2E replay, 25/25 actions, 0 retries, SomEngine via server
Validated on a Windows PC (DESKTOP-58D5CAC, 2560x1600):
- 8 clicks resolved visually (1 anchor_template, 1 som_text_match, 6 som_vlm)
- Average score 0.75, average time 1.6s
- Text typed correctly (bonjour, test word, date, email)
- 0 retries, 2 actions not verified (OK)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 14:04:41 +02:00


# Product Overview
RPA Vision V3 is a 100% vision-based workflow automation system that learns from user interactions and automates repetitive tasks through semantic understanding of user interfaces.
## Core Concept
Unlike traditional RPA systems that rely on fixed coordinates, RPA Vision V3 uses:
- **Semantic UI understanding** through computer vision and vision-language models (VLMs)
- **Multi-modal embeddings** combining screenshots, text, and UI elements
- **Progressive learning** from observation to autonomous execution
- **Robust matching** that adapts to UI changes
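
To illustrate how matching can differ from coordinate-based replay, the sketch below selects a target element by embedding similarity rather than by screen position. It is an illustration only: the function names, the toy 3-dimensional vectors, and the 0.8 threshold are assumptions, not values taken from the project.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def best_match(target, candidates, threshold=0.8):
    """Return the name of the most similar candidate, or None if none reaches threshold.

    `candidates` maps a hypothetical element name to its embedding vector.
    """
    best_name, best_score = None, threshold
    for name, embedding in candidates.items():
        score = cosine_similarity(target, embedding)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


# Toy data: the recorded element has moved on screen, but its embedding barely changed.
recorded = [0.9, 0.1, 0.0]
on_screen = {
    "save_button": [0.88, 0.12, 0.02],
    "cancel_button": [0.1, 0.9, 0.0],
}
match = best_match(recorded, on_screen)
```

Because the lookup never consults pixel coordinates, a layout change that moves the element does not by itself break the match. A real system would use learned embeddings with far more dimensions.
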
## Key Features
- **Agent V0**: Cross-platform capture tool for recording user sessions
- **Hybrid Detection**: Combines OpenCV, CLIP embeddings, and VLMs
- **Visual Workflow Builder**: Web-based interface for creating and editing workflows
- **Self-Healing**: Automatic adaptation when UI elements change
- **Analytics System**: Performance monitoring and insights
- **Multi-modal Fusion**: Combines visual, textual, and spatial information
## Architecture Layers
1. **RawSession (Layer 0)**: Raw event capture (clicks, keystrokes, screenshots)
2. **ScreenState (Layer 1)**: Multi-modal analysis of screen content
3. **UIElement Detection (Layer 2)**: Semantic detection of interface elements
4. **State Embedding (Layer 3)**: Vector representation for similarity matching
5. **Workflow Graph (Layer 4)**: Executable workflow representation
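
The five layers can be read as a pipeline in which each layer consumes the output of the one before it. The sketch below shows one possible set of data shapes for them. Every class field and method name here is hypothetical and chosen for illustration; only the layer names come from the list above.

```python
from dataclasses import dataclass, field


@dataclass
class RawSession:
    """Layer 0: raw capture (hypothetical fields)."""
    events: list = field(default_factory=list)        # clicks, keystrokes
    screenshots: list = field(default_factory=list)   # image file paths


@dataclass
class UIElement:
    """Layer 2: one semantically detected element (hypothetical fields)."""
    role: str      # e.g. "button"
    text: str      # visible label, if any
    bbox: tuple    # (x, y, width, height)


@dataclass
class ScreenState:
    """Layer 1: analysed screen, carrying Layer 2 elements and a Layer 3 embedding."""
    screenshot: str
    elements: list = field(default_factory=list)
    embedding: list = field(default_factory=list)


@dataclass
class WorkflowGraph:
    """Layer 4: states as nodes, actions as directed edges."""
    nodes: dict = field(default_factory=dict)   # state id -> ScreenState
    edges: list = field(default_factory=list)   # (source id, action, target id)

    def add_transition(self, src, action, dst):
        self.edges.append((src, action, dst))

    def actions_from(self, src):
        """List the actions recorded as leaving state `src`."""
        return [action for s, action, _ in self.edges if s == src]


graph = WorkflowGraph()
graph.nodes["login"] = ScreenState(
    screenshot="login.png",
    elements=[UIElement(role="button", text="Submit", bbox=(100, 200, 80, 30))],
)
graph.add_transition("login", "click:Submit", "home")
```
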
## Learning Progression
- **OBSERVATION**: 5+ executions to learn patterns
- **COACHING**: 10+ assisted executions with >90% success rate
- **AUTO_CANDIDATE**: 20+ executions with >95% success rate
- **AUTO_CONFIRMED**: User-validated autonomous execution
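
The progression above can be sketched as a small state machine. The list does not say whether each threshold is an entry or an exit condition, so this sketch assumes the latter: each threshold must be met within a stage before moving to the next, and `executions` counts runs in the current stage. The function name and parameters are hypothetical.

```python
from enum import Enum


class LearningState(Enum):
    OBSERVATION = 1
    COACHING = 2
    AUTO_CANDIDATE = 3
    AUTO_CONFIRMED = 4


def next_state(state, executions, success_rate=0.0, user_validated=False):
    """Return the state reached after applying the thresholds (assumed exit criteria)."""
    if state is LearningState.OBSERVATION and executions >= 5:
        return LearningState.COACHING
    if (state is LearningState.COACHING
            and executions >= 10 and success_rate > 0.90):
        return LearningState.AUTO_CANDIDATE
    if (state is LearningState.AUTO_CANDIDATE
            and executions >= 20 and success_rate > 0.95 and user_validated):
        return LearningState.AUTO_CONFIRMED
    return state  # thresholds not met: stay in the current state
```

Under this reading, a workflow with 25 runs at a 97% success rate stays in `AUTO_CANDIDATE` until the user validates it, which keeps the final promotion to autonomous execution an explicit human decision.
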