Validé sur PC Windows (DESKTOP-58D5CAC, 2560x1600) : - 8 clics résolus visuellement (1 anchor_template, 1 som_text_match, 6 som_vlm) - Score moyen 0.75, temps moyen 1.6s - Texte tapé correctement (bonjour, test word, date, email) - 0 retries, 2 actions non vérifiées (OK) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
35 lines
1.6 KiB
Markdown
35 lines
1.6 KiB
Markdown
# Product Overview
|
|
|
|
RPA Vision V3 is a 100% vision-based workflow automation system that learns from user interactions and automates repetitive tasks through semantic understanding of user interfaces.
|
|
|
|
## Core Concept
|
|
|
|
Unlike traditional RPA systems that rely on fixed coordinates, RPA Vision V3 uses:
|
|
- **Semantic UI understanding** through computer vision and VLM models
|
|
- **Multi-modal embeddings** combining screenshots, text, and UI elements
|
|
- **Progressive learning** from observation to autonomous execution
|
|
- **Robust matching** that adapts to UI changes
|
|
|
|
## Key Features
|
|
|
|
- **Agent V0**: Cross-platform capture tool for recording user sessions
|
|
- **Hybrid Detection**: Combines OpenCV, CLIP embeddings, and VLM models
|
|
- **Visual Workflow Builder**: Web-based interface for creating and editing workflows
|
|
- **Self-Healing**: Automatic adaptation when UI elements change
|
|
- **Analytics System**: Performance monitoring and insights
|
|
- **Multi-modal Fusion**: Combines visual, textual, and spatial information
|
|
|
|
## Architecture Layers
|
|
|
|
1. **RawSession (Layer 0)**: Raw event capture (clicks, keystrokes, screenshots)
|
|
2. **ScreenState (Layer 1)**: Multi-modal analysis of screen content
|
|
3. **UIElement Detection (Layer 2)**: Semantic detection of interface elements
|
|
4. **State Embedding (Layer 3)**: Vector representation for similarity matching
|
|
5. **Workflow Graph (Layer 4)**: Executable workflow representation
|
|
|
|
## Learning Progression
|
|
|
|
- **OBSERVATION**: 5+ executions to learn patterns
|
|
- **COACHING**: 10+ assisted executions with >90% success
|
|
- **AUTO_CANDIDATE**: 20+ executions with >95% success rate
|
|
- **AUTO_CONFIRMED**: User-validated autonomous execution |