Validated on a Windows PC (DESKTOP-58D5CAC, 2560x1600):
- 8 clicks resolved visually (1 anchor_template, 1 som_text_match, 6 som_vlm)
- Mean score 0.75, mean time 1.6 s
- Text typed correctly (bonjour, test word, date, email)
- 0 retries, 2 unverified actions (OK)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
# Requirements Document

## Introduction

This document defines the requirements for turning RPA Vision V3 into a production-grade, 100% vision-based RPA system. The goal is to improve training reliability, matching robustness, and the ability to adapt continuously to user interface changes.

The current system has critical gaps: no validation of training quality, overly simplistic matching based solely on global similarity, no continuous learning, and insufficient handling of screen variants.

## Glossary

- **Training_Quality_Validator**: Component that evaluates the quality of workflows generated from training sessions
- **Hierarchical_Matcher**: Multi-level matching system (window → region → element)
- **Continuous_Learner**: Continuous learning module that adapts workflows to changes
- **Variant_Manager**: Manager for the legitimate variants of a single screen state
- **Drift_Detector**: Detector of significant changes in the user interface
- **Cluster_Quality_Score**: Quality metric for a DBSCAN cluster (silhouette, cohesion, separation)
- **Embedding_Prototype**: Representative vector of a screen state (normalized mean of the cluster's embeddings)
- **Temporal_Context**: Sequence of previous states that influences the current match
- **UI_State**: State of a UI element (enabled, disabled, checked, unchecked, loading, error)
- **Spatial_Relation**: Spatial relation between elements (above, below, left_of, right_of, inside)

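Non-normative illustration of the Embedding_Prototype definition: the sketch below computes a normalized mean of a cluster's embeddings. The function name `embedding_prototype`, the use of plain Python lists, and the choice of L2 normalization are assumptions made for the example; the glossary only states "normalized mean".

```python
import math


def embedding_prototype(embeddings):
    """Return the L2-normalized mean of equal-length embedding vectors.

    Hypothetical helper: L2 normalization is an assumption, the glossary
    does not say which norm is used.
    """
    if not embeddings:
        raise ValueError("at least one embedding is required")
    dim = len(embeddings[0])
    # Component-wise mean over all embeddings in the cluster.
    mean = [sum(vec[i] for vec in embeddings) / len(embeddings) for i in range(dim)]
    norm = math.sqrt(sum(x * x for x in mean))
    if norm == 0.0:
        raise ValueError("mean embedding has zero norm")
    return [x / norm for x in mean]


prototype = embedding_prototype([[1.0, 0.0], [0.0, 1.0]])
```

Normalizing the prototype keeps cosine-style comparisons between prototypes and new observations on the same scale.
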
## Requirements

### Requirement 1: Training Quality Validation

**User Story:** As an RPA developer, I want to validate the quality of trained workflows, so that I can ensure reliable replay execution.

#### Acceptance Criteria

1. WHEN a workflow is built from a session THEN the Training_Quality_Validator SHALL compute cluster quality metrics, including silhouette score, cohesion, and separation, for each detected pattern
2. WHEN a cluster's quality score falls below 0.7 THEN the Training_Quality_Validator SHALL flag the cluster as low-confidence and require additional training samples
3. WHEN computing embedding prototypes THEN the Training_Quality_Validator SHALL detect and exclude outlier embeddings using the IQR method with a multiplier of 1.5
4. WHEN a workflow contains fewer than 3 observations per node THEN the Training_Quality_Validator SHALL mark the workflow as insufficient-data and prevent the AUTO_CANDIDATE transition
5. WHEN validating a workflow THEN the Training_Quality_Validator SHALL perform cross-validation by holding out 20% of observations and measuring match accuracy on the held-out observations

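Non-normative illustration of criteria 2, 3, and 4 of Requirement 1: a minimal sketch of IQR-based outlier filtering and of the low-confidence and insufficient-data flags. It assumes the IQR rule is applied to a scalar per embedding (here, its distance to the cluster centroid), which the criteria do not specify. The names `iqr_inlier_mask` and `node_flags` are hypothetical.

```python
import statistics

MIN_CLUSTER_QUALITY = 0.7  # criterion 2
IQR_MULTIPLIER = 1.5       # criterion 3
MIN_OBSERVATIONS = 3       # criterion 4


def iqr_inlier_mask(distances, k=IQR_MULTIPLIER):
    """Mark values inside [Q1 - k*IQR, Q3 + k*IQR] as inliers.

    `distances` is assumed to hold one distance-to-centroid per embedding.
    """
    q1, _, q3 = statistics.quantiles(distances, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [low <= d <= high for d in distances]


def node_flags(quality_score, observation_count):
    """Return the flags that criteria 2 and 4 would raise for one node."""
    flags = []
    if quality_score < MIN_CLUSTER_QUALITY:
        flags.append("low-confidence")
    if observation_count < MIN_OBSERVATIONS:
        flags.append("insufficient-data")
    return flags


mask = iqr_inlier_mask([0.10, 0.12, 0.11, 0.13, 0.95])
flags = node_flags(quality_score=0.65, observation_count=2)
```

In this example the last distance lies far outside the interquartile fence, so only that embedding would be excluded before the prototype is computed.
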
### Requirement 2: Hierarchical Matching System

**User Story:** As an RPA system, I want to match screens using multiple levels of granularity, so that I can achieve more robust and accurate state recognition.

#### Acceptance Criteria

1. WHEN matching a screenshot THEN the Hierarchical_Matcher SHALL first match at window level using title pattern and process name with confidence weight 0.2
2. WHEN window-level match succeeds THEN the Hierarchical_Matcher SHALL match at region level by comparing detected UI regions with stored region templates
3. WHEN region-level match succeeds THEN the Hierarchical_Matcher SHALL match at element level by comparing individual UI elements within matched regions
4. WHEN computing final match confidence THEN the Hierarchical_Matcher SHALL combine window, region, and element confidences using the weighted formula `0.2*window + 0.3*region + 0.5*element`
5. WHEN temporal context is available THEN the Hierarchical_Matcher SHALL boost confidence by 0.1 for nodes that are valid successors of the previously matched node

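Non-normative illustration of criteria 4 and 5 of Requirement 2: the weighted combination and the successor boost can be sketched as follows. The function name `combined_confidence` is hypothetical, and clamping the boosted score to 1.0 is an assumption; the criteria do not say how scores above 1.0 are handled.

```python
WEIGHTS = {"window": 0.2, "region": 0.3, "element": 0.5}  # criterion 4
SUCCESSOR_BOOST = 0.1                                      # criterion 5


def combined_confidence(window, region, element, is_valid_successor=False):
    """Combine per-level confidences (each assumed to be in [0, 1])."""
    score = (
        WEIGHTS["window"] * window
        + WEIGHTS["region"] * region
        + WEIGHTS["element"] * element
    )
    if is_valid_successor:
        score += SUCCESSOR_BOOST
    # Assumption: keep the result in [0, 1] after the boost.
    return min(score, 1.0)


plain = combined_confidence(1.0, 0.8, 0.9)
boosted = combined_confidence(1.0, 0.8, 0.9, is_valid_successor=True)
```

With these inputs the unboosted score is 0.2 + 0.24 + 0.45 = 0.89, and the boost raises it to 0.99.
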
### Requirement 3: Continuous Learning and Adaptation

**User Story:** As an RPA system, I want to continuously learn from new observations, so that I can adapt to UI changes without full retraining.

#### Acceptance Criteria

1. WHEN a successful execution occurs THEN the Continuous_Learner SHALL update the node embedding prototype using an exponential moving average with alpha 0.1
2. WHEN match confidence drops below 0.85 for 3 consecutive executions THEN the Drift_Detector SHALL flag potential UI drift and notify the user
3. WHEN UI drift is confirmed THEN the Continuous_Learner SHALL create a new variant for the affected node while preserving the original
4. WHEN a node has more than 5 variants THEN the Continuous_Learner SHALL consolidate variants by re-clustering with updated parameters
5. WHEN updating prototypes THEN the Continuous_Learner SHALL maintain a version history allowing rollback to previous prototype versions

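Non-normative illustration of criteria 1 and 2 of Requirement 3: a minimal sketch of the exponential moving average update and of the consecutive-low-confidence check. The names `ema_update` and `DriftDetector.observe` are hypothetical, and re-normalizing the prototype after the update is an assumption based on the glossary's "normalized mean".

```python
import math

EMA_ALPHA = 0.1         # criterion 1
DRIFT_THRESHOLD = 0.85  # criterion 2
DRIFT_STREAK = 3        # criterion 2


def ema_update(prototype, observation, alpha=EMA_ALPHA):
    """Blend a new observation into the prototype: (1 - alpha)*old + alpha*new."""
    blended = [(1 - alpha) * p + alpha * o for p, o in zip(prototype, observation)]
    # Assumption: prototypes stay L2-normalized after each update.
    norm = math.sqrt(sum(x * x for x in blended))
    return [x / norm for x in blended] if norm else blended


class DriftDetector:
    """Counts consecutive executions whose confidence is below the threshold."""

    def __init__(self, threshold=DRIFT_THRESHOLD, streak=DRIFT_STREAK):
        self.threshold = threshold
        self.streak = streak
        self._low_count = 0

    def observe(self, confidence):
        """Record one execution; return True when drift should be flagged."""
        if confidence < self.threshold:
            self._low_count += 1
        else:
            self._low_count = 0  # any good match resets the streak
        return self._low_count >= self.streak


detector = DriftDetector()
drift_signals = [detector.observe(c) for c in [0.80, 0.90, 0.80, 0.80, 0.80]]
updated = ema_update([1.0, 0.0], [0.0, 1.0])
```

The streak counter resets on the second execution, so drift is flagged only after the last three low-confidence matches in a row.
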
### Requirement 4: Variant and State Management

**User Story:** As an RPA developer, I want the system to handle screen variants and dynamic states, so that workflows work reliably across different UI conditions.

#### Acceptance Criteria

1. WHEN building a workflow THEN the Variant_Manager SHALL detect and group similar but distinct screen states as variants of the same logical node
2. WHEN a variant differs from the primary prototype by more than 0.3 in similarity THEN the Variant_Manager SHALL create a separate variant entry with its own embedding
3. WHEN matching against a node with variants THEN the Variant_Manager SHALL match against all variants and return the best match with its variant identifier
4. WHEN detecting UI element states THEN the Variant_Manager SHALL identify and store element states including enabled, disabled, checked, unchecked, loading, and error
5. WHEN an unexpected popup or modal appears THEN the Variant_Manager SHALL detect the overlay and either pause execution for a user decision or apply the configured handling rule

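Non-normative illustration of criteria 2 and 3 of Requirement 4: the sketch below matches a query embedding against every variant of a node and returns the best one with its identifier. It assumes cosine similarity as the similarity measure and reads "differs by more than 0.3" as `1 - similarity > 0.3`; both are interpretations, not statements of the criteria. The names `best_variant` and `needs_separate_variant` are hypothetical.

```python
import math

VARIANT_DIFFERENCE_THRESHOLD = 0.3  # criterion 2


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def needs_separate_variant(candidate, primary):
    """Assumed reading of criterion 2: difference = 1 - cosine similarity."""
    return (1.0 - cosine_similarity(candidate, primary)) > VARIANT_DIFFERENCE_THRESHOLD


def best_variant(query, variants):
    """Return (variant_id, similarity) for the closest variant embedding.

    `variants` maps a variant identifier to its embedding.
    """
    if not variants:
        raise ValueError("node has no variants")
    scored = ((vid, cosine_similarity(query, emb)) for vid, emb in variants.items())
    return max(scored, key=lambda pair: pair[1])


variants = {"primary": [1.0, 0.0], "dark_mode": [0.0, 1.0]}
variant_id, similarity = best_variant([0.1, 0.9], variants)
```

Returning the identifier alongside the score lets the caller record which variant was matched during execution.
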
### Requirement 5: Advanced UI Understanding

**User Story:** As an RPA system, I want to understand UI structure and relationships, so that I can locate elements more reliably even when positions change.

#### Acceptance Criteria

1. WHEN detecting UI elements THEN the UI_Analyzer SHALL compute spatial relations between elements including above, below, left_of, right_of, and inside
2. WHEN building a workflow THEN the UI_Analyzer SHALL group related elements into semantic containers such as forms, menus, toolbars, and dialogs
3. WHEN resolving a target element THEN the UI_Analyzer SHALL use spatial relations as a fallback when direct matching fails
4. WHEN an element cannot be found by the primary strategy THEN the UI_Analyzer SHALL search using anchor elements with known spatial relations
5. WHEN detecting element states THEN the UI_Analyzer SHALL use visual features including color, opacity, and border style to determine enabled, disabled, or loading states

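Non-normative illustration of criterion 1 of Requirement 5: spatial relations can be derived from bounding boxes, as in the sketch below. The `Box` type and the strict non-overlap rules are assumptions for the example; screen coordinates are assumed to grow rightward and downward.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned bounding box in screen pixels (hypothetical type)."""
    left: float
    top: float
    right: float
    bottom: float


def spatial_relations(a, b):
    """Return the set of relations that hold for box `a` relative to box `b`."""
    if a.left >= b.left and a.top >= b.top and a.right <= b.right and a.bottom <= b.bottom:
        return {"inside"}
    relations = set()
    if a.bottom <= b.top:
        relations.add("above")
    if a.top >= b.bottom:
        relations.add("below")
    if a.right <= b.left:
        relations.add("left_of")
    if a.left >= b.right:
        relations.add("right_of")
    return relations


label = Box(10, 10, 90, 30)
field = Box(100, 10, 300, 30)
button = Box(100, 50, 200, 80)
label_vs_field = spatial_relations(label, field)
button_vs_field = spatial_relations(button, field)
```

A relation such as "the text field is right of the label" can then serve as the anchor-based fallback described in criteria 3 and 4.
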
### Requirement 6: Training Session Quality

**User Story:** As an RPA developer, I want feedback on training session quality, so that I can improve my demonstrations for better workflow reliability.

#### Acceptance Criteria

1. WHEN a training session is uploaded THEN the Session_Analyzer SHALL compute a quality score based on screenshot clarity, action consistency, and timing patterns
2. WHEN screenshots have low contrast or blur THEN the Session_Analyzer SHALL flag affected frames and suggest re-recording
3. WHEN an action's timing deviates from the mean by more than 2 standard deviations THEN the Session_Analyzer SHALL identify the transition as potentially problematic
4. WHEN duplicate or near-duplicate screenshots exceed 30% of the session THEN the Session_Analyzer SHALL suggest optimizing capture frequency
5. WHEN generating a quality report THEN the Session_Analyzer SHALL provide actionable recommendations for improving training data

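Non-normative illustration of criteria 3 and 4 of Requirement 6: a minimal sketch of the timing check and the duplicate-ratio check. It assumes timing is measured as one duration per action and compared to the session mean using the population standard deviation; the names `timing_outliers` and `too_many_duplicates` are hypothetical.

```python
import statistics

TIMING_STD_FACTOR = 2.0     # criterion 3
MAX_DUPLICATE_RATIO = 0.30  # criterion 4


def timing_outliers(durations, k=TIMING_STD_FACTOR):
    """Return indices whose duration is more than k std devs from the mean."""
    if len(durations) < 2:
        return []
    mean = statistics.fmean(durations)
    std = statistics.pstdev(durations)
    if std == 0:
        return []
    return [i for i, d in enumerate(durations) if abs(d - mean) > k * std]


def too_many_duplicates(duplicate_count, total_frames, max_ratio=MAX_DUPLICATE_RATIO):
    """True when duplicates exceed the allowed share of the session."""
    return total_frames > 0 and duplicate_count / total_frames > max_ratio


durations = [1.0, 1.1, 0.9, 1.0, 1.2, 0.8, 1.0, 1.1, 0.9, 9.0]
slow_transitions = timing_outliers(durations)
```

Very short sessions cannot produce an outlier under this rule, since a single value among five or fewer samples never lies more than 2 population standard deviations from the mean.
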
### Requirement 7: Execution Robustness

**User Story:** As an RPA system, I want robust execution handling, so that workflows can recover from transient failures and unexpected conditions.

#### Acceptance Criteria

1. WHEN an action fails THEN the Execution_Engine SHALL retry with exponential backoff up to 3 times before marking the action as failed
2. WHEN a target element is not found THEN the Execution_Engine SHALL wait up to the configured timeout with periodic re-detection before failing
3. WHEN screen state does not match the expected post-condition THEN the Execution_Engine SHALL attempt recovery by re-matching the current state to the workflow graph
4. WHEN execution encounters an unknown screen THEN the Execution_Engine SHALL pause and request user guidance with a screenshot and context
5. WHEN recovering from failure THEN the Execution_Engine SHALL log detailed diagnostics including screenshots, match scores, and attempted strategies
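
Non-normative illustration of criterion 1 of Requirement 7: retry with exponential backoff can be sketched as follows. It assumes "up to 3 times" means 3 retries after the initial attempt, a base delay of 0.5 s that doubles on each retry, and failure signalled by an exception; none of these values come from the criteria. The name `run_with_retries` is hypothetical.

```python
import time


def run_with_retries(action, max_retries=3, base_delay=0.5, sleep=time.sleep):
    """Call `action()`; on exception, retry with doubling delays.

    Raises the last exception once `max_retries` retries are exhausted.
    `sleep` is injectable so the delays can be observed without waiting.
    """
    attempt = 0
    while True:
        try:
            return action()
        except Exception:
            if attempt >= max_retries:
                raise  # caller marks the action as failed
            sleep(base_delay * (2 ** attempt))  # 0.5 s, 1 s, 2 s
            attempt += 1


calls = []
delays = []


def flaky_click():
    """Fails twice, then succeeds (stand-in for a real UI action)."""
    calls.append(1)
    if len(calls) < 3:
        raise RuntimeError("element not ready")
    return "ok"


result = run_with_retries(flaky_click, sleep=delays.append)
```

Passing `sleep` as a parameter keeps the sketch testable: here the recorded delays show two backoff steps before the third attempt succeeds.
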