
Dom a7de6a488b feat: functional E2E replay, 25/25 actions, 0 retries, SomEngine via server

Validated on a Windows PC (DESKTOP-58D5CAC, 2560x1600):
- 8 clicks resolved visually (1 anchor_template, 1 som_text_match, 6 som_vlm)
- Average score 0.75, average time 1.6s
- Text typed correctly (bonjour, test word, date, email)
- 0 retries, 2 unverified actions (OK)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

2026-03-31 14:04:41 +02:00


Requirements Document

Introduction

This document defines the requirements for turning RPA Vision V3 into a production-grade, 100% vision-based RPA system. The goal is to improve training reliability, matching robustness, and the ability to adapt continuously to user interface changes.

The current system has critical gaps: no validation of training quality, overly simplistic matching based solely on global similarity, no continuous learning, and insufficient handling of screen variants.

Glossary

Training_Quality_Validator: Component that evaluates the quality of workflows generated from training sessions
Hierarchical_Matcher: Multi-level matching system (window → region → element)
Continuous_Learner: Continuous learning module that adapts workflows to changes
Variant_Manager: Manager for the legitimate variants of a single screen state
Drift_Detector: Detector of significant changes in the user interface
Cluster_Quality_Score: Quality metric for a DBSCAN cluster (silhouette, cohesion, separation)
Embedding_Prototype: Representative vector for a screen state (normalized mean of the cluster's embeddings)
Temporal_Context: Sequence of previous states that influences the current match
UI_State: State of a UI element (enabled, disabled, checked, loading, error)
Spatial_Relation: Spatial relation between elements (above, below, left_of, right_of, inside)

Requirements

Requirement 1: Training Quality Validation

User Story: As an RPA developer, I want to validate the quality of trained workflows, so that I can ensure reliable replay execution.

Acceptance Criteria

WHEN a workflow is built from a session THEN the Training_Quality_Validator SHALL compute cluster quality metrics including silhouette score, cohesion, and separation for each detected pattern
WHEN cluster quality score falls below 0.7 THEN the Training_Quality_Validator SHALL flag the cluster as low-confidence and require additional training samples
WHEN computing embedding prototypes THEN the Training_Quality_Validator SHALL detect and exclude outlier embeddings using IQR method with 1.5 threshold
WHEN a workflow contains fewer than 3 observations per node THEN the Training_Quality_Validator SHALL mark the workflow as insufficient-data and prevent AUTO_CANDIDATE transition
WHEN validating a workflow THEN the Training_Quality_Validator SHALL perform cross-validation by holding out 20% of observations and measuring match accuracy
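The outlier-exclusion and prototype criteria above can be sketched as follows. This is a minimal illustration, not the project's implementation: the requirements do not say which quantity the IQR rule is applied to, so the sketch assumes it is each embedding's Euclidean distance to the cluster centroid. The function names (filter_outliers_iqr, compute_prototype) and the choice to skip filtering for fewer than 4 samples are hypothetical.

```python
import math
import statistics


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def mean_vector(vectors):
    """Component-wise mean of equally sized vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def filter_outliers_iqr(embeddings, k=1.5):
    """Drop embeddings whose distance to the centroid exceeds Q3 + k * IQR.

    Assumption: the IQR rule is applied to centroid distances; the
    requirements only name "IQR method with 1.5 threshold".
    """
    if len(embeddings) < 4:  # too few points for meaningful quartiles
        return list(embeddings)
    centroid = mean_vector(embeddings)
    dists = [math.dist(e, centroid) for e in embeddings]
    q1, _, q3 = statistics.quantiles(dists, n=4)
    upper = q3 + k * (q3 - q1)
    return [e for e, d in zip(embeddings, dists) if d <= upper]


def compute_prototype(embeddings):
    """Normalized mean of the embeddings that survive outlier filtering."""
    return l2_normalize(mean_vector(filter_outliers_iqr(embeddings)))
```

Only the upper fence is checked because a distance that is unusually small is not an outlier in this sense. A caller would typically run compute_prototype once per cluster after the minimum-observation check has passed.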

Requirement 2: Hierarchical Matching System

User Story: As an RPA system, I want to match screens using multiple levels of granularity, so that I can achieve more robust and accurate state recognition.

Acceptance Criteria

WHEN matching a screenshot THEN the Hierarchical_Matcher SHALL first match at window level using title pattern and process name with confidence weight 0.2
WHEN window-level match succeeds THEN the Hierarchical_Matcher SHALL match at region level by comparing detected UI regions with stored region templates
WHEN region-level match succeeds THEN the Hierarchical_Matcher SHALL match at element level by comparing individual UI elements within matched regions
WHEN computing final match confidence THEN the Hierarchical_Matcher SHALL combine window, region, and element confidences using weighted formula: 0.2*window + 0.3*region + 0.5*element
WHEN temporal context is available THEN the Hierarchical_Matcher SHALL boost confidence for nodes that are valid successors of the previous matched node by 0.1
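As a worked illustration of the weighted formula and the temporal boost, here is a small sketch. It assumes the 0.1 boost is additive and that the result is clamped to 1.0; neither detail is stated in the criteria. The function name combine_confidence is hypothetical.

```python
WEIGHTS = {"window": 0.2, "region": 0.3, "element": 0.5}
SUCCESSOR_BOOST = 0.1


def combine_confidence(window, region, element, is_valid_successor=False):
    """Combine per-level confidences (each in [0, 1]) into one score."""
    score = (
        WEIGHTS["window"] * window
        + WEIGHTS["region"] * region
        + WEIGHTS["element"] * element
    )
    if is_valid_successor:
        # Assumption: the boost is added, not multiplied.
        score += SUCCESSOR_BOOST
    # Assumption: clamp so the boosted score stays a valid confidence.
    return min(score, 1.0)
```

For example, with window 1.0, region 0.8, and element 0.6, the combined score is 0.2 + 0.24 + 0.3 = 0.74, or 0.84 when the candidate node is a valid successor of the previously matched node.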

Requirement 3: Continuous Learning and Adaptation

User Story: As an RPA system, I want to continuously learn from new observations, so that I can adapt to UI changes without full retraining.

Acceptance Criteria

WHEN a successful execution occurs THEN the Continuous_Learner SHALL update the node embedding prototype using exponential moving average with alpha 0.1
WHEN match confidence drops below 0.85 for 3 consecutive executions THEN the Drift_Detector SHALL flag potential UI drift and notify the user
WHEN UI drift is confirmed THEN the Continuous_Learner SHALL create a new variant for the affected node while preserving the original
WHEN a node has more than 5 variants THEN the Continuous_Learner SHALL consolidate variants by re-clustering with updated parameters
WHEN updating prototypes THEN the Continuous_Learner SHALL maintain version history allowing rollback to previous prototype versions
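The prototype update and drift criteria can be sketched as below. This is an assumption-laden illustration: the class name DriftDetector mirrors the glossary term, but its interface (record returning a boolean) and the ema_update helper are hypothetical. Re-normalizing the prototype after the update and keeping version history are left out for brevity.

```python
from collections import deque


def ema_update(prototype, observation, alpha=0.1):
    """Exponential moving average: new = (1 - alpha) * old + alpha * obs."""
    return [(1 - alpha) * p + alpha * o for p, o in zip(prototype, observation)]


class DriftDetector:
    """Flags drift when the last `window` confidences are all below threshold."""

    def __init__(self, threshold=0.85, window=3):
        self.threshold = threshold
        self.recent = deque(maxlen=window)  # keeps only the latest scores

    def record(self, confidence):
        """Store one execution's match confidence; return True on drift."""
        self.recent.append(confidence)
        return len(self.recent) == self.recent.maxlen and all(
            c < self.threshold for c in self.recent
        )
```

With alpha 0.1 a single observation moves the prototype by only 10% of the difference, so one atypical screenshot has limited effect, while a sustained change accumulates over many executions.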

Requirement 4: Variant and State Management

User Story: As an RPA developer, I want the system to handle screen variants and dynamic states, so that workflows work reliably across different UI conditions.

Acceptance Criteria

WHEN building a workflow THEN the Variant_Manager SHALL detect and group similar but distinct screen states as variants of the same logical node
WHEN a variant differs by more than 0.3 similarity from the primary prototype THEN the Variant_Manager SHALL create a separate variant entry with its own embedding
WHEN matching against a node with variants THEN the Variant_Manager SHALL match against all variants and return the best match with variant identifier
WHEN detecting UI element states THEN the Variant_Manager SHALL identify and store element states including enabled, disabled, checked, unchecked, loading, and error
WHEN an unexpected popup or modal appears THEN the Variant_Manager SHALL detect the overlay and pause execution for user decision or apply configured handling rule
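Matching against all variants of a node might look like the sketch below. It assumes cosine similarity as the similarity measure and reads "differs by more than 0.3" as a cosine distance (1 minus similarity) above 0.3; both are interpretations, not confirmed by the requirements. The function names and the dictionary layout are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def best_variant(query, variants):
    """Return (variant_id, similarity) of the closest variant.

    `variants` maps a variant identifier to its embedding.
    """
    scored = ((vid, cosine_similarity(query, emb)) for vid, emb in variants.items())
    return max(scored, key=lambda pair: pair[1])


def needs_new_variant(candidate, primary, max_gap=0.3):
    """Assumption: a new variant entry is created when 1 - similarity > max_gap."""
    return 1.0 - cosine_similarity(candidate, primary) > max_gap
```

Returning the variant identifier alongside the score lets the caller log which variant matched, which is useful when variants are later consolidated.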

Requirement 5: Advanced UI Understanding

User Story: As an RPA system, I want to understand UI structure and relationships, so that I can locate elements more reliably even when positions change.

Acceptance Criteria

WHEN detecting UI elements THEN the UI_Analyzer SHALL compute spatial relations between elements including above, below, left_of, right_of, and inside
WHEN building a workflow THEN the UI_Analyzer SHALL group related elements into semantic containers such as forms, menus, toolbars, and dialogs
WHEN resolving a target element THEN the UI_Analyzer SHALL use spatial relations as fallback when direct matching fails
WHEN an element cannot be found by primary strategy THEN the UI_Analyzer SHALL search using anchor elements with known spatial relations
WHEN detecting element states THEN the UI_Analyzer SHALL use visual features including color, opacity, and border style to determine enabled, disabled, or loading states
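The spatial relations named in the first criterion can be derived from bounding boxes. The sketch below assumes axis-aligned boxes with a top-left origin (y grows downward, as in screen coordinates) and treats "inside" as full containment; the Box type and the spatial_relations function are hypothetical names.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned bounding box, top-left origin, in pixels."""
    x: float
    y: float
    w: float
    h: float


def spatial_relations(a, b):
    """Return the set of relations that hold for box `a` relative to box `b`."""
    # Full containment takes precedence over directional relations.
    if (
        a.x >= b.x
        and a.y >= b.y
        and a.x + a.w <= b.x + b.w
        and a.y + a.h <= b.y + b.h
    ):
        return {"inside"}
    relations = set()
    if a.y + a.h <= b.y:
        relations.add("above")
    if a.y >= b.y + b.h:
        relations.add("below")
    if a.x + a.w <= b.x:
        relations.add("left_of")
    if a.x >= b.x + b.w:
        relations.add("right_of")
    return relations
```

A label at Box(10, 10, 50, 20) next to a field at Box(70, 10, 100, 20) yields {"left_of"}. Storing that relation lets an anchor-based fallback look for the field to the right of the label when direct matching fails.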

Requirement 6: Training Session Quality

User Story: As an RPA developer, I want feedback on training session quality, so that I can improve my demonstrations for better workflow reliability.

Acceptance Criteria

WHEN a training session is uploaded THEN the Session_Analyzer SHALL compute a quality score based on screenshot clarity, action consistency, and timing patterns
WHEN screenshots have low contrast or blur THEN the Session_Analyzer SHALL flag affected frames and suggest re-recording
WHEN action timing deviates from the session mean by more than 2 standard deviations THEN the Session_Analyzer SHALL identify potentially problematic transitions
WHEN duplicate or near-duplicate screenshots exceed 30% of session THEN the Session_Analyzer SHALL suggest optimizing capture frequency
WHEN generating quality report THEN the Session_Analyzer SHALL provide actionable recommendations for improving training data
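Two of the session checks above lend themselves to a short sketch. It reads the timing criterion as "more than 2 sample standard deviations from the session mean" and approximates duplicate detection with exact frame hashes; detecting near-duplicates would need a perceptual hash or embedding distance, which is not shown. All function names are hypothetical.

```python
import statistics


def flag_timing_outliers(durations, k=2.0):
    """Return indices of transitions whose duration is > k std devs from the mean."""
    if len(durations) < 2:
        return []
    mean = statistics.fmean(durations)
    std = statistics.stdev(durations)  # sample standard deviation
    if std == 0:
        return []
    return [i for i, d in enumerate(durations) if abs(d - mean) > k * std]


def duplicate_ratio(frame_hashes):
    """Fraction of frames that repeat an already seen hash."""
    if not frame_hashes:
        return 0.0
    return 1 - len(set(frame_hashes)) / len(frame_hashes)


def should_reduce_capture_rate(frame_hashes, limit=0.30):
    """True when duplicates exceed the 30% limit from the criteria."""
    return duplicate_ratio(frame_hashes) > limit
```

The flagged indices can feed the quality report directly, for example as "transition 9 took far longer than the others; consider re-recording this step".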

Requirement 7: Execution Robustness

User Story: As an RPA system, I want robust execution handling, so that workflows can recover from transient failures and unexpected conditions.

Acceptance Criteria

WHEN an action fails THEN the Execution_Engine SHALL retry with exponential backoff up to 3 times before marking as failed
WHEN a target element is not found THEN the Execution_Engine SHALL wait up to configured timeout with periodic re-detection before failing
WHEN screen state does not match expected post-condition THEN the Execution_Engine SHALL attempt recovery by re-matching current state to workflow graph
WHEN execution encounters an unknown screen THEN the Execution_Engine SHALL pause and request user guidance with screenshot and context
WHEN recovering from failure THEN the Execution_Engine SHALL log detailed diagnostics including screenshots, match scores, and attempted strategies
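The retry criterion can be sketched with a small helper. It assumes "up to 3 times" means 3 retries after the initial attempt, a base delay of 0.5 seconds that doubles each time, and that any exception counts as a failed action; these values and the name retry_with_backoff are illustrative assumptions. The sleep function is injectable so the helper can be exercised without real waiting.

```python
import time


def retry_with_backoff(action, max_retries=3, base_delay=0.5, sleep=time.sleep):
    """Call `action`; on failure wait base_delay * 2**attempt and retry.

    Re-raises the last exception once `max_retries` retries are exhausted.
    """
    attempt = 0
    while True:
        try:
            return action()
        except Exception:
            if attempt >= max_retries:
                raise  # caller marks the action as failed
            sleep(base_delay * 2 ** attempt)  # 0.5s, 1s, 2s with the defaults
            attempt += 1
```

A caller that needs the diagnostics required by the last criterion could wrap action so that each failure records a screenshot and the match score before the exception propagates.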
