
Dom a7de6a488b feat: functional E2E replay, 25/25 actions, 0 retries, SomEngine via server

Validated on a Windows PC (DESKTOP-58D5CAC, 2560x1600):
- 8 clicks resolved visually (1 anchor_template, 1 som_text_match, 6 som_vlm)
- Average score 0.75, average time 1.6s
- Text typed correctly (bonjour, test word, date, email)
- 0 retries, 2 unverified actions (OK)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

2026-03-31 14:04:41 +02:00



Self-Healing Workflows Design Document

Overview

This document specifies the design for an enhanced self-healing system for RPA Vision V3 that combines existing healing strategies with progressive tolerance relaxation during retries. The system enables workflows to automatically recover from failures by applying increasingly tolerant matching criteria and spatial search parameters.

Architecture

The self-healing system consists of two main integration points:

Target Resolver Healing Integration: Progressive relaxation of matching criteria based on healing attempt counter
Action Executor Retry Integration: Activation of healing mode during retry loops with exponential backoff

Core Components

┌─────────────────────────────────────────────────────────────────┐
│                    Self-Healing Architecture                     │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  ┌─────────────────┐    ┌─────────────────┐    ┌─────────────┐ │
│  │  Action         │    │  Target         │    │  Healing    │ │
│  │  Executor       │    │  Resolver       │    │  Profiles   │ │
│  │                 │    │                 │    │             │ │
│  │ • Retry Loop    │◄──►│ • healing_      │◄──►│ • min_ratio │ │
│  │ • Backoff       │    │   attempt       │    │ • pad_mul   │ │
│  │ • Counter Mgmt  │    │ • Role Aliases  │    │ • expand_   │ │
│  │                 │    │ • Fuzzy Thresh  │    │   roles     │ │
│  └─────────────────┘    └─────────────────┘    └─────────────┘ │
│           │                       │                       │     │
│           │                       │                       │     │
│  ┌─────────────────┐    ┌─────────────────┐    ┌─────────────┐ │
│  │  Existing       │    │  Spatial        │    │  Metrics    │ │
│  │  Healing        │    │  Search         │    │  Collection │ │
│  │  Strategies     │    │  Enhancement    │    │             │ │
│  │                 │    │                 │    │ • Success   │ │
│  │ • Semantic      │    │ • ROI Padding   │    │   Rates     │ │
│  │ • Spatial       │    │ • Container     │    │ • Attempt   │ │
│  │ • Timing        │    │   Detection     │    │   Counts    │ │
│  │ • Format        │    │                 │    │             │ │
│  └─────────────────┘    └─────────────────┘    └─────────────┘ │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Components and Interfaces

1. Healing Profile System

The healing profile system provides progressive tolerance based on attempt count:

from dataclasses import dataclass

@dataclass
class HealingProfile:
    """Configuration for healing attempt tolerance levels"""
    min_ratio: float         # Fuzzy matching threshold
    pad_mul: float           # Spatial padding multiplier
    expand_roles: bool       # Whether to use role aliases
    attempt_level: int       # Healing attempt level (0=strict, 1+=relaxed)

Healing Profiles by Attempt Level:

Level 0 (Normal): min_ratio=0.82, pad_mul=1.0, expand_roles=False
Level 1 (First Healing): min_ratio=0.78, pad_mul=1.3, expand_roles=True
Level 2+ (Desperate): min_ratio=0.72, pad_mul=1.7, expand_roles=True
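The level-to-profile mapping above can be sketched as a small lookup. This is a minimal illustration rather than the project's implementation: the helper name `profile_for_attempt` is hypothetical, and the dataclass is repeated only so the snippet is self-contained.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HealingProfile:
    """Configuration for healing attempt tolerance levels."""
    min_ratio: float     # Fuzzy matching threshold
    pad_mul: float       # Spatial padding multiplier
    expand_roles: bool   # Whether to use role aliases
    attempt_level: int   # Healing attempt level (0=strict, 1+=relaxed)


def profile_for_attempt(healing_attempt: int) -> HealingProfile:
    """Hypothetical helper: map an attempt counter to the documented profile."""
    if healing_attempt <= 0:
        return HealingProfile(0.82, 1.0, False, 0)
    if healing_attempt == 1:
        return HealingProfile(0.78, 1.3, True, 1)
    # Level 2 and above share the most tolerant profile.
    return HealingProfile(0.72, 1.7, True, healing_attempt)
```

Treating every level from 2 upward as the same profile keeps tolerance bounded, so a long retry sequence cannot relax matching indefinitely.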

2. Role Alias System

Semantic role expansion for more tolerant matching:

ROLE_ALIASES = {
    "input": {"input", "textfield", "text_field", "form_input", "forminput", "edit", "textbox"},
    "button": {"button", "submit", "action", "cta"},
    "label": {"label", "text", "data_display"},
    "checkbox": {"checkbox", "check_box", "toggle"},
}

TYPE_ALIASES = {
    "text_input": {"text_input", "input", "textfield"},
    "button": {"button"},
}
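One way the alias table could be consulted is sketched below. The helper names `accepted_roles` and `role_matches` are hypothetical, and the case-insensitive comparison is an assumption that the document does not state.

```python
from typing import Dict, Set

ROLE_ALIASES: Dict[str, Set[str]] = {
    "input": {"input", "textfield", "text_field", "form_input", "forminput", "edit", "textbox"},
    "button": {"button", "submit", "action", "cta"},
    "label": {"label", "text", "data_display"},
    "checkbox": {"checkbox", "check_box", "toggle"},
}


def accepted_roles(role: str, expand_roles: bool) -> Set[str]:
    """Hypothetical helper: roles an element may carry to match `role`."""
    if not expand_roles:
        return {role}
    # Unknown roles have no aliases, so they only match themselves.
    return ROLE_ALIASES.get(role, set()) | {role}


def role_matches(target_role: str, element_role: str, expand_roles: bool) -> bool:
    """Compare roles case-insensitively (an assumption of this sketch)."""
    return element_role.lower() in accepted_roles(target_role.lower(), expand_roles)
```

With `expand_roles=False` (Level 0), a target role of "button" only accepts "button"; with `expand_roles=True` it also accepts "submit", "action", and "cta".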

3. Enhanced Target Resolver

The TargetResolver class is enhanced with healing capabilities:

class TargetResolver:
    def __init__(self):
        self.healing_attempt = 0  # Healing attempt counter
        
    def _healing_profile(self) -> Dict[str, Any]:
        """Get tolerance profile based on healing attempt level"""
        
    def _find_element_by_text(self, text: str, ui_elements: List[UIElement], 
                             min_ratio: float = 0.65) -> Optional[UIElement]:
        """Enhanced with configurable fuzzy threshold"""
        
    def _resolve_by_role(self, role: str, ...) -> Optional[ResolvedTarget]:
        """Enhanced with role alias expansion"""
        
    def _build_anchor_and_roi_and_container(self, target_spec, ui_elements):
        """Enhanced with configurable padding multipliers"""

4. Enhanced Action Executor

The ActionExecutor integrates healing during retry loops:

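import time

class ActionExecutor:
    def execute_edge(self, edge: WorkflowEdge, screen_state: ScreenState) -> ExecutionResult:
        """Enhanced with healing activation during retries"""
        # NOTE: context, retries, and backoff_ms are not defined in this excerpt;
        # they are assumed to come from the executor's retry configuration.

        # Normal execution attempt (strict matching, healing_attempt == 0)
        result = self._execute_action(edge.action, screen_state, context, edge)
        if result.status == ExecutionStatus.SUCCESS:
            return result

        # If failed and retries configured, activate healing
        for i in range(retries):
            # Apply exponential backoff: backoff_ms, then 2x, 4x, ...
            time.sleep((backoff_ms * (2 ** i)) / 1000.0)

            # Activate healing attempt on resolver
            self.target_resolver.healing_attempt = i + 1

            try:
                # Retry the action directly: calling execute_edge() here would
                # re-enter this retry loop and reset healing_attempt too early
                result = self._execute_action(edge.action, screen_state, context, edge)
            finally:
                # Always reset healing attempt
                self.target_resolver.healing_attempt = 0

            if result.status == ExecutionStatus.SUCCESS:
                return result

        # All retries exhausted (or none configured): return the last failure
        return result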
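The retry loop can be exercised in isolation with stand-ins for the real classes. The sketch below is an assumption-laden illustration: `FakeResolver`, `run_with_healing`, and the injectable `sleep` parameter are hypothetical names introduced here so the backoff schedule and the counter reset can be observed without waiting.

```python
import time
from typing import Callable


class FakeResolver:
    """Stand-in for TargetResolver; only the healing counter is modelled."""

    def __init__(self) -> None:
        self.healing_attempt = 0


def run_with_healing(action: Callable[[], bool], resolver: FakeResolver,
                     retries: int, backoff_ms: float,
                     sleep: Callable[[float], None] = time.sleep) -> bool:
    """Run `action` once strictly, then up to `retries` times with healing."""
    if action():
        return True
    for i in range(retries):
        sleep((backoff_ms * (2 ** i)) / 1000.0)  # exponential backoff
        resolver.healing_attempt = i + 1         # relax matching for this try
        try:
            ok = action()
        finally:
            resolver.healing_attempt = 0         # never leak healing state
        if ok:
            return True
    return False
```

Passing `sleep=delays.append` with a plain list records the delays instead of sleeping, which makes the exponential schedule easy to assert in a unit test.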

Data Models

HealingAttemptMetrics

from dataclasses import dataclass
from datetime import datetime
from typing import Any, Dict

@dataclass
class HealingAttemptMetrics:
    """Metrics for healing attempt tracking"""
    attempt_level: int
    success: bool
    strategy_used: str
    original_criteria: Dict[str, Any]
    relaxed_criteria: Dict[str, Any]
    duration_ms: float
    timestamp: datetime
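Recorded metrics could then be aggregated into success rates per attempt level. The sketch below is only an illustration: `success_rate_by_level` is a hypothetical helper, and it takes plain `(attempt_level, success)` pairs, which in practice would be read from the `attempt_level` and `success` fields of `HealingAttemptMetrics`.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def success_rate_by_level(outcomes: Iterable[Tuple[int, bool]]) -> Dict[int, float]:
    """Aggregate (attempt_level, success) pairs into a success rate per level."""
    totals: Dict[int, int] = defaultdict(int)
    wins: Dict[int, int] = defaultdict(int)
    for level, success in outcomes:
        totals[level] += 1
        if success:
            wins[level] += 1
    # Only levels that were actually attempted appear in the result.
    return {level: wins[level] / totals[level] for level in sorted(totals)}
```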

ResolutionDetails Enhancement

The existing ResolvedTarget.resolution_details is enhanced with healing information:

resolution_details = {
    "healing_attempt": int,           # Current healing attempt level
    "healing_profile": Dict[str, Any], # Applied healing profile
    "role_aliases_used": List[str],   # Role aliases that were tried
    "fuzzy_threshold_used": float,    # Actual fuzzy threshold used
    "spatial_padding_used": float,    # Spatial padding multiplier used
}

Correctness Properties

A property is a characteristic or behavior that should hold true across all valid executions of a system. In essence, it is a formal statement about what the system should do. Properties serve as the bridge between human-readable specifications and machine-verifiable correctness guarantees.

Property 1: Healing Attempt Progression

For any target resolution that fails initially, when healing attempts are incremented, the tolerance criteria should become progressively more relaxed. Validates: Requirements 7.2, 7.3, 7.4.

Property 2: Healing Counter Reset

For any successful action execution after healing attempts, the healing attempt counter should be reset to zero. Validates: Requirements 8.4.

Property 3: Role Alias Expansion

For any role-based target resolution with healing active, the system should accept elements matching role aliases when expand_roles is true. Validates: Requirements 7.3.

Property 4: Fuzzy Threshold Relaxation

For any text-based target resolution, the fuzzy matching threshold should decrease (become more tolerant) as healing attempt level increases. Validates: Requirements 7.2.
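Property 4 can be illustrated with a small fuzzy lookup. The document does not say which similarity metric the resolver uses, so this sketch assumes the standard library's `difflib.SequenceMatcher`; the `UIElement` stand-in with a single `label` field and the public name `find_element_by_text` are likewise hypothetical.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import List, Optional


@dataclass
class UIElement:
    """Stand-in for the project's UIElement; only a text label is modelled."""
    label: str


def find_element_by_text(text: str, ui_elements: List[UIElement],
                         min_ratio: float = 0.65) -> Optional[UIElement]:
    """Return the best-matching element whose similarity reaches min_ratio."""
    best, best_ratio = None, 0.0
    for element in ui_elements:
        ratio = SequenceMatcher(None, text.lower(), element.label.lower()).ratio()
        if ratio >= min_ratio and ratio > best_ratio:
            best, best_ratio = element, ratio
    return best
```

Under this metric, "Checkout" and "Checkoff" share a six-character block out of sixteen characters in total, giving a ratio of 0.75. The pair is therefore rejected at the Level 0 and Level 1 thresholds (0.82 and 0.78) and accepted at the Level 2 threshold (0.72).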

Property 5: Spatial Padding Expansion

For any spatial search operation during healing, the padding multiplier should enlarge the search area as the healing attempt level increases. Validates: Requirements 7.4.

Property 6: Backoff Timing Consistency

For any retry sequence with healing, the delay between attempts should follow an exponential backoff pattern regardless of healing success. Validates: Requirements 8.2.

Property 7: Healing Metrics Recording

For any healing attempt, the system should record metrics including attempt level, success status, and applied criteria. Validates: Requirements 7.5.

Error Handling

Healing Failure Scenarios

Maximum Attempts Reached: Log all attempted healing profiles and strategies
Invalid Healing Configuration: Fall back to strict matching with warning
Role Alias Resolution Conflicts: Use first successful match with preference logging
Spatial Search Boundary Violations: Clamp to screen boundaries with adjustment logging
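The boundary-clamping scenario can be sketched as follows. This is a minimal illustration under stated assumptions: the `(x, y, width, height)` box layout, scaling around the box centre, and the name `pad_and_clamp` are all hypothetical, since the document does not describe how the ROI is represented.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x, y, width, height), assumed layout


def pad_and_clamp(roi: Box, pad_mul: float, screen_w: float, screen_h: float) -> Box:
    """Scale `roi` around its centre by pad_mul, then clamp to the screen."""
    x, y, w, h = roi
    cx, cy = x + w / 2, y + h / 2
    new_w, new_h = w * pad_mul, h * pad_mul
    left = max(0.0, cx - new_w / 2)
    top = max(0.0, cy - new_h / 2)
    right = min(float(screen_w), cx + new_w / 2)
    bottom = min(float(screen_h), cy + new_h / 2)
    return (left, top, right - left, bottom - top)
```

A padded box near a screen edge is truncated rather than shifted, so the search never extends outside the visible area.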

Recovery Strategies

Graceful Degradation: If healing system fails, continue with strict matching
Profile Validation: Validate healing profiles before application
Counter Synchronization: Ensure healing counter consistency across components
Metrics Resilience: Continue operation even if metrics collection fails
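Profile validation and graceful degradation might be combined as in the sketch below. The validity rules (ratio in (0, 1], padding of at least 1.0, boolean flag), the logger name, and the helper names are assumptions made for illustration; the strict fallback values are taken from the Level 0 profile.

```python
import logging
from typing import Any, Dict

logger = logging.getLogger("healing")

# Level 0 (strict) values from the profile table.
STRICT_PROFILE: Dict[str, Any] = {"min_ratio": 0.82, "pad_mul": 1.0, "expand_roles": False}


def is_valid_profile(profile: Dict[str, Any]) -> bool:
    """Hypothetical validity rules: ratio in (0, 1], padding >= 1, boolean flag."""
    try:
        return (0.0 < float(profile["min_ratio"]) <= 1.0
                and float(profile["pad_mul"]) >= 1.0
                and isinstance(profile["expand_roles"], bool))
    except (KeyError, TypeError, ValueError):
        return False


def safe_profile(profile: Dict[str, Any]) -> Dict[str, Any]:
    """Return the profile if valid, otherwise warn and fall back to strict matching."""
    if is_valid_profile(profile):
        return profile
    logger.warning("Invalid healing profile %r; falling back to strict matching", profile)
    return dict(STRICT_PROFILE)
```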

Testing Strategy

Unit Tests

Test healing profile generation for different attempt levels
Test role alias expansion logic
Test fuzzy threshold adjustment
Test spatial padding calculations
Test healing counter management

Property-Based Tests

Property 1: Healing progression tolerance verification
Property 2: Counter reset consistency
Property 3: Role alias acceptance
Property 4: Fuzzy threshold relaxation
Property 5: Spatial padding expansion
Property 6: Backoff timing verification
Property 7: Metrics recording completeness
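A property-style check for Properties 4 and 5 might look like the sketch below. It uses only the standard library's `random` module to generate level pairs; a dedicated property-based testing library could play the same role. The names `tolerance` and `check_monotonic_relaxation` are hypothetical, and the value table repeats the documented profiles so the snippet is self-contained.

```python
import random
from typing import Tuple

# Documented (min_ratio, pad_mul) per level; level 2 and above share one profile.
PROFILES = {0: (0.82, 1.0), 1: (0.78, 1.3), 2: (0.72, 1.7)}


def tolerance(level: int) -> Tuple[float, float]:
    """Return (min_ratio, pad_mul) for a healing attempt level."""
    return PROFILES[min(max(level, 0), 2)]


def check_monotonic_relaxation(trials: int = 200, seed: int = 0) -> bool:
    """Properties 4 and 5: a higher level never tightens either criterion."""
    rng = random.Random(seed)
    for _ in range(trials):
        low = rng.randint(0, 10)
        high = rng.randint(low, 10)
        ratio_low, pad_low = tolerance(low)
        ratio_high, pad_high = tolerance(high)
        if ratio_high > ratio_low or pad_high < pad_low:
            return False
    return True
```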

Integration Tests

End-to-end healing scenarios with UI changes
Multi-attempt healing sequences
Healing with existing self-healing strategies
Performance impact measurement
Cross-component healing coordination

Performance Tests

Healing overhead measurement (target: <1 ms per attempt)
Memory usage during extended healing sequences
Concurrent healing attempt handling
Cache interaction with healing profiles
