
Design Document: RPA Analytics & Insights

Overview

This design document describes the architecture and implementation of a comprehensive analytics and insights system for RPA Vision V3. The system collects execution metrics, performs real-time and historical analysis, detects anomalies, generates automated insights, and provides customizable dashboards and reports.

The analytics system is designed to be:

  • Non-intrusive: Minimal impact on workflow execution performance
  • Scalable: Handle high-volume metric collection and analysis
  • Real-time: Provide sub-second latency for live monitoring
  • Intelligent: Automatic anomaly detection and insight generation
  • Flexible: Customizable dashboards, reports, and alerts

Architecture

The following Mermaid diagram shows how data flows from collection through storage, analytics, and presentation:

graph TB
    subgraph "Data Collection"
        EC[Execution Collector]
        MC[Metrics Collector]
        RC[Resource Collector]
        Buffer[Async Buffer]
    end
    
    subgraph "Storage Layer"
        TS[Time Series DB]
        MS[Metrics Store]
        AS[Archive Storage]
    end
    
    subgraph "Analytics Engine"
        PA[Performance Analyzer]
        AA[Anomaly Detector]
        IA[Insight Generator]
        CA[Comparative Analyzer]
    end
    
    subgraph "Query & Aggregation"
        QE[Query Engine]
        AG[Aggregator]
        Cache[Query Cache]
    end
    
    subgraph "Presentation"
        API[Analytics API]
        RT[Real-time Stream]
        RG[Report Generator]
        DM[Dashboard Manager]
    end
    
    EC --> Buffer
    MC --> Buffer
    RC --> Buffer
    Buffer --> TS
    Buffer --> MS
    
    TS --> QE
    MS --> QE
    QE --> AG
    AG --> Cache
    
    QE --> PA
    QE --> AA
    QE --> IA
    QE --> CA
    
    PA --> API
    AA --> API
    IA --> API
    CA --> API
    
    API --> RT
    API --> RG
    API --> DM
    
    MS --> AS

Components and Interfaces

1. Metrics Collection (core/analytics/collection/)

A. Execution Collector

from dataclasses import dataclass, field
from datetime import datetime
from typing import Any, Dict, List, Optional, Union
import threading

@dataclass
class ExecutionMetrics:
    """Metrics for a workflow execution."""
    execution_id: str
    workflow_id: str
    started_at: datetime
    completed_at: Optional[datetime]
    duration_ms: Optional[float]
    status: str  # 'running', 'completed', 'failed'
    steps_total: int
    steps_completed: int
    steps_failed: int
    error_message: Optional[str] = None
    context: Dict[str, Any] = field(default_factory=dict)

@dataclass
class StepMetrics:
    """Metrics for a workflow step."""
    step_id: str
    execution_id: str
    workflow_id: str
    node_id: str
    action_type: str
    target_element: str
    started_at: datetime
    completed_at: datetime
    duration_ms: float
    status: str
    confidence_score: float
    retry_count: int = 0
    error_details: Optional[str] = None

class MetricsCollector:
    """Collects metrics from workflow executions."""
    
    def __init__(self, buffer_size: int = 1000, flush_interval_sec: float = 5.0):
        self.buffer_size = buffer_size
        self.flush_interval = flush_interval_sec
        self._buffer: List[Union[ExecutionMetrics, StepMetrics]] = []
        self._lock = threading.Lock()
        self._flush_thread: Optional[threading.Thread] = None
    
    def record_execution_start(self, execution_id: str, workflow_id: str) -> None:
        """Record the start of a workflow execution."""
        
    def record_execution_complete(
        self,
        execution_id: str,
        status: str,
        error_message: Optional[str] = None
    ) -> None:
        """Record the completion of a workflow execution."""
        
    def record_step(self, step_metrics: StepMetrics) -> None:
        """Record metrics for a completed step."""
        
    def flush(self) -> None:
        """Flush buffered metrics to storage."""

B. Resource Collector

@dataclass
class ResourceMetrics:
    """System resource usage metrics."""
    timestamp: datetime
    workflow_id: Optional[str]
    execution_id: Optional[str]
    cpu_percent: float
    memory_mb: float
    gpu_utilization: float
    gpu_memory_mb: float
    disk_io_mb: float

class ResourceCollector:
    """Collects system resource usage metrics."""
    
    def __init__(self, sample_interval_sec: float = 1.0):
        self.sample_interval = sample_interval_sec
        self._running = False
        self._thread: Optional[threading.Thread] = None
    
    def start(self) -> None:
        """Start collecting resource metrics."""
        
    def stop(self) -> None:
        """Stop collecting resource metrics."""
        
    def get_current_metrics(self) -> ResourceMetrics:
        """Get current resource usage."""

2. Storage Layer (core/analytics/storage/)

A. Time Series Store

class TimeSeriesStore:
    """Store for time-series metrics data."""
    
    def __init__(self, storage_path: Path):
        self.storage_path = storage_path
        # Use SQLite with time-series optimizations
        self.db_path = storage_path / 'timeseries.db'
    
    def write_metrics(self, metrics: List[Union[ExecutionMetrics, StepMetrics]]) -> None:
        """Write metrics to time-series storage."""
        
    def query_range(
        self,
        start_time: datetime,
        end_time: datetime,
        workflow_id: Optional[str] = None,
        metric_types: Optional[List[str]] = None
    ) -> List[Dict]:
        """Query metrics within a time range."""
        
    def aggregate(
        self,
        metric: str,
        aggregation: str,  # 'avg', 'sum', 'count', 'min', 'max'
        group_by: List[str],
        start_time: datetime,
        end_time: datetime,
        filters: Optional[Dict] = None
    ) -> List[Dict]:
        """Aggregate metrics with grouping."""

B. Archive Storage

class ArchiveStorage:
    """Archive storage for old metrics."""
    
    def __init__(self, storage_path: Path):
        self.storage_path = storage_path
        self.archive_path = storage_path / 'archive'
    
    def archive_data(
        self,
        data: List[Dict],
        archive_date: datetime
    ) -> str:
        """Archive data with compression."""
        
    def query_archive(
        self,
        start_date: datetime,
        end_date: datetime,
        filters: Optional[Dict] = None
    ) -> List[Dict]:
        """Query archived data."""
        
    def apply_retention_policy(
        self,
        policy: Dict[str, int]  # metric_type -> retention_days
    ) -> int:
        """Apply retention policy and return number of records deleted."""

3. Analytics Engine (core/analytics/engine/)

A. Performance Analyzer

@dataclass
class PerformanceStats:
    """Performance statistics."""
    workflow_id: str
    time_period: str
    execution_count: int
    avg_duration_ms: float
    median_duration_ms: float
    p95_duration_ms: float
    p99_duration_ms: float
    min_duration_ms: float
    max_duration_ms: float
    std_dev_ms: float
    slowest_steps: List[Dict]

class PerformanceAnalyzer:
    """Analyzes workflow performance."""
    
    def __init__(self, time_series_store: TimeSeriesStore):
        self.store = time_series_store
    
    def analyze_workflow(
        self,
        workflow_id: str,
        start_time: datetime,
        end_time: datetime
    ) -> PerformanceStats:
        """Analyze performance for a workflow."""
        
    def identify_bottlenecks(
        self,
        workflow_id: str,
        threshold_percentile: float = 0.95
    ) -> List[Dict]:
        """Identify bottleneck steps in a workflow."""
        
    def detect_performance_degradation(
        self,
        workflow_id: str,
        baseline_period: timedelta,
        current_period: timedelta,
        threshold_percent: float = 20.0
    ) -> Optional[Dict]:
        """Detect performance degradation compared to baseline."""

B. Anomaly Detector

@dataclass
class Anomaly:
    """Detected anomaly."""
    anomaly_id: str
    workflow_id: str
    metric_name: str
    detected_at: datetime
    severity: float  # 0.0 to 1.0
    deviation: float
    baseline_value: float
    actual_value: float
    description: str
    recommended_action: Optional[str] = None

class AnomalyDetector:
    """Detects anomalies in workflow execution."""
    
    def __init__(
        self,
        time_series_store: TimeSeriesStore,
        sensitivity: float = 2.0  # Standard deviations
    ):
        self.store = time_series_store
        self.sensitivity = sensitivity
        self.baselines: Dict[str, Dict] = {}
    
    def detect_anomalies(
        self,
        workflow_id: str,
        metrics: List[Dict]
    ) -> List[Anomaly]:
        """Detect anomalies in metrics."""
        
    def update_baseline(
        self,
        workflow_id: str,
        stable_period_days: int = 7
    ) -> None:
        """Update baseline from stable period."""
        
    def correlate_anomalies(
        self,
        anomalies: List[Anomaly],
        time_window_minutes: int = 30
    ) -> List[List[Anomaly]]:
        """Correlate related anomalies."""
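Since `sensitivity` is expressed in standard deviations, the core detection rule is a z-score test. The sketch below implements Properties 10 and 11 under one assumption introduced here: severity is mapped linearly from the excess z-score into the required 0.0-1.0 range, which is only one of several reasonable mappings.

```python
import statistics
from typing import Dict, List, Optional

def detect_anomaly(value: float, baseline: List[float],
                   sensitivity: float = 2.0) -> Optional[Dict[str, float]]:
    """Flag a value deviating from the baseline mean by more than
    `sensitivity` standard deviations (Property 10)."""
    mean = statistics.fmean(baseline)
    std = statistics.stdev(baseline)
    if std == 0:
        return None  # flat baseline: no spread to measure against
    z = abs(value - mean) / std
    if z <= sensitivity:
        return None
    return {
        "baseline_value": mean,
        "actual_value": value,
        "deviation": z,
        # Hypothetical mapping of excess z-score into 0.0-1.0 (Property 11).
        "severity": min(1.0, (z - sensitivity) / sensitivity),
    }

baseline = [100.0, 102.0, 98.0, 101.0, 99.0]
assert detect_anomaly(101.0, baseline) is None  # within 2 sigma: no anomaly
anomaly = detect_anomaly(150.0, baseline)       # far outside: flagged
```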

C. Insight Generator

@dataclass
class Insight:
    """Generated insight."""
    insight_id: str
    workflow_id: str
    category: str  # 'performance', 'reliability', 'resource', 'best_practice'
    title: str
    description: str
    recommendation: str
    expected_impact: str
    ease_of_implementation: str  # 'easy', 'medium', 'hard'
    priority_score: float
    supporting_data: Dict[str, Any]
    created_at: datetime

class InsightGenerator:
    """Generates automated insights."""
    
    def __init__(
        self,
        performance_analyzer: PerformanceAnalyzer,
        anomaly_detector: AnomalyDetector
    ):
        self.performance_analyzer = performance_analyzer
        self.anomaly_detector = anomaly_detector
    
    def generate_insights(
        self,
        workflow_id: str,
        analysis_period_days: int = 30
    ) -> List[Insight]:
        """Generate insights for a workflow."""
        
    def prioritize_insights(
        self,
        insights: List[Insight]
    ) -> List[Insight]:
        """Prioritize insights by impact and ease."""
        
    def track_insight_implementation(
        self,
        insight_id: str,
        implemented: bool,
        actual_impact: Optional[Dict] = None
    ) -> None:
        """Track insight implementation and measure impact."""

4. Query Engine (core/analytics/query/)

class QueryEngine:
    """Query engine for analytics data."""
    
    def __init__(
        self,
        time_series_store: TimeSeriesStore,
        archive_storage: ArchiveStorage,
        cache_size: int = 100
    ):
        self.ts_store = time_series_store
        self.archive = archive_storage
        self.cache = LRUCache(cache_size)
    
    def query(
        self,
        query: Dict[str, Any],
        use_cache: bool = True
    ) -> List[Dict]:
        """Execute a query against analytics data."""
        
    def aggregate(
        self,
        metric: str,
        aggregation: str,
        group_by: List[str],
        filters: Dict[str, Any],
        time_range: Tuple[datetime, datetime]
    ) -> List[Dict]:
        """Aggregate metrics with grouping."""
        
    def compare(
        self,
        workflow_ids: List[str],
        metrics: List[str],
        time_range: Tuple[datetime, datetime]
    ) -> Dict[str, Dict]:
        """Compare metrics across workflows."""

5. Real-time Analytics (core/analytics/realtime/)

class RealtimeAnalytics:
    """Real-time analytics for active workflows."""
    
    def __init__(self, metrics_collector: MetricsCollector):
        self.collector = metrics_collector
        self.active_executions: Dict[str, ExecutionMetrics] = {}
        self.subscribers: Dict[str, List[Callable]] = {}
    
    def track_execution(self, execution_id: str, workflow_id: str) -> None:
        """Start tracking an execution in real-time."""
        
    def update_progress(
        self,
        execution_id: str,
        current_step: int,
        total_steps: int
    ) -> None:
        """Update execution progress."""
        
    def get_live_metrics(self, execution_id: str) -> Dict[str, Any]:
        """Get live metrics for an execution."""
        
    def subscribe(
        self,
        execution_id: str,
        callback: Callable[[Dict], None]
    ) -> None:
        """Subscribe to real-time updates."""

Data Models

Metrics Schema

-- Execution metrics table
CREATE TABLE execution_metrics (
    execution_id TEXT PRIMARY KEY,
    workflow_id TEXT NOT NULL,
    started_at TIMESTAMP NOT NULL,
    completed_at TIMESTAMP,
    duration_ms REAL,
    status TEXT NOT NULL,
    steps_total INTEGER,
    steps_completed INTEGER,
    steps_failed INTEGER,
    error_message TEXT,
    context JSON
);

CREATE INDEX idx_workflow_time ON execution_metrics(workflow_id, started_at);
CREATE INDEX idx_status ON execution_metrics(status);

-- Step metrics table
CREATE TABLE step_metrics (
    step_id TEXT PRIMARY KEY,
    execution_id TEXT NOT NULL,
    workflow_id TEXT NOT NULL,
    node_id TEXT NOT NULL,
    action_type TEXT NOT NULL,
    target_element TEXT,
    started_at TIMESTAMP NOT NULL,
    completed_at TIMESTAMP NOT NULL,
    duration_ms REAL NOT NULL,
    status TEXT NOT NULL,
    confidence_score REAL,
    retry_count INTEGER DEFAULT 0,
    error_details TEXT,
    FOREIGN KEY (execution_id) REFERENCES execution_metrics(execution_id)
);

CREATE INDEX idx_execution ON step_metrics(execution_id);
CREATE INDEX idx_workflow_action ON step_metrics(workflow_id, action_type);

-- Resource metrics table
CREATE TABLE resource_metrics (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    timestamp TIMESTAMP NOT NULL,
    workflow_id TEXT,
    execution_id TEXT,
    cpu_percent REAL NOT NULL,
    memory_mb REAL NOT NULL,
    gpu_utilization REAL,
    gpu_memory_mb REAL,
    disk_io_mb REAL
);

CREATE INDEX idx_resource_time ON resource_metrics(timestamp);
CREATE INDEX idx_resource_workflow ON resource_metrics(workflow_id, timestamp);

Correctness Properties

A property is a characteristic or behavior that should hold true across all valid executions of a system: essentially, a formal statement about what the system should do. Properties serve as the bridge between human-readable specifications and machine-verifiable correctness guarantees.

Property 1: Metrics completeness

For any workflow execution, all required metrics (execution_id, workflow_id, timestamps, duration) SHALL be recorded. Validates: Requirements 1.1, 1.4

Property 2: Step metrics integrity

For any completed step, the step metrics SHALL include action_type, target_element, and the execution status. Validates: Requirements 1.2

Property 3: Failure recording completeness

For any failed execution, the failure reason, error details, and context SHALL be recorded. Validates: Requirements 1.3

Property 4: Async persistence guarantee

For any buffered metrics, they SHALL eventually be persisted to storage within the flush interval. Validates: Requirements 1.5

Property 5: Statistical accuracy

For any dataset of execution times, the calculated average, median, p95, and p99 SHALL match standard statistical definitions. Validates: Requirements 2.1

Property 6: Bottleneck identification correctness

For any workflow, the identified bottleneck steps SHALL be the steps with the highest execution times. Validates: Requirements 2.3

Property 7: Performance degradation detection

For any workflow where execution time increases above threshold, an alert SHALL be generated. Validates: Requirements 2.4

Property 8: Success rate calculation accuracy

For any set of executions, the success rate SHALL equal (successful_count / total_count) * 100. Validates: Requirements 3.1

Property 9: Failure categorization completeness

For any set of failures, all failures SHALL be assigned to a category. Validates: Requirements 3.2

Property 10: Anomaly detection sensitivity

For any metric value that deviates from baseline by more than sensitivity threshold, an anomaly SHALL be detected. Validates: Requirements 4.1

Property 11: Severity score validity

For any detected anomaly, the severity score SHALL be between 0.0 and 1.0. Validates: Requirements 4.2

Property 12: Resource tracking completeness

For any workflow execution, CPU, memory, and GPU metrics SHALL be tracked. Validates: Requirements 5.1

Property 13: Insight generation consistency

For any workflow with performance issues, at least one actionable insight SHALL be generated. Validates: Requirements 6.1

Property 14: Insight prioritization correctness

For any set of insights, they SHALL be ordered by priority_score in descending order. Validates: Requirements 6.4

Property 15: Filter application correctness

For any query with filters, only records matching all filter criteria SHALL be returned. Validates: Requirements 7.1

Property 16: Export format validity

For any report export, the output SHALL be valid according to the target format specification (PDF, CSV, JSON). Validates: Requirements 7.3

Property 17: Comparison calculation accuracy

For any two workflows being compared, the difference calculations SHALL be mathematically correct. Validates: Requirements 8.1

Property 18: Real-time latency guarantee

For any real-time metric request, the response SHALL be delivered within 1 second. Validates: Requirements 9.1

Property 19: Retention policy enforcement

For any data older than its retention period, it SHALL be archived or deleted according to policy. Validates: Requirements 10.2

Property 20: Archive data integrity

For any archived data, it SHALL be retrievable and match the original data when decompressed. Validates: Requirements 10.3

Integration Points

With Execution Loop

  • Hook into execution start/complete events
  • Collect step-level metrics during execution
  • Minimal performance impact (<1% overhead)

With Self-Healing System

  • Integrate recovery metrics
  • Track recovery success rates
  • Correlate failures with recovery attempts

With Dashboard

  • Provide REST API for metrics
  • WebSocket for real-time updates
  • Export endpoints for reports

Performance Considerations

Optimization Strategies

  1. Async Collection: Buffer metrics and persist asynchronously
  2. Query Caching: Cache frequently accessed aggregations
  3. Index Optimization: Strategic indexes on time-series data
  4. Data Partitioning: Partition by time for efficient queries
  5. Archive Strategy: Move old data to compressed archive

Scalability Targets

  • Handle 1000+ workflow executions per hour
  • Support 10,000+ steps per hour
  • Real-time queries < 1 second
  • Historical queries < 5 seconds
  • Storage growth < 1GB per month

Testing Strategy

Property-Based Testing

Use Hypothesis to test correctness properties:

  • Generate random execution data
  • Verify statistical calculations
  • Test anomaly detection with synthetic data
  • Validate query filters and aggregations
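As a dependency-free sketch of this style of test, the randomized check below exercises Property 8 (success rate accuracy) with the stdlib `random` module; with Hypothesis, the loop would be replaced by an `@given` strategy generating the counts.

```python
import random

def success_rate(successful: int, total: int) -> float:
    """Property 8: success rate == (successful / total) * 100."""
    return successful / total * 100.0

# Randomized property check over many generated inputs.
rng = random.Random(42)
for _ in range(200):
    total = rng.randint(1, 10_000)
    successful = rng.randint(0, total)
    rate = success_rate(successful, total)
    assert 0.0 <= rate <= 100.0
    assert abs(rate - successful * 100.0 / total) < 1e-9
```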

Integration Testing

  • End-to-end metric collection and analysis
  • Real-time analytics with simulated workflows
  • Archive and retention policy testing
  • Dashboard integration testing

Performance Testing

  • Load testing with high metric volume
  • Query performance benchmarking
  • Real-time latency testing
  • Storage growth monitoring