rpa_vision_v3/.kiro/specs/gpu-resource-manager/tasks.md
Dom a7de6a488b feat: functional E2E replay (25/25 actions, 0 retries, SomEngine via server)
Validated on a Windows PC (DESKTOP-58D5CAC, 2560x1600):
- 8 clicks resolved visually (1 anchor_template, 1 som_text_match, 6 som_vlm)
- Mean score 0.75, mean time 1.6 s
- Text typed correctly (bonjour, test word, date, email)
- 0 retries, 2 unverified actions (OK)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 14:04:41 +02:00


Implementation Plan

  • 1. Set up project structure and core interfaces

    • 1.1 Create core/gpu/gpu_resource_manager.py with GPUResourceManager class skeleton
      • Define ExecutionMode, ModelState enums
      • Define GPUResourceConfig, GPUResourceStatus, VRAMInfo dataclasses
      • Implement singleton pattern
      • Requirements: 2.1, 5.1, 5.2, 5.3
    • 1.2 Create core/gpu/ollama_manager.py with OllamaManager class
      • Implement Ollama API client (load_model, unload_model, is_model_loaded)
      • Add connection health check
      • Requirements: 1.1, 1.2, 1.5
    • 1.3 Create core/gpu/vram_monitor.py with VRAMMonitor class
      • Implement pynvml wrapper for VRAM queries
      • Add fallback for systems without GPU
      • Requirements: 1.4, 2.1, 6.1
    • 1.4 Write property test for OllamaManager
      • Property 10: ensure_vlm_loaded blocking
      • Property 11: ensure_vlm_unloaded blocking
      • Validates: Requirements 5.1, 5.2
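
As a rough illustration of task 1.1, the skeleton below shows one way the enums, dataclasses and singleton could fit together. The class names and the four status fields come from this plan. The `ModelState` members, the config fields and their defaults, and the `get_instance()` accessor are assumptions made for the sketch, not the project's confirmed design.

```python
from __future__ import annotations

import threading
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ExecutionMode(Enum):
    IDLE = "idle"
    RECORDING = "recording"
    AUTOPILOT = "autopilot"


class ModelState(Enum):
    # Assumed members: the plan names the enum but not its values.
    UNLOADED = "unloaded"
    LOADING = "loading"
    LOADED = "loaded"
    ERROR = "error"


@dataclass(frozen=True)
class VRAMInfo:
    total_mb: int
    used_mb: int
    free_mb: int


@dataclass
class GPUResourceConfig:
    # Assumed fields and defaults, for illustration only.
    idle_timeout_s: float = 300.0
    max_load_retries: int = 3


@dataclass
class GPUResourceStatus:
    vram: Optional[VRAMInfo]
    vlm_state: ModelState
    clip_device: str
    execution_mode: ExecutionMode


class GPUResourceManager:
    _instance: Optional["GPUResourceManager"] = None
    _instance_lock = threading.Lock()

    def __init__(self, config: Optional[GPUResourceConfig] = None) -> None:
        self.config = config or GPUResourceConfig()
        self.execution_mode = ExecutionMode.IDLE
        self.vlm_state = ModelState.UNLOADED
        self.clip_device = "cpu"

    @classmethod
    def get_instance(cls) -> "GPUResourceManager":
        # The lock keeps two threads from creating two managers at startup.
        with cls._instance_lock:
            if cls._instance is None:
                cls._instance = cls()
            return cls._instance

    def get_status(self, vram: Optional[VRAMInfo] = None) -> GPUResourceStatus:
        return GPUResourceStatus(
            vram=vram,
            vlm_state=self.vlm_state,
            clip_device=self.clip_device,
            execution_mode=self.execution_mode,
        )
```

A classmethod accessor is used here instead of overriding `__new__` so that tests can still build isolated instances with `GPUResourceManager()`.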
  • 2. Implement VLM lifecycle management

    • 2.1 Implement ensure_vlm_loaded() in GPUResourceManager
      • Add async loading with timeout
      • Implement retry logic (max 3 retries)
      • Queue concurrent requests
      • Requirements: 5.1, 5.4, 6.2
    • 2.2 Implement ensure_vlm_unloaded() in GPUResourceManager
      • Add async unloading with timeout
      • Verify VRAM decrease
      • Requirements: 5.2, 1.4
    • 2.3 Write property test for VLM lifecycle
      • Property 4: VRAM decrease on VLM unload
      • Validates: Requirements 1.4
    • 2.4 Write property test for blocking behavior
      • Property 10: ensure_vlm_loaded blocking
      • Property 11: ensure_vlm_unloaded blocking
      • Validates: Requirements 5.1, 5.2
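
The loading behaviour described in task 2.1 (timeout, retries, queued concurrent requests) could be sketched as follows. `VLMLoader`, `VLMLoadError` and the injected `load_model` coroutine are hypothetical names for this example. It reads "max 3 retries" as one initial attempt plus up to three retries, which is an interpretation rather than something the plan states.

```python
import asyncio
from typing import Awaitable, Callable, Optional


class VLMLoadError(RuntimeError):
    """Raised when the model could not be loaded within the retry budget."""


class VLMLoader:
    def __init__(
        self,
        load_model: Callable[[], Awaitable[None]],  # hypothetical backend call
        timeout_s: float = 30.0,
        max_retries: int = 3,
        retry_delay_s: float = 0.0,
    ) -> None:
        self._load_model = load_model
        self._timeout_s = timeout_s
        self._max_retries = max_retries
        self._retry_delay_s = retry_delay_s
        self._lock = asyncio.Lock()
        self.loaded = False

    async def ensure_vlm_loaded(self) -> None:
        # The lock queues concurrent callers: the first one performs the
        # load, the others wait and then return early because loaded is True.
        async with self._lock:
            if self.loaded:
                return
            last_error: Optional[BaseException] = None
            for _attempt in range(self._max_retries + 1):
                try:
                    await asyncio.wait_for(self._load_model(), timeout=self._timeout_s)
                    self.loaded = True
                    return
                except (asyncio.TimeoutError, OSError) as exc:
                    last_error = exc
                    await asyncio.sleep(self._retry_delay_s)
            raise VLMLoadError(
                f"VLM failed to load after {self._max_retries + 1} attempts"
            ) from last_error
```

The call only returns once the model is loaded or the retry budget is exhausted, which is the blocking behaviour that Properties 10 and 11 are meant to check.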
  • 3. Implement CLIP device management

    • 3.1 Create core/gpu/clip_manager.py with CLIPManager class
      • Implement device detection and migration
      • Add pipeline reinitialization
      • Requirements: 3.1, 3.3, 3.4
    • 3.2 Implement migrate_clip_to_gpu() and migrate_clip_to_cpu()
      • Check VRAM availability before GPU migration
      • Handle migration failures gracefully
      • Requirements: 3.1, 3.2, 3.4
    • 3.3 Write property test for CLIP device
      • Property 12: get_clip_device validity
      • Validates: Requirements 5.3
    • 3.4 Write property test for embedding consistency
      • Property 7: Embedding pipeline consistency
      • Validates: Requirements 3.3
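
Tasks 3.1 and 3.2 could take roughly the following shape. This is a framework-agnostic sketch: `move_model` and `free_vram_mb` are hypothetical callables injected by the caller, `MigrationResult` and the 1024 MB requirement are invented for the example, and the plan does not say which ML framework actually moves the CLIP model.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class MigrationResult:
    ok: bool
    device: str
    reason: str = ""


class CLIPManager:
    def __init__(
        self,
        move_model: Callable[[str], None],          # hypothetical: moves the model
        free_vram_mb: Callable[[], Optional[int]],  # hypothetical: None means no GPU
        required_vram_mb: int = 1024,               # assumed footprint
    ) -> None:
        self._move_model = move_model
        self._free_vram_mb = free_vram_mb
        self._required_vram_mb = required_vram_mb
        self.device = "cpu"

    def get_clip_device(self) -> str:
        return self.device

    def migrate_clip_to_gpu(self) -> MigrationResult:
        # Check VRAM first so a doomed migration is never started.
        free = self._free_vram_mb()
        if free is None:
            return MigrationResult(False, self.device, "no GPU detected")
        if free < self._required_vram_mb:
            return MigrationResult(
                False,
                self.device,
                f"need {self._required_vram_mb} MB, only {free} MB free",
            )
        return self._migrate("cuda")

    def migrate_clip_to_cpu(self) -> MigrationResult:
        return self._migrate("cpu")

    def _migrate(self, target: str) -> MigrationResult:
        if self.device == target:
            return MigrationResult(True, target)
        try:
            self._move_model(target)
        except RuntimeError as exc:
            # Graceful failure: keep the previous device and report why.
            return MigrationResult(False, self.device, str(exc))
        self.device = target
        return MigrationResult(True, target)
```

Returning a result object instead of raising lets the caller fall back to the CPU without wrapping every migration in a `try` block.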
  • 4. Checkpoint - Ensure all tests pass

    • Ensure all tests pass, ask the user if questions arise.
  • 5. Implement execution mode management

    • 5.1 Implement set_execution_mode() with automatic resource management
      • AUTOPILOT: unload VLM, migrate CLIP to GPU
      • RECORDING: load VLM, migrate CLIP to CPU
      • IDLE: no automatic changes
      • Requirements: 1.1, 1.2, 1.3, 3.1, 3.2
    • 5.2 Implement mode transition coordination
      • Ensure CLIP migrates before VLM loads
      • Handle concurrent mode changes
      • Requirements: 3.2, 5.4
    • 5.3 Write property test for mode transitions
      • Property 1: Mode transition triggers VLM unload
      • Property 2: Mode transition triggers VLM load
      • Validates: Requirements 1.1, 1.2
    • 5.4 Write property test for CLIP in AUTOPILOT
      • Property 3: CLIP on GPU in AUTOPILOT
      • Validates: Requirements 1.3, 3.1
    • 5.5 Write property test for migration ordering
      • Property 6: CLIP migration ordering
      • Validates: Requirements 3.2
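
The ordering rules in tasks 5.1 and 5.2 can be made concrete with a small coordinator. `ModeCoordinator` and its four injected coroutines are hypothetical. The sketch only encodes what the plan states: AUTOPILOT unloads the VLM and moves CLIP to the GPU, RECORDING moves CLIP to the CPU before the VLM loads, and IDLE changes nothing.

```python
import asyncio
from enum import Enum
from typing import Awaitable, Callable

Step = Callable[[], Awaitable[None]]


class ExecutionMode(Enum):
    IDLE = "idle"
    RECORDING = "recording"
    AUTOPILOT = "autopilot"


class ModeCoordinator:
    def __init__(
        self, load_vlm: Step, unload_vlm: Step, clip_to_gpu: Step, clip_to_cpu: Step
    ) -> None:
        self._load_vlm = load_vlm
        self._unload_vlm = unload_vlm
        self._clip_to_gpu = clip_to_gpu
        self._clip_to_cpu = clip_to_cpu
        self._lock = asyncio.Lock()
        self.mode = ExecutionMode.IDLE

    async def set_execution_mode(self, mode: ExecutionMode) -> None:
        # One transition at a time: concurrent mode changes are serialized.
        async with self._lock:
            if mode is ExecutionMode.AUTOPILOT:
                # Free VRAM first, then give it to CLIP.
                await self._unload_vlm()
                await self._clip_to_gpu()
            elif mode is ExecutionMode.RECORDING:
                # CLIP must leave the GPU before the VLM is loaded.
                await self._clip_to_cpu()
                await self._load_vlm()
            # IDLE: no automatic resource changes.
            self.mode = mode
```

Recording the order of calls with stub coroutines, as a property test for Property 6 might, is enough to check the migration ordering without a GPU.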
  • 6. Implement idle timeout management

    • 6.1 Add idle timeout tracking in GPUResourceManager
      • Track last VLM request timestamp
      • Implement background timer for timeout check
      • Requirements: 4.1, 4.3
    • 6.2 Implement on-demand VLM loading
      • Intercept VLM requests when unloaded
      • Load VLM before processing request
      • Requirements: 4.2
    • 6.3 Write property test for idle timeout
      • Property 8: Idle timeout behavior
      • Validates: Requirements 4.1, 4.3
    • 6.4 Write property test for on-demand loading
      • Property 9: On-demand VLM loading
      • Validates: Requirements 4.2
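
Idle tracking and on-demand loading (tasks 6.1 and 6.2) could be combined along these lines. `IdleTracker` and `OnDemandVLM` are hypothetical names, and the injectable clock is an assumption added so the timeout can be tested without sleeping. In a real manager `check_idle()` would be driven by the background timer mentioned above.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class IdleTracker:
    def __init__(self, timeout_s: float, clock: Callable[[], float] = time.monotonic) -> None:
        self._timeout_s = timeout_s
        self._clock = clock
        self._last_request = clock()

    def touch(self) -> None:
        """Record that a VLM request just happened."""
        self._last_request = self._clock()

    def is_idle(self) -> bool:
        return self._clock() - self._last_request >= self._timeout_s


class OnDemandVLM:
    def __init__(
        self, tracker: IdleTracker, load: Callable[[], None], unload: Callable[[], None]
    ) -> None:
        self._tracker = tracker
        self._load = load
        self._unload = unload
        self.loaded = False

    def request(self, run: Callable[[], T]) -> T:
        # Intercept the request: load the model first if it was unloaded.
        if not self.loaded:
            self._load()
            self.loaded = True
        self._tracker.touch()
        return run()

    def check_idle(self) -> bool:
        """Unload the model if the timeout elapsed. Returns True if it unloaded."""
        if self.loaded and self._tracker.is_idle():
            self._unload()
            self.loaded = False
            return True
        return False
```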
  • 7. Implement monitoring and events

    • 7.1 Implement get_status() returning complete GPUResourceStatus
      • Include all fields: vram, vlm_state, clip_device, execution_mode
      • Requirements: 2.1
    • 7.2 Implement event emission system
      • resource_changed, mode_changed, idle_unload events
      • VRAM change threshold detection (100 MB)
      • Requirements: 2.2, 2.3, 4.4
    • 7.3 Write property test for status completeness
      • Property 5: Status query completeness
      • Validates: Requirements 2.1
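
For task 7.2, a minimal event emitter with the 100 MB change threshold might look like this. The `resource_changed` event name and the threshold come from the plan. `EventEmitter`, `VRAMChangeDetector`, the payload keys, and the choice to compare against the last reported value (rather than the previous sample) are assumptions.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, Dict, List, Optional

VRAM_CHANGE_THRESHOLD_MB = 100

Handler = Callable[[Dict[str, Any]], None]


class EventEmitter:
    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def on(self, event: str, handler: Handler) -> None:
        self._handlers[event].append(handler)

    def emit(self, event: str, **payload: Any) -> None:
        # Iterate over a copy so handlers may subscribe during dispatch.
        for handler in list(self._handlers[event]):
            handler(payload)


class VRAMChangeDetector:
    def __init__(
        self, emitter: EventEmitter, threshold_mb: int = VRAM_CHANGE_THRESHOLD_MB
    ) -> None:
        self._emitter = emitter
        self._threshold_mb = threshold_mb
        self._last_reported_mb: Optional[int] = None

    def observe(self, used_mb: int) -> bool:
        """Feed a VRAM sample. Emits resource_changed on a large enough change."""
        if self._last_reported_mb is None:
            self._last_reported_mb = used_mb  # first sample is the baseline
            return False
        delta = used_mb - self._last_reported_mb
        if abs(delta) < self._threshold_mb:
            return False
        self._last_reported_mb = used_mb
        self._emitter.emit("resource_changed", used_mb=used_mb, delta_mb=delta)
        return True
```

Comparing against the last reported value means slow drift is still reported once it adds up to 100 MB, instead of being hidden by many small steps.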
  • 8. Implement error handling and degraded mode

    • 8.1 Implement graceful degradation for missing GPU
      • Detect GPU availability at startup
      • Force CPU-only mode if no GPU
      • Requirements: 6.1
    • 8.2 Implement handling for Ollama unavailability
      • Connection retry logic
      • Degraded mode flag and reason
      • Requirements: 1.5, 6.2, 6.3
    • 8.3 Implement insufficient-VRAM error handling
      • Check VRAM before operations
      • Return informative errors
      • Requirements: 6.4
    • 8.4 Write property test for sequential processing
      • Property 13: Sequential operation processing
      • Validates: Requirements 5.4
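
The GPU detection and VRAM checks in tasks 8.1 and 8.3 (and the pynvml wrapper from task 1.3) could be sketched as below. The `pynvml` calls used are `nvmlInit`, `nvmlDeviceGetHandleByIndex`, `nvmlDeviceGetMemoryInfo` and `nvmlShutdown`. `query_vram`, `require_vram`, `InsufficientVRAMError` and the optional `nvml` parameter (there so a fake module can be injected in tests) are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Optional

MB = 1024 * 1024


@dataclass(frozen=True)
class VRAMInfo:
    total_mb: int
    used_mb: int
    free_mb: int


class InsufficientVRAMError(RuntimeError):
    """Raised when an operation needs more VRAM than is currently free."""


def query_vram(nvml: Any = None, device_index: int = 0) -> Optional[VRAMInfo]:
    """Return VRAM figures, or None when no usable NVIDIA GPU is found."""
    try:
        if nvml is None:
            import pynvml as nvml  # optional dependency; may be missing
        nvml.nvmlInit()
    except Exception:
        # Missing package or missing driver: fall back to CPU-only mode.
        return None
    try:
        handle = nvml.nvmlDeviceGetHandleByIndex(device_index)
        mem = nvml.nvmlDeviceGetMemoryInfo(handle)  # values are in bytes
        return VRAMInfo(mem.total // MB, mem.used // MB, mem.free // MB)
    except Exception:
        return None
    finally:
        try:
            nvml.nvmlShutdown()
        except Exception:
            pass


def require_vram(vram: Optional[VRAMInfo], needed_mb: int, operation: str) -> None:
    """Raise an informative error before starting an operation that cannot fit."""
    if vram is None:
        raise InsufficientVRAMError(f"{operation}: no GPU available")
    if vram.free_mb < needed_mb:
        raise InsufficientVRAMError(
            f"{operation}: needs {needed_mb} MB, only {vram.free_mb} MB free"
        )
```

A `None` result from `query_vram()` at startup is a natural trigger for the CPU-only mode described in task 8.1.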
  • 9. Checkpoint - Ensure all tests pass

    • Ensure all tests pass, ask the user if questions arise.
  • 10. Integration with existing components

    • 10.1 Integrate GPUResourceManager with ExecutionLoop
      • Call set_execution_mode() on mode changes
      • Use ensure_vlm_loaded() before VLM operations
      • Requirements: 1.1, 1.2, 4.2
    • 10.2 Integrate with UIDetector
      • Check VLM availability before classification
      • Handle degraded mode gracefully
      • Requirements: 1.5, 6.2
    • 10.3 Integrate with FusionEngine/CLIP embedding
      • Use CLIPManager for device-aware embeddings
      • Reinitialize on device change
      • Requirements: 3.3
    • 10.4 Update core/config.py with GPU resource configuration
      • Add GPUResourceConfig to AppConfig
      • Support environment variables
      • Requirements: 4.3
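
Environment-variable support for task 10.4 might be as simple as the loader below. The variable names (`GPU_IDLE_TIMEOUT_S`, `GPU_VRAM_THRESHOLD_MB`, `GPU_MAX_LOAD_RETRIES`), the field names and the 300-second default are invented for this sketch. Only the 100 MB threshold and the three retries appear in the plan.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class GPUResourceConfig:
    idle_timeout_s: float = 300.0        # assumed default
    vram_change_threshold_mb: int = 100  # from task 7.2
    max_load_retries: int = 3            # from task 2.1


def load_gpu_config(env: Mapping[str, str] = os.environ) -> GPUResourceConfig:
    """Build the config from environment variables, falling back to defaults."""
    defaults = GPUResourceConfig()
    return GPUResourceConfig(
        idle_timeout_s=float(env.get("GPU_IDLE_TIMEOUT_S", defaults.idle_timeout_s)),
        vram_change_threshold_mb=int(
            env.get("GPU_VRAM_THRESHOLD_MB", defaults.vram_change_threshold_mb)
        ),
        max_load_retries=int(env.get("GPU_MAX_LOAD_RETRIES", defaults.max_load_retries)),
    )
```

Passing the mapping explicitly, for example `load_gpu_config({"GPU_IDLE_TIMEOUT_S": "45"})`, keeps tests independent of the real process environment.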
  • 11. Final Checkpoint - Ensure all tests pass

    • Ensure all tests pass, ask the user if questions arise.