Implementation Plan
1. Set up project structure and core interfaces
- 1.1 Create core/gpu/gpu_resource_manager.py with GPUResourceManager class skeleton
- Define ExecutionMode, ModelState enums
- Define GPUResourceConfig, GPUResourceStatus, VRAMInfo dataclasses
- Implement singleton pattern
- Requirements: 2.1, 5.1, 5.2, 5.3
- 1.2 Create core/gpu/ollama_manager.py with OllamaManager class
- Implement Ollama API client (load_model, unload_model, is_model_loaded)
- Add connection health check
- Requirements: 1.1, 1.2, 1.5
- 1.3 Create core/gpu/vram_monitor.py with VRAMMonitor class
- Implement pynvml wrapper for VRAM queries
- Add fallback for systems without GPU
- Requirements: 1.4, 2.1, 6.1
- 1.4 Write property test for OllamaManager
- Property 10: ensure_vlm_loaded blocking
- Property 11: ensure_vlm_unloaded blocking
- Validates: Requirements 5.1, 5.2
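The types and singleton named in task 1.1 could be sketched as follows. This is only an illustration: the plan names the classes but not their contents, so the `ModelState` members, the `VRAMInfo` fields (assumed to be in megabytes), and the `mode` / `vlm_state` attributes are assumptions.

```python
import threading
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ExecutionMode(Enum):
    IDLE = "idle"
    RECORDING = "recording"
    AUTOPILOT = "autopilot"


class ModelState(Enum):
    # Assumed states; the plan does not enumerate them.
    UNLOADED = "unloaded"
    LOADING = "loading"
    LOADED = "loaded"
    UNLOADING = "unloading"


@dataclass(frozen=True)
class VRAMInfo:
    # Assumed fields, in megabytes.
    total_mb: int
    used_mb: int

    @property
    def free_mb(self) -> int:
        return self.total_mb - self.used_mb


class GPUResourceManager:
    _instance: Optional["GPUResourceManager"] = None
    _lock = threading.Lock()

    def __new__(cls) -> "GPUResourceManager":
        # Double-checked locking so concurrent first calls share one instance.
        if cls._instance is None:
            with cls._lock:
                if cls._instance is None:
                    instance = super().__new__(cls)
                    instance.mode = ExecutionMode.IDLE
                    instance.vlm_state = ModelState.UNLOADED
                    cls._instance = instance
        return cls._instance
```

Initializing state inside `__new__` rather than `__init__` avoids resetting the shared instance every time `GPUResourceManager()` is called.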
2. Implement VLM lifecycle management
- 2.1 Implement ensure_vlm_loaded() in GPUResourceManager
- Add async loading with timeout
- Implement retry logic (max 3 retries)
- Queue concurrent requests
- Requirements: 5.1, 5.4, 6.2
- 2.2 Implement ensure_vlm_unloaded() in GPUResourceManager
- Add async unloading with timeout
- Verify VRAM decrease
- Requirements: 5.2, 1.4
- 2.3 Write property test for VLM lifecycle
- Property 4: VRAM decrease on VLM unload
- Validates: Requirements 1.4
- 2.4 Write property test for blocking behavior
- Property 10: ensure_vlm_loaded blocking
- Property 11: ensure_vlm_unloaded blocking
- Validates: Requirements 5.1, 5.2
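The timeout and retry behavior of task 2.1 might be sketched like this. The function name `ensure_loaded`, the `load` callable, the `VLMLoadError` exception, and the 30 second default are hypothetical. "Max 3 retries" is read here as one initial attempt plus up to three retries, which is an assumption. Request queueing is left out.

```python
import asyncio
from typing import Awaitable, Callable


class VLMLoadError(RuntimeError):
    """Raised when the model could not be loaded within the retry budget."""


async def ensure_loaded(
    load: Callable[[], Awaitable[None]],
    timeout_s: float = 30.0,
    max_retries: int = 3,
) -> int:
    """Await `load` until it succeeds; return the number of attempts used."""
    # One initial attempt plus up to `max_retries` retries.
    for attempt in range(1, max_retries + 2):
        try:
            await asyncio.wait_for(load(), timeout=timeout_s)
            return attempt
        except (asyncio.TimeoutError, ConnectionError) as exc:
            if attempt > max_retries:
                raise VLMLoadError(f"load failed after {attempt} attempts") from exc
            await asyncio.sleep(0)  # yield; a real backoff delay would go here
    raise VLMLoadError("max_retries must be non-negative")
```

Because the coroutine only returns after `load` has completed, a caller that awaits it gets the blocking behavior that Properties 10 and 11 are meant to check.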
3. Implement CLIP device management
- 3.1 Create core/gpu/clip_manager.py with CLIPManager class
- Implement device detection and migration
- Add pipeline reinitialization
- Requirements: 3.1, 3.3, 3.4
- 3.2 Implement migrate_clip_to_gpu() and migrate_clip_to_cpu()
- Check VRAM availability before GPU migration
- Handle migration failures gracefully
- Requirements: 3.1, 3.2, 3.4
- 3.3 Write property test for CLIP device
- Property 12: get_clip_device validity
- Validates: Requirements 5.3
- 3.4 Write property test for embedding consistency
- Property 7: Embedding pipeline consistency
- Validates: Requirements 3.3
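The migration rules in tasks 3.1 and 3.2 can be illustrated without any ML framework. In this sketch `ClipState`, `migrate_clip`, the `generation` counter, and the 1500 MB requirement are all hypothetical. A real `CLIPManager` would move the model and rebuild the pipeline where this code only updates bookkeeping.

```python
from dataclasses import dataclass


@dataclass
class ClipState:
    device: str = "cpu"   # "cpu" or "cuda"
    generation: int = 0   # bumped whenever the embedding pipeline must be rebuilt


def migrate_clip(state: ClipState, target: str, free_vram_mb: int,
                 required_mb: int = 1500) -> bool:
    """Move CLIP to `target`; return True if it ends up on `target`."""
    if target not in ("cpu", "cuda"):
        raise ValueError(f"unknown device: {target}")
    if state.device == target:
        return True  # nothing to do, the pipeline stays valid
    if target == "cuda" and free_vram_mb < required_mb:
        return False  # not enough VRAM: stay on CPU rather than fail
    state.device = target
    state.generation += 1  # signal that the pipeline needs reinitialization
    return True
```

Returning `False` instead of raising is one way to "handle migration failures gracefully": the caller keeps a working CPU pipeline and can retry later.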
4. Checkpoint - Ensure all tests pass
- Ensure all tests pass; ask the user if questions arise.
5. Implement execution mode management
- 5.1 Implement set_execution_mode() with automatic resource management
- AUTOPILOT: unload VLM, migrate CLIP to GPU
- RECORDING: load VLM, migrate CLIP to CPU
- IDLE: no automatic changes
- Requirements: 1.1, 1.2, 1.3, 3.1, 3.2
- 5.2 Implement mode transition coordination
- Ensure CLIP migrates before VLM loads
- Handle concurrent mode changes
- Requirements: 3.2, 5.4
- 5.3 Write property test for mode transitions
- Property 1: Mode transition triggers VLM unload
- Property 2: Mode transition triggers VLM load
- Validates: Requirements 1.1, 1.2
- 5.4 Write property test for CLIP in AUTOPILOT
- Property 3: CLIP on GPU in AUTOPILOT
- Validates: Requirements 1.3, 3.1
- 5.5 Write property test for migration ordering
- Property 6: CLIP migration ordering
- Validates: Requirements 3.2
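The ordering rules from tasks 5.1 and 5.2 could be expressed as a pure planning function, which is also easy to property-test. `plan_transition` and the step names are hypothetical. The RECORDING order (CLIP migrates before the VLM loads) comes from task 5.2. The AUTOPILOT order simply follows the order in which task 5.1 lists the operations, which is an assumption.

```python
from enum import Enum
from typing import List


class ExecutionMode(Enum):
    IDLE = "idle"
    RECORDING = "recording"
    AUTOPILOT = "autopilot"


def plan_transition(mode: ExecutionMode, vlm_loaded: bool, clip_device: str) -> List[str]:
    """Return the ordered resource operations needed to enter `mode`."""
    steps: List[str] = []
    if mode is ExecutionMode.AUTOPILOT:
        # Free VRAM first, then hand it to CLIP.
        if vlm_loaded:
            steps.append("unload_vlm")
        if clip_device != "cuda":
            steps.append("migrate_clip_to_gpu")
    elif mode is ExecutionMode.RECORDING:
        # CLIP must leave the GPU before the VLM is loaded.
        if clip_device != "cpu":
            steps.append("migrate_clip_to_cpu")
        if not vlm_loaded:
            steps.append("load_vlm")
    # IDLE: no automatic changes.
    return steps
```

For example, entering RECORDING with CLIP on the GPU and no VLM loaded yields `["migrate_clip_to_cpu", "load_vlm"]`, while entering IDLE always yields an empty plan.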
6. Implement idle timeout management
- 6.1 Add idle timeout tracking in GPUResourceManager
- Track last VLM request timestamp
- Implement background timer for timeout check
- Requirements: 4.1, 4.3
- 6.2 Implement on-demand VLM loading
- Intercept VLM requests when unloaded
- Load VLM before processing request
- Requirements: 4.2
- 6.3 Write property test for idle timeout
- Property 8: Idle timeout behavior
- Validates: Requirements 4.1, 4.3
- 6.4 Write property test for on-demand loading
- Property 9: On-demand VLM loading
- Validates: Requirements 4.2
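The timestamp tracking from task 6.1 might look like the sketch below. `IdleTracker` and its method names are hypothetical, and the background timer is left out: a periodic task would call `should_unload()` and trigger the unload. The clock is injectable so the behavior can be tested without waiting.

```python
import time
from typing import Callable


class IdleTracker:
    def __init__(self, timeout_s: float, clock: Callable[[], float] = time.monotonic):
        self._timeout_s = timeout_s
        self._clock = clock
        self._last_request = clock()

    def touch(self) -> None:
        """Record a VLM request; call this on every request."""
        self._last_request = self._clock()

    def should_unload(self) -> bool:
        """True once no request has arrived for at least `timeout_s` seconds."""
        return self._clock() - self._last_request >= self._timeout_s
```

`time.monotonic` is used as the default clock because it is not affected by system clock adjustments, which matters for a timer that may run for hours.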
7. Implement monitoring and events
- 7.1 Implement get_status() returning complete GPUResourceStatus
- Include all fields: vram, vlm_state, clip_device, execution_mode
- Requirements: 2.1
- 7.2 Implement event emission system
- resource_changed, mode_changed, idle_unload events
- VRAM change threshold detection (100 MB)
- Requirements: 2.2, 2.3, 4.4
- 7.3 Write property test for status completeness
- Property 5: Status query completeness
- Validates: Requirements 2.1
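The 100 MB change threshold from task 7.2 can be sketched as a small notifier. `VRAMChangeNotifier` and its methods are hypothetical. The sketch assumes that a change of exactly 100 MB counts as significant and that changes are measured against the last reported value, not the last reading; the plan specifies neither.

```python
from typing import Callable, List, Optional


class VRAMChangeNotifier:
    def __init__(self, threshold_mb: int = 100):
        self._threshold_mb = threshold_mb
        self._last_reported_mb: Optional[int] = None
        self._listeners: List[Callable[[int, int], None]] = []

    def subscribe(self, listener: Callable[[int, int], None]) -> None:
        self._listeners.append(listener)

    def update(self, used_mb: int) -> bool:
        """Feed a new reading; emit and return True if the change is large enough."""
        previous = self._last_reported_mb
        if previous is None:
            self._last_reported_mb = used_mb  # first reading is the baseline
            return False
        if abs(used_mb - previous) < self._threshold_mb:
            return False
        self._last_reported_mb = used_mb
        for listener in self._listeners:
            listener(previous, used_mb)
        return True
```

Comparing against the last reported value means a slow drift still produces an event once it accumulates to the threshold, whereas comparing consecutive readings would miss it.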
8. Implement error handling and degraded mode
- 8.1 Implement graceful degradation for missing GPU
- Detect GPU availability at startup
- Force CPU-only mode if no GPU
- Requirements: 6.1
- 8.2 Implement Ollama unavailable handling
- Connection retry logic
- Degraded mode flag and reason
- Requirements: 1.5, 6.2, 6.3
- 8.3 Implement VRAM insufficient error handling
- Check VRAM before operations
- Return informative errors
- Requirements: 6.4
- 8.4 Write property test for sequential processing
- Property 13: Sequential operation processing
- Validates: Requirements 5.4
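GPU detection with a CPU-only fallback (tasks 1.3 and 8.1) could be approached as in this sketch. `query_vram_mb` and `startup_mode` are hypothetical helpers, and the degraded-mode reason string is invented for illustration. The `pynvml` calls used are `nvmlInit`, `nvmlDeviceGetHandleByIndex`, `nvmlDeviceGetMemoryInfo`, and `nvmlShutdown`.

```python
from typing import Optional, Tuple


def query_vram_mb(device_index: int = 0) -> Optional[Tuple[int, int]]:
    """Return (total_mb, used_mb) for one GPU, or None if no GPU is usable."""
    try:
        import pynvml  # optional dependency; missing on CPU-only machines
    except ImportError:
        return None
    try:
        pynvml.nvmlInit()
    except Exception:
        return None  # driver missing or no NVIDIA device
    try:
        handle = pynvml.nvmlDeviceGetHandleByIndex(device_index)
        mem = pynvml.nvmlDeviceGetMemoryInfo(handle)  # values are in bytes
        return mem.total // (1024 * 1024), mem.used // (1024 * 1024)
    except Exception:
        return None
    finally:
        pynvml.nvmlShutdown()


def startup_mode() -> Tuple[bool, Optional[str]]:
    """Return (degraded, reason) based on GPU availability at startup."""
    if query_vram_mb() is None:
        return True, "no usable NVIDIA GPU detected; running CPU-only"
    return False, None
```

Importing `pynvml` inside the function keeps the module importable on machines where the package is not installed, so the fallback path needs no special setup.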
9. Checkpoint - Ensure all tests pass
- Ensure all tests pass; ask the user if questions arise.
10. Integration with existing components
- 10.1 Integrate GPUResourceManager with ExecutionLoop
- Call set_execution_mode() on mode changes
- Use ensure_vlm_loaded() before VLM operations
- Requirements: 1.1, 1.2, 4.2
- 10.2 Integrate with UIDetector
- Check VLM availability before classification
- Handle degraded mode gracefully
- Requirements: 1.5, 6.2
- 10.3 Integrate with FusionEngine/CLIP embedding
- Use CLIPManager for device-aware embeddings
- Reinitialize on device change
- Requirements: 3.3
- 10.4 Update core/config.py with GPU resource configuration
- Add GPUResourceConfig to AppConfig
- Support environment variables
- Requirements: 4.3
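Environment-variable support for task 10.4 might be sketched as below. The field names, the default values (apart from the 100 MB threshold and 3 retries mentioned in the plan), and the `GPU_*` variable names are all hypothetical; the plan only states that `GPUResourceConfig` should be configurable this way.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class GPUResourceConfig:
    idle_timeout_s: float = 300.0        # assumed default
    vram_change_threshold_mb: int = 100  # threshold from task 7.2
    max_load_retries: int = 3            # retry limit from task 2.1

    @classmethod
    def from_env(cls, env: Mapping[str, str] = os.environ) -> "GPUResourceConfig":
        """Build a config, letting environment variables override defaults."""
        d = cls()
        return cls(
            idle_timeout_s=float(env.get("GPU_IDLE_TIMEOUT_S", d.idle_timeout_s)),
            vram_change_threshold_mb=int(
                env.get("GPU_VRAM_THRESHOLD_MB", d.vram_change_threshold_mb)
            ),
            max_load_retries=int(env.get("GPU_MAX_LOAD_RETRIES", d.max_load_retries)),
        )
```

Accepting any mapping rather than reading `os.environ` directly lets tests pass a plain dictionary, for example `GPUResourceConfig.from_env({"GPU_IDLE_TIMEOUT_S": "60"})`.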
11. Final Checkpoint - Ensure all tests pass
- Ensure all tests pass; ask the user if questions arise.