feat: Windows visual replay now operational (template matching + complete VWB)
- "Windows" button in VWB to run on the remote PC
- Multi-scale OpenCV template matching to locate visual anchors
- VWB→streaming server proxy with anchor loading (thumb, not full)
- Fix Windows executor: lazy mss, result reporting, debug prints
- Fix permanent replay polling (without an active session)
- VWB→executor type mapping (click_anchor→click, type_text→type)
- CORS on the streaming server, Windows capture in VWB
- Client-side heartbeat dedup (perceptual hash)
- Cloud VLM mode configurable via RPA_VLM_MODEL
- Fix resolve_target: no ScreenAnalyzer fallback (too slow)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
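The heartbeat dedup item is not part of the diff shown on this page, so the following is only a minimal sketch of the general idea, not the client's actual code. It assumes an average hash (one common perceptual hash) computed over a frame that has already been converted to rows of 0-255 grayscale values; the names `average_hash`, `hamming`, `HeartbeatDeduper` and the `max_distance` threshold are all hypothetical.

```python
from typing import List, Optional, Sequence

Gray = Sequence[Sequence[int]]  # rows of 0-255 grayscale pixel values (assumed input format)


def average_hash(pixels: Gray, size: int = 8) -> int:
    """Return a size*size-bit average hash of a non-empty grayscale frame."""
    height, width = len(pixels), len(pixels[0])
    cells: List[float] = []
    for gy in range(size):
        y0 = gy * height // size
        y1 = max((gy + 1) * height // size, y0 + 1)  # at least one row per cell
        for gx in range(size):
            x0 = gx * width // size
            x1 = max((gx + 1) * width // size, x0 + 1)  # at least one column per cell
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        # One bit per cell: 1 if the cell is brighter than the frame average.
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


class HeartbeatDeduper:
    """Hypothetical helper: skip heartbeats whose frame looks like the last one sent."""

    def __init__(self, max_distance: int = 4) -> None:
        self.max_distance = max_distance  # assumed threshold, tune per use case
        self._last_sent: Optional[int] = None

    def should_send(self, pixels: Gray) -> bool:
        current = average_hash(pixels)
        if self._last_sent is not None and hamming(current, self._last_sent) <= self.max_distance:
            return False
        # Compare against the last frame actually sent, so slow drift is not missed.
        self._last_sent = current
        return True
```

A caller would invoke `should_send(frame)` before each heartbeat and drop the upload when it returns `False`. Comparing against the last sent hash rather than the last seen one is a design choice of this sketch.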
@@ -321,10 +321,21 @@ Respond with just the role name, nothing else."""
 "confidence": 0.3, "success": True,
 }

-prompt = """Classify this UI element. Reply with ONLY a JSON object.
+# The system prompt constrains qwen3-vl's thinking and drastically
+# reduces the number of tokens wasted on internal reasoning.
+# Without a system prompt, the model thinks for 500-800 tokens and exhausts the budget.
+# With it, the model thinks for only 100-400 tokens and produces reliable JSON.
+system_prompt = "You are a JSON-only UI classifier. No thinking. No explanation. Output raw JSON only."

+prompt = """Classify this UI element. Reply with ONLY a JSON object, nothing else.

 Types: button, text_input, checkbox, radio, dropdown, tab, link, icon, table_row, menu_item
 Roles: primary_action, cancel, submit, form_input, search_field, navigation, settings, close, delete, edit, save
-Example: {"type": "button", "role": "submit", "text": "OK"}

+Example 1: {"type": "button", "role": "submit", "text": "OK"}
+Example 2: {"type": "text_input", "role": "form_input", "text": ""}
+Example 3: {"type": "icon", "role": "close", "text": "X"}

 Your answer:"""

 # Retry once if the response is empty
@@ -332,8 +343,9 @@ Your answer:"""
 result = self.generate(
     prompt,
     image=element_image,
+    system_prompt=system_prompt,
     temperature=0.1,
-    max_tokens=200,
+    max_tokens=300,
     force_json=False
 )
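Because the call above passes `force_json=False`, the reply is free text and may contain leftover reasoning before the JSON object. The following is a minimal sketch of how a caller might extract and validate that object, with a single retry. It is not the code from this commit: `parse_classification`, `classify_with_retry` and the `generate` callable (assumed to return the raw reply text) are hypothetical names, and unlike the source comment, which mentions retrying only on an empty response, this sketch also retries when the reply cannot be parsed.

```python
import json
from typing import Callable, Optional

# Allowed values copied from the prompt in the diff above.
ALLOWED_TYPES = ("button", "text_input", "checkbox", "radio", "dropdown",
                 "tab", "link", "icon", "table_row", "menu_item")
ALLOWED_ROLES = ("primary_action", "cancel", "submit", "form_input", "search_field",
                 "navigation", "settings", "close", "delete", "edit", "save")


def parse_classification(raw: str) -> Optional[dict]:
    """Return the first valid {"type", "role", "text"} object found in raw, else None."""
    decoder = json.JSONDecoder()
    start = raw.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start` and ignores what follows.
            obj, _ = decoder.raw_decode(raw, start)
        except json.JSONDecodeError:
            obj = None
        if (isinstance(obj, dict)
                and obj.get("type") in ALLOWED_TYPES
                and obj.get("role") in ALLOWED_ROLES):
            return {"type": obj["type"], "role": obj["role"], "text": str(obj.get("text", ""))}
        # Not a usable object: try the next opening brace.
        start = raw.find("{", start + 1)
    return None


def classify_with_retry(generate: Callable[[], str], attempts: int = 2) -> Optional[dict]:
    """Call generate() up to `attempts` times until a valid classification is parsed."""
    for _ in range(attempts):
        parsed = parse_classification(generate() or "")
        if parsed is not None:
            return parsed
    return None
```

With `attempts=2` this amounts to one retry, matching the "retry once" comment in the first hunk. Validating against the lists from the prompt keeps an out-of-vocabulary answer from propagating downstream.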