feat(evaluation): add local Ollama LeaBench adapter

Dom
2026-05-24 21:58:06 +02:00
parent 6544ebe3f0
commit debd7b423c
4 changed files with 498 additions and 0 deletions

@@ -59,6 +59,16 @@ python3 tools/lea_bench.py \
--json
```
Generate predictions with a local Ollama instance:
```bash
python3 tools/lea_bench_ollama.py \
--cases benchmarks/computer_use/cases/notepad_replay_failures_2026-05-24.jsonl \
--repo-root . \
--model qwen2.5vl:7b-rpa \
--output benchmarks/computer_use/predictions/qwen25vl_notepad.jsonl
```
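As a rough illustration of the kind of work such an adapter does, the sketch below reads a JSONL file, builds a request body for Ollama's local `/api/generate` endpoint, and writes rows back out as JSONL. It is not the contents of `tools/lea_bench_ollama.py`: the helper names (`build_ollama_payload`, `read_jsonl`, `write_jsonl`) are hypothetical, and the case and prediction schemas are not shown in this diff, so rows are treated as opaque dictionaries.

```python
import base64
import json
from pathlib import Path
from typing import Optional


def build_ollama_payload(model: str, prompt: str,
                         image_bytes: Optional[bytes] = None) -> dict:
    """Build a JSON body for POST http://localhost:11434/api/generate.

    Hypothetical helper: only the request fields (model, prompt, images,
    stream) come from Ollama's API; everything else is an assumption.
    """
    payload = {"model": model, "prompt": prompt, "stream": False}
    if image_bytes is not None:
        # Ollama expects images as base64-encoded strings.
        payload["images"] = [base64.b64encode(image_bytes).decode("ascii")]
    return payload


def read_jsonl(path: Path) -> list:
    """Load one JSON object per non-empty line."""
    with path.open(encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


def write_jsonl(path: Path, rows: list) -> None:
    """Write one JSON object per line, creating parent directories."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", encoding="utf-8") as handle:
        for row in rows:
            handle.write(json.dumps(row, ensure_ascii=False) + "\n")
```

The payload would then be sent with any HTTP client, and the model's text read from the `response` field of the JSON reply. Setting `stream` to `False` keeps the reply as a single JSON object, which is simpler to turn into one prediction row per case.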
## Strategic role
This benchmark avoids choosing a model based on impressions. We measure: