Dom
f2e9aac6b7
docs: add POC specs, handoffs, and research notes
2026-06-02 16:28:34 +02:00
Dom
18ed6cb751
feat(vwb): add dashboard competence testing and health tools
2026-06-02 16:27:19 +02:00
Dom
d38f0b0f2f
feat(agent): add learn action flow and grounding guards
2026-06-02 16:24:10 +02:00
Dom
86b3c8f7e7
feat(p1): persist workflows and semantic learning artifacts
2026-06-02 16:20:38 +02:00
Dom
7a1a5cb6fd
fix(p0): secure agent revocation and R6 worker queue
2026-06-02 15:52:35 +02:00
Dom
2dd306724c
docs(coordination): report no-cli competence test patch
2026-06-01 12:10:01 +02:00
Dom
335d576830
feat(dashboard): launch supervised competence tests
2026-06-01 12:09:09 +02:00
Dom
1a58a0d1f1
docs(coordination): sync dgx no-cli phase1 gaps
2026-06-01 11:59:27 +02:00
Dom
eb2df539f1
docs(poc): revise dgx spark dsi prerequisites docx
2026-06-01 11:04:16 +02:00
Dom
c9f848273b
docs(poc): add minimal dgx spark dsi prerequisites
2026-06-01 10:45:46 +02:00
Dom
45ec5fe969
docs(coordination): answer c gamma clarifications
2026-06-01 10:40:53 +02:00
Dom
8b6c397531
docs(poc): share dgx spark readiness context
2026-06-01 10:37:00 +02:00
Dom
6a300a4298
docs(coordination): add dgx spark multi-workstation poc focus
2026-06-01 10:14:27 +02:00
Dom
0587036c17
docs(coordination): dispatch dgx spark poc readiness
2026-06-01 10:05:12 +02:00
Dom
f2a9e40502
docs(coordination): report c gamma dashboard promotion
2026-05-29 21:49:36 +02:00
Dom
34527b5cc5
feat(lea): add dashboard competence promotion dry run
2026-05-29 21:48:00 +02:00
Dom
bd3aaf7d64
docs(coordination): dispatch c gamma dashboard work
2026-05-29 19:04:58 +02:00
Dom
05a30f2d1d
docs(coordination): propose c gamma writeback decisions
2026-05-29 18:58:12 +02:00
Dom
47377226f2
feat(vwb): harden supervised verdict evidence
2026-05-29 18:54:54 +02:00
Dom
d515b22d1b
docs(coordination): report c beta supervision
2026-05-29 18:40:03 +02:00
Dom
aba849324a
feat(vwb): log supervised competence verdicts
2026-05-29 18:36:06 +02:00
Dom
7ad260d02f
docs(coordination): report c alpha preview
2026-05-29 18:15:30 +02:00
Dom
794a248dae
feat(vwb): preview lea competence workflows
2026-05-29 18:13:36 +02:00
Dom
8332b2cd37
docs(coordination): delegate yaml vwb supervision patch
2026-05-29 17:54:10 +02:00
Dom
9a45e61e2a
docs(coordination): report wait for state runtime
2026-05-29 17:26:35 +02:00
Dom
e66bc6d452
feat(vwb): execute wait for state
2026-05-29 17:22:35 +02:00
Dom
7b1f30af1a
fix(vwb): preserve static palette tools
2026-05-29 17:16:24 +02:00
Dom
488d14240a
docs(coordination): report vwb catalog patch
2026-05-29 17:11:02 +02:00
Dom
45b6da5e3f
feat(vwb): load palette from catalog
2026-05-29 17:09:47 +02:00
Dom
02211fddf2
docs(coordination): answer lea vwb mapping questions
2026-05-29 16:30:11 +02:00
Dom
ed36bc2b37
docs(coordination): share reflex vwb supervision findings
2026-05-29 14:33:57 +02:00
Dom
9677738f32
docs(coordination): request global review after vwb feedback
2026-05-29 14:05:40 +02:00
Dom
d422aa119c
docs(coordination): require claude qwen vision guardrails
2026-05-29 13:59:39 +02:00
Dom
7b943926db
docs(coordination): clarify vwb learning bridge
2026-05-29 13:46:22 +02:00
Dom
99f89317cb
feat(lea): substitute save menu gesture
2026-05-29 13:45:44 +02:00
Dom
6b8114eb97
docs(coordination): reframe lea direct competence flow
2026-05-29 13:41:18 +02:00
Dom
7ef98d8089
feat(lea): expose competence replay api
2026-05-29 13:40:15 +02:00
Dom
8ea4ed0ad2
docs(coordination): record supervised competence replay plan
2026-05-29 11:38:51 +02:00
Dom
a49f59b4d6
feat(competences): plan supervised replay tests
2026-05-29 11:38:12 +02:00
Dom
762e75a077
docs(coordination): record competence catalog integration
2026-05-29 11:29:18 +02:00
Dom
c1a144c673
feat(vwb): expose competence yaml catalog
2026-05-29 11:28:25 +02:00
Dom
e8a0fb0e42
feat(competences): extract batch candidates
2026-05-29 11:25:00 +02:00
Dom
4ba426c205
fix(replay): guard single in-flight dispatch
...
Add a private in-flight helper for replay dispatch, block machine retargeting while an action is still pending on the previous session, and warn on duplicate in-flight entries for the same replay triplet.
Freeze the Notepad runtime dialog success path and add integration coverage for single in-flight dispatch, watchdog late-report documentation, and the known concurrent-poll race as an xfail.
2026-05-25 11:00:59 +02:00
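The single in-flight rule described in the commit above could look roughly like the sketch below. This is an illustration only, not the code from the commit: the `InFlightGuard` class, its method names, the logger name, and the shape of the replay triplet (`replay_id`, `session_id`, `action_id`) are all assumptions, since the message does not spell them out.

```python
import logging
import threading
from typing import Dict, Tuple

logger = logging.getLogger("replay.inflight")  # hypothetical logger name

# Hypothetical triplet layout: (replay_id, session_id, action_id).
Triplet = Tuple[str, str, str]


class InFlightGuard:
    """Allow at most one pending action per replay (illustrative sketch)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._pending: Dict[str, Triplet] = {}  # replay_id -> pending triplet

    def try_dispatch(self, replay_id: str, session_id: str, action_id: str) -> bool:
        """Register an action as in flight; refuse if one is already pending."""
        triplet = (replay_id, session_id, action_id)
        with self._lock:
            current = self._pending.get(replay_id)
            if current is None:
                self._pending[replay_id] = triplet
                return True
            if current == triplet:
                # Same triplet dispatched twice: warn instead of re-sending.
                logger.warning("duplicate in-flight entry for %s", triplet)
            # A different session or action is also refused, which blocks
            # retargeting while the previous session still has a pending action.
            return False

    def complete(self, replay_id: str, session_id: str, action_id: str) -> bool:
        """Clear the pending entry once the matching action has reported."""
        with self._lock:
            if self._pending.get(replay_id) == (replay_id, session_id, action_id):
                del self._pending[replay_id]
                return True
            return False
```

Under these assumptions, a second `try_dispatch` for the same replay returns `False` until `complete` is called with the matching triplet. The check and the registration happen under one lock, so two concurrent callers cannot both dispatch.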
Dom
7bb8d543ab
feat(cognition): dataclasses Trace + SceneExpected + Precondition (Phase 2.1)
...
Create the 3 dataclasses of the Mandate/Protocols/Scenes v0.3 model in
core/cognition/, standalone (no runtime wiring), with
explicit JSON serialization and offline tests.
Groundwork for the following phases:
- Phase 2.1 plan: Trace object (mandate_id, intention_id, scene_id,
affordance_signature, expected_retour, level_of_delegation)
- Workpack A: SceneExpected (monitor_index, app_name, title_patterns,
title_anti, window_rect_hint, scene_role, accepted_transitions,
stability_ms) + matches_title() helper
- Workpack B: Precondition (kind, window_title_must_contain/anti,
critic_question, verify_timeout_ms) + PreconditionRecovery
(max_attempts, on_recovery_fail, actions)
All dataclasses are frozen (immutable), with tolerant to_dict/from_dict
(empty/None fields -> empty instance). Validation in __post_init__
for Precondition.kind and PreconditionRecovery.on_recovery_fail.
No mandatory runtime dependency: if the object is not attached to
an action, the current behavior is kept as the fallback. No changes to executor /
api_stream / replay_engine / grounding.
Tests: 22/22 pass (JSON serialization, tolerant from_dict contracts,
kind validation, matches_title/check_title helpers, anti-intention).
Rollback tag: rollback/pre-cognition-dataclasses-2026-05-25_0610
2026-05-25 06:08:18 +02:00
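As a rough illustration of the pattern this commit describes (a frozen dataclass with tolerant to_dict/from_dict and validation in `__post_init__`), here is a minimal sketch of a `Precondition`-like class. The field names come from the commit message. The field types, the default values, the `ALLOWED_KINDS` set, and the case-insensitive matching in `check_title` are assumptions, not the contents of core/cognition/.

```python
from __future__ import annotations

import json
from dataclasses import dataclass
from typing import Any, Mapping, Optional, Tuple

# Hypothetical set of accepted kinds; the commit does not list the real values.
ALLOWED_KINDS = ("", "window_title", "critic")


@dataclass(frozen=True)
class Precondition:
    kind: str = ""
    window_title_must_contain: Tuple[str, ...] = ()  # assumed type
    window_title_anti: Tuple[str, ...] = ()          # assumed type
    critic_question: str = ""
    verify_timeout_ms: int = 0

    def __post_init__(self) -> None:
        # Validate at construction time so an invalid kind never circulates.
        if self.kind not in ALLOWED_KINDS:
            raise ValueError(f"unknown precondition kind: {self.kind!r}")

    def to_dict(self) -> dict:
        # Tuples become lists so the result is plain JSON data.
        return {
            "kind": self.kind,
            "window_title_must_contain": list(self.window_title_must_contain),
            "window_title_anti": list(self.window_title_anti),
            "critic_question": self.critic_question,
            "verify_timeout_ms": self.verify_timeout_ms,
        }

    @classmethod
    def from_dict(cls, data: Optional[Mapping[str, Any]]) -> "Precondition":
        # Tolerant: None, {} or missing/None fields fall back to the defaults.
        data = data or {}
        return cls(
            kind=data.get("kind") or "",
            window_title_must_contain=tuple(data.get("window_title_must_contain") or ()),
            window_title_anti=tuple(data.get("window_title_anti") or ()),
            critic_question=data.get("critic_question") or "",
            verify_timeout_ms=int(data.get("verify_timeout_ms") or 0),
        )

    def check_title(self, title: str) -> bool:
        # Assumed semantics: every required fragment present, no anti fragment.
        lowered = title.lower()
        if any(bad.lower() in lowered for bad in self.window_title_anti):
            return False
        return all(req.lower() in lowered for req in self.window_title_must_contain)


# Round trip through JSON yields an equal instance.
p = Precondition(kind="window_title", window_title_must_contain=("Notepad",))
restored = Precondition.from_dict(json.loads(json.dumps(p.to_dict())))
```

Using tuples rather than lists for the title fragments keeps the frozen instance genuinely immutable, while the list conversion in `to_dict` keeps the serialized form JSON-friendly.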
Dom
debd7b423c
feat(evaluation): add local Ollama LeaBench adapter
2026-05-24 21:58:06 +02:00
Dom
6544ebe3f0
feat(evaluation): add 16 LeaBench cases from replay failures
...
Extend LeaBench computer-use coverage with cases mined from
data/training/replay_failures/. Adds 8 distinct categories:
save_as visible, target absent (blank desktop / wrong window),
start button, start-menu search, task-view wrong state, systray
overflow, ambiguous tab labels, modal-blocker dialogs, and a
wrong-window Lea-terminal case.
- 16 new cases in benchmarks/computer_use/cases/leabench_extended_2026-05-24.jsonl
- 0 duplicate case_id vs notepad_replay_failures_2026-05-24.jsonl
- Validated with: python3 tools/lea_bench.py --cases ... --json
- pytest tests/unit/test_computer_use_bench.py: 7 passed
2026-05-24 21:57:24 +02:00
Dom
10136f0ee0
feat(agent): add standalone anchor-relative resolver
2026-05-24 21:54:39 +02:00
Dom
054279feb4
feat(evaluation): add LeaBench model prompt packs
2026-05-24 21:53:24 +02:00
Dom
ea1f57afb1
feat(evaluation): add LeaBench computer-use scorer
2026-05-24 21:21:17 +02:00
Dom
345762330b
fix(agent): respect server visual reject before text fallback
2026-05-24 21:10:42 +02:00