chore(coordination+docs): mandate the watcher in AGENTS.md, refocus CLAUDE.md on the POC, expand the debt register, harden the loop script
Some checks failed
tests / Lint (ruff + black) (push) Failing after 1m49s
tests / Tests unitaires (sans GPU) (push) Failing after 1m53s
tests / Tests sécurité (critique) (push) Has been skipped

Dom
2026-07-02 13:07:34 +02:00
parent 882e4e1f3a
commit 4cb173a8ec
6 changed files with 641 additions and 44 deletions


@@ -10,3 +10,15 @@ Rules:
- If graphify-out/wiki/index.md exists, use it for broad navigation instead of raw source browsing.
- Read graphify-out/GRAPH_REPORT.md only for broad architecture review or when query/path/explain do not surface enough context.
- After modifying code, run `graphify update .` to keep the graph current (AST-only, no API cost).
## coordination watcher
At the beginning of every session, the coordination watcher is mandatory and must be operational for Codex, Claude, and Qwen before coordination work continues.
Session-start checklist:
- Run `docs/coordination/coordination_loop.sh ensure`.
- Read every pending message relevant to the current agent.
- After messages are processed, run `docs/coordination/coordination_loop.sh ack`.
- If the watcher cannot be started or checked, report that blocker immediately in the handoff/status response.
Every new handoff or restart prompt must include this watcher requirement by default.
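The checklist above could be wrapped in a small helper so that a watcher failure surfaces as an explicit blocker. This is a minimal sketch, not part of the commit: the function name `session_start_check` and the `COORD_LOOP_BIN` override are assumptions introduced for illustration, and the check relies on the `Coordination loop: actif` line that the script's `status` command prints when the loop is running.

```shell
#!/usr/bin/env bash
# Hypothetical session-start helper (illustration only, not part of the commit).
# COORD_LOOP_BIN is an assumed override; the real script does not read it.
COORD_LOOP_BIN="${COORD_LOOP_BIN:-docs/coordination/coordination_loop.sh}"

session_start_check() {
  local status_out
  # Step 1: start the watcher if needed, scan once, print pending messages.
  if ! "$COORD_LOOP_BIN" ensure; then
    printf 'BLOCKER: coordination watcher could not be started\n' >&2
    return 1
  fi
  # Step 2: confirm the loop is alive before coordination work continues.
  status_out="$("$COORD_LOOP_BIN" status)" || status_out=""
  if [[ "$status_out" != *"Coordination loop: actif"* ]]; then
    printf 'BLOCKER: coordination watcher is not running\n' >&2
    return 1
  fi
  return 0
}
```

An agent would call `session_start_check`, read the pending messages, and only then run `docs/coordination/coordination_loop.sh ack`.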


@@ -10,7 +10,9 @@ You are not autonomous. Dom validates before each step. You propose, he decid
## Top priority
**The Urgence_aiva_demo demo must work.** Workflow of 22+ steps on Easily Assure, patient MOREL Catherine, mixed audience of DG/DSI/physicians/DIM/TIM. Every technical trade-off is settled by asking: "does this bring us closer to, or further from, a running demo?"
**The Wallerstein clinical POC must run.** 5 Léa workstations live; the TIM work on their **real business software in web mode** (browser embedded in the software / the PC's browser, **RDP** and **Citrix** instances), on **2 screens** → capture of the **active window**. Product goal: Léa **learns** these workflows and **replays them intelligently** (not record-and-replay). Every technical trade-off is settled by asking: "does this bring us closer to, or further from, a running clinical POC?"
> History: `Urgence_aiva_demo` (22+ steps) on the **Easily Assure mock-up** (fictional patient MOREL Catherine) was the demo/test bench; the **mock-up has been abandoned as a target** (refocus by Dom, 2026-06-25). Stop reasoning in terms of "Easily".
## Mandatory method, non-negotiable


@@ -38,6 +38,8 @@ P0 / P1 / P2 / P3 (aligned with the handoff convention)
| DETTE-020 | 2026-06-25 | 2026-07-09 | P1 | OPEN | **Silent incidents: no detection or alerting for critical inference components.** A critical component can go down without any alert: `rpa-vllm-grounder.service` (Qwen3-VL/vLLM grounder) was found in a **crash loop (auto-restart, restart counter ×3960)** → the runtime **silently** fell back to `qwen2.5vl:7b-rpa` (Ollama, ~7× slower), with higher latency/contention but **nothing surfaced visibly** (neither dashboard nor alert log). Discovered only through a manual runtime check (session 2026-06-25). The cause of THIS crash (HuggingFace SSL at boot vs. local cache, missing `HF_HUB_OFFLINE`) is fixed separately; the debt here is that **degraded mode is silent**. Target: health check + supervision of the critical components (vLLM grounder, Ollama, `rpa-*` services) with **VISIBLE reporting** (dashboard 5001 / alert log / notification) → a switch to degraded mode must never go unnoticed. ⚠️ Check what already exists first (monitoring module `:5003`) before building. | clinical DGX runtime check session 2026-06-25 |
| DETTE-021 | 2026-06-25 | 2026-07-09 | P1 | OPEN | **Léa client logging is not effective.** `LOG_FILE` (`agent_v0/agent_v1/config.py:88` → `<install>/logs/agent_v1.log`) is defined but **never wired up**: there is no `FileHandler`/`addHandler` anywhere in the client. The only active logging is `basicConfig` (`main.py:46`) → **stderr**, which is lost because Léa runs under `pythonw.exe` (no console). The `logs/` folder is empty. Consequences: (1) **blind field diagnosis**, since it is impossible to trace why Léa "disappears" on a workstation; (2) **non-compliance with EU AI Act Art. 12** (logging + 180-day retention, cited in the code but not effective; `LOG_RETENTION_DAYS` only covers *sessions*). Target: attach a `RotatingFileHandler`/`TimedRotating` to `LOG_FILE` (rotation + 180-day purge, INFO level). ⚠️ client change → **redeployment** (see DETTE-022). Client-side counterpart of DETTE-020 (server observability). | diagnostic session on Léa "disappearing" on Émilie's workstation, 2026-06-25 |
| DETTE-022 | 2026-06-25 | 2026-07-09 | P1 | OPEN | **No automatic update of the Léa client.** Any change to the client (`agent_v0/agent_v1/**`) requires a **manual, workstation-by-workstation redeployment** (Léa is "frozen"). In the clinic (5 workstations and growing), touching every workstation for each fix (e.g. the DETTE-021 logging fix) **disturbs the TIM and discourages adoption** (Dom's observation). Target: an **automatic / background update** mechanism (silent auto-update, versioned, driven from the server/dashboard, with rollback), with **zero intervention on the workstation**. ⚠️ Check what already exists first on the Fleet enrollment side (dashboard ZIP build + token) before building. | Dom's decision 2026-06-25 ("we cannot keep intervening on the workstations, we will discourage people") |
| DETTE-023 | 2026-06-30 | 2026-07-14 | P1 | OPEN | **Systematic post-action validation is not wired into live replay.** `core/execution/action_executor.py` exposes `verify_postconditions=True` (+ re-check/retry, l.187-242) but the live runtime `replay_engine.py` **does not import `ActionExecutor`** (only `LLMActionHandler`, l.2497) → post-condition verification after EACH action is **written but not wired**. Live replay only validates at a **coarse grain**: a screen-similarity `precheck` before the action (≥ 0.85, replay_engine.py:2844) + `verify_screen` **between GROUPS** of actions (l.39), not after each click. Related to DETTE-008 (per-click VLM pre-check disabled with `if False:`, observe_reason_act.py:1704) and DETTE-001 (spatially blind OCR pre-check). **Product stake** (Dom's decision 2026-06-30: "vision = validator of actions AND of learning", for ZERO errors in record retrieval and for multi-VM/workstation scaling): densify visual validation at the critical points (login, opening a record, screen→JSON reading) **or** reconnect post-condition verification to live replay. ⚠️ Check what already exists first (`verify_screen`, `ActionExecutor`, ORALoop) before building. | session 30/06 runtime trace (replay_engine does not use ActionExecutor) + Dom's VM/vision decision 2026-06-30 |
| DETTE-024 | 2026-06-30 | 2026-07-14 | P1 | OPEN | **The fleet dashboard endpoint `/api/fleet/download/<machine_id>` serves a ZIP that is NOT self-contained.** Test on 30/06: the download returned a **210 KB** ZIP (without `python-3.12-embed`) instead of `Lea_full_v1.0.1.zip` (33 MB), even though that file had been placed in `deploy/build/` → the dashboard reads the **fallback** `deploy/Lea_v1.0.0.zip` (or a path relative to the cwd, see DETTE-015) and NOT the full archive. Consequence: a workstation enrolled through the dashboard would receive a **non-installable** exe (no embedded Python). Worked around manually for Émilie (local full ZIP + `config.txt` from the download + flag). Target: the download must serve the **up-to-date, self-contained full archive** (absolute path, no silent fallback). ⚠️ Blocking for relying on the dashboard for GPO/multi-workstation deployment. | exe delivery session for Émilie 2026-06-30 (web_dashboard/app.py:2379) |
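
The health check targeted by DETTE-020 could start from something as small as the sketch below. It is only an illustration under stated assumptions: the function names `unit_state`, `unit_restarts` and `check_units` and the restart threshold are hypothetical, it relies on the `NRestarts` property reported by `systemctl show`, and it does not replace the check of the existing monitoring module that the entry asks for first.

```shell
#!/usr/bin/env bash
# Hypothetical degraded-mode probe for DETTE-020 (illustration only).
# Thin wrappers around systemctl so they can be stubbed in tests.
unit_state() { systemctl is-active "$1" 2>/dev/null || true; }
unit_restarts() { systemctl show -p NRestarts --value "$1" 2>/dev/null || true; }

# check_units MAX_RESTARTS UNIT...
# Prints one ALERT line per unit that is not active or that has restarted
# more than MAX_RESTARTS times; returns 1 if any alert was printed.
check_units() {
  local max_restarts="$1"
  shift
  local unit state restarts rc=0
  for unit in "$@"; do
    state="$(unit_state "$unit")"
    restarts="$(unit_restarts "$unit")"
    # Treat a missing or non-numeric counter as zero restarts.
    [[ "$restarts" =~ ^[0-9]+$ ]] || restarts=0
    if [[ "$state" != "active" || "$restarts" -gt "$max_restarts" ]]; then
      printf 'ALERT %s state=%s restarts=%s\n' "$unit" "${state:-unknown}" "$restarts"
      rc=1
    fi
  done
  return "$rc"
}
```

For example, `check_units 5 rpa-vllm-grounder.service` would print an alert for a grounder stuck in a restart loop; where that alert is surfaced (dashboard, alert log, notification) is left open, as in the entry itself.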
## Referencing convention


@@ -27,6 +27,9 @@ Full specification for the implementation:
### Other Documents
- **`ROADMAP_RPA_100_VISION.md`** - Project vision and roadmap
- **`INSTALLATION_MULTI_SITE.md`** - Installation guide for POC/MVP/production and multi-site deployments
- **`PLAN_ACTION_SUITE_2026-06-23.md`** - Consolidated action plan after the clinic delivery (umbrella over the existing plans; central axis = intelligent replay of learned actions)
- **`PLAN_REMISE_AU_CARRE_APPRENTISSAGE_2026-06-27.md`** - Getting the learning/replay chain back in order: why it is not wired up (verified) + execution plan Phase 0 (measurement) → R1-R6, with the constraint "Léa correct before the last manual operation"
## 🎯 Where to Start?


@@ -61,6 +61,46 @@ test results.
The same rule applies in reverse if Claude initiates the request.
## Automatic monitoring
`coordination_loop.sh` watches the inboxes and creates a persistent trigger for
each new message detected.
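
Detection works by comparing, for each watched directory, a stored baseline listing with the current listing and keeping only the names that are new. The sketch below isolates that step; the helper name `new_entries` is hypothetical, while the use of `comm -13` on sorted, de-duplicated listings mirrors what the committed script does in `scan_once`.

```shell
#!/usr/bin/env bash
# Hypothetical helper isolating the "what is new?" step (illustration only).
# new_entries BASELINE_FILE CURRENT_FILE
# Prints the lines present in CURRENT_FILE but absent from BASELINE_FILE.
new_entries() {
  local baseline="$1" current="$2"
  # comm requires sorted input; -13 hides lines unique to the baseline (col 1)
  # and lines common to both (col 3), leaving only the new names.
  LC_ALL=C comm -13 \
    <(LC_ALL=C sort -u "$baseline") \
    <(LC_ALL=C sort -u "$current")
}
```

Files that disappear from a directory are ignored by this comparison, so only arrivals produce a trigger.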
This monitoring is mandatory at the start of every session for Codex,
Claude, and Qwen. No handoff may omit this pre-check.
Session-start pre-check:
1. `docs/coordination/coordination_loop.sh ensure`
2. Read the messages relevant to the current agent.
3. After processing: `docs/coordination/coordination_loop.sh ack`
If the watcher cannot be started or checked, that is a resume blocker and must
be reported explicitly.
Useful commands:
- `docs/coordination/coordination_loop.sh ensure`: starts the watcher if needed, scans, and shows pending messages.
- `docs/coordination/coordination_loop.sh start 15`: starts monitoring in the background with a 15-second interval.
- `docs/coordination/coordination_loop.sh service-install`: installs/updates and restarts the persistent systemd user watcher.
- `docs/coordination/coordination_loop.sh service-stop`: stops and disables the systemd user watcher.
- `docs/coordination/coordination_loop.sh status`: state, counters, and unread queue.
- `docs/coordination/coordination_loop.sh pending`: detected messages not yet ACKed locally.
- `docs/coordination/coordination_loop.sh ack`: empties the local unread queue.
- `docs/coordination/coordination_loop.sh events`: latest detected events.
Artifacts created:
- `.loop_state/unread_messages.tsv`: queue of messages to process.
- `.loop_state/unread_digest.md`: human-readable digest for the start of a session.
- `.loop_state/latest_message.trigger`: latest trigger.
- `.loop_state/message_events.tsv`: machine-readable event journal.
- `.loop_state/triggers/*.trigger`: one trigger file per message.
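
An agent can filter the unread queue down to its own messages. The sketch below assumes the four tab-separated columns the script appends to `unread_messages.tsv` (timestamp, directory, file name, file path); the helper name `pending_for_agent` is hypothetical, and treating the shared `active` directory as relevant to every agent is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical filter over the unread queue (illustration only).
# pending_for_agent AGENT [PENDING_FILE]
# Prints the path of each pending message addressed to inbox_<AGENT>,
# plus messages from the shared "active" directory (assumed relevant to all).
pending_for_agent() {
  local agent="$1"
  local pending_file="${2:-docs/coordination/.loop_state/unread_messages.tsv}"
  # An empty or missing queue simply yields no output.
  [[ -s "$pending_file" ]] || return 0
  awk -F'\t' -v inbox="inbox_${agent}" \
    '$2 == inbox || $2 == "active" { print $4 }' "$pending_file"
}
```

For example, `pending_for_agent claude` lists the files Claude should read before running `ack`.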
An external hook can be attached with `COORD_LOOP_TRIGGER_CMD`. The hook receives
`COORD_MESSAGE_DIR`, `COORD_MESSAGE_FILE`, `COORD_MESSAGE_PATH`,
`COORD_MESSAGE_STATUS` and `COORD_TRIGGER_FILE`.
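
A hook script might look like the sketch below. It is an illustration rather than a shipped file: the function name `coord_hook` and the `COORD_HOOK_LOG` variable are assumptions, whereas the `COORD_MESSAGE_*` variables are the ones exported by the script, which also exports `COORD_MESSAGE_TIMESTAMP`. The script runs the command through `bash -lc` in the background and appends its output to `coordination_loop.out`.

```shell
#!/usr/bin/env bash
# Hypothetical hook for COORD_LOOP_TRIGGER_CMD (illustration only).
# COORD_HOOK_LOG is an assumed variable, not one the watcher defines.
coord_hook() {
  local dest="${COORD_HOOK_LOG:-$HOME/coord_hook.log}"
  local agent="${COORD_MESSAGE_DIR:-unknown}"
  # Turn "inbox_claude" into "claude"; other directory names are kept as-is.
  agent="${agent#inbox_}"
  printf '%s\t%s\t%s\t%s\n' \
    "${COORD_MESSAGE_TIMESTAMP:-}" \
    "$agent" \
    "${COORD_MESSAGE_FILE:-}" \
    "${COORD_MESSAGE_STATUS:-}" >> "$dest"
}

# Only act when invoked by the watcher, which always sets COORD_TRIGGER_FILE.
if [[ -n "${COORD_TRIGGER_FILE:-}" ]]; then
  coord_hook
fi
```

Saved as an executable file, it would be enabled by exporting `COORD_LOOP_TRIGGER_CMD` with that file's path before starting the watcher.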
## Knowledge-capture rule
A coordination message is a stream. A summary or a register is a


@@ -1,54 +1,592 @@
#!/bin/bash
# Coordination inbox loop v3 - compares by file name
#!/usr/bin/env bash
# Coordination inbox loop v4.
#
# One-shot by default:
# docs/coordination/coordination_loop.sh once
#
# Long-running foreground loop:
# docs/coordination/coordination_loop.sh watch 15
#
# Background loop:
# docs/coordination/coordination_loop.sh start 15
#
# Trigger files:
# docs/coordination/.loop_state/unread_messages.tsv
# docs/coordination/.loop_state/unread_digest.md
# docs/coordination/.loop_state/latest_message.trigger
# docs/coordination/.loop_state/message_events.tsv
COORD_DIR="/home/dom/ai/rpa_vision_v3/docs/coordination"
LOG="/home/dom/ai/rpa_vision_v3/docs/coordination/.loop_log.txt"
TMP="/tmp/coord_loop"
mkdir -p "$TMP"
set -euo pipefail
NEW_FOUND=0
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
SCRIPT_PATH="${SCRIPT_DIR}/$(basename "${BASH_SOURCE[0]}")"
check_inbox() {
local inbox_name="$1"
local baseline_file="$TMP/baseline_${inbox_name}.txt"
local inbox_path="${COORD_DIR}/${inbox_name}"
local current_file="$TMP/current_${inbox_name}.txt"
COORD_DIR="${COORD_DIR:-$SCRIPT_DIR}"
LOG="${COORD_LOOP_LOG:-$COORD_DIR/.loop_log.txt}"
SUMMARY="${COORD_LOOP_BASELINE:-$COORD_DIR/.inbox_baseline.txt}"
STATE_DIR="${COORD_LOOP_STATE_DIR:-$COORD_DIR/.loop_state}"
PID_FILE="${COORD_LOOP_PID_FILE:-$STATE_DIR/coordination_loop.pid}"
OUT_FILE="${COORD_LOOP_OUT:-$STATE_DIR/coordination_loop.out}"
DEFAULT_INTERVAL="${COORD_LOOP_INTERVAL:-15}"
EVENTS_FILE="${COORD_LOOP_EVENTS_FILE:-$STATE_DIR/message_events.tsv}"
PENDING_FILE="${COORD_LOOP_PENDING_FILE:-$STATE_DIR/unread_messages.tsv}"
DIGEST_FILE="${COORD_LOOP_DIGEST_FILE:-$STATE_DIR/unread_digest.md}"
LATEST_TRIGGER="${COORD_LOOP_LATEST_TRIGGER:-$STATE_DIR/latest_message.trigger}"
TRIGGER_DIR="${COORD_LOOP_TRIGGER_DIR:-$STATE_DIR/triggers}"
TRIGGER_CMD="${COORD_LOOP_TRIGGER_CMD:-}"
DESKTOP_NOTIFY="${COORD_LOOP_DESKTOP_NOTIFY:-1}"
SYSTEMD_UNIT_NAME="${COORD_LOOP_SYSTEMD_UNIT:-rpa-coordination-watcher.service}"
ls "$inbox_path" 2>/dev/null | sort > "$current_file"
if [[ -n "${COORD_LOOP_DIRS:-}" ]]; then
# shellcheck disable=SC2206
WATCH_DIRS=($COORD_LOOP_DIRS)
else
WATCH_DIRS=(inbox_qwen inbox_codex inbox_claude active)
fi
if [ ! -f "$baseline_file" ]; then
cp "$current_file" "$baseline_file"
DRY_RUN=0
ARGS=()
for arg in "$@"; do
case "$arg" in
--dry-run) DRY_RUN=1 ;;
-h|--help) ARGS+=("help") ;;
*) ARGS+=("$arg") ;;
esac
done
set -- "${ARGS[@]}"
usage() {
cat <<EOF
Usage: $(basename "$0") [command] [interval_seconds] [--dry-run]
Commands:
once Scan once and update the persistent baseline (default).
watch Scan forever in the foreground.
start Start watch mode in the background.
ensure Start if needed, scan once, then show pending messages.
stop Stop the background loop.
status Show background loop status and current counters.
pending Show unread coordination messages detected by the loop.
ack Mark detected coordination messages as read locally.
events Show recent message trigger events.
service-install Install/update and restart the user systemd watcher service.
service-stop Stop and disable the user systemd watcher service.
service-status Show the user systemd watcher service status.
baseline Reset the persistent baseline to the current files.
tail Tail the loop log.
Environment:
COORD_LOOP_DIRS="inbox_qwen inbox_codex inbox_claude active"
COORD_LOOP_INTERVAL=15
COORD_LOOP_TRIGGER_CMD='command to run for each new message'
COORD_LOOP_DESKTOP_NOTIFY=1
EOF
}
ensure_state_dir() {
if [[ "$DRY_RUN" -eq 0 ]]; then
mkdir -p "$STATE_DIR"
fi
}
timestamp_human() {
date '+%Y-%m-%d %H:%M'
}
timestamp_file() {
date '+%Y-%m-%d_%H%M'
}
state_file_for() {
local dir_name="$1"
printf '%s/%s.files' "$STATE_DIR" "$dir_name"
}
current_file_for() {
local dir_name="$1"
printf '%s/%s.current' "$STATE_DIR" "$dir_name"
}
list_files() {
local dir_name="$1"
local dir_path="$COORD_DIR/$dir_name"
if [[ ! -d "$dir_path" ]]; then
return 0
fi
find "$dir_path" -maxdepth 1 -type f ! -name '.*' -printf '%f\n' | LC_ALL=C sort -u
}
summary_epoch() {
if [[ ! -f "$SUMMARY" ]]; then
return 1
fi
local ts
ts="$(awk -F: '$1 == "timestamp" {print $2}' "$SUMMARY" | tail -n 1)"
if [[ -z "$ts" ]]; then
return 1
fi
date -d "${ts/_/ }" '+%s' 2>/dev/null
}
bootstrap_baseline_from_summary() {
local dir_name="$1"
local baseline_file="$2"
local dir_path="$COORD_DIR/$dir_name"
local epoch
epoch="$(summary_epoch)" || return 1
[[ -d "$dir_path" ]] || return 1
find "$dir_path" -maxdepth 1 -type f ! -name '.*' -printf '%T@ %f\n' \
| awk -v cutoff="$epoch" '$1 <= cutoff {sub(/^[^ ]+ /, ""); print}' \
| LC_ALL=C sort -u > "$baseline_file"
}
count_files() {
local dir_name="$1"
list_files "$dir_name" | wc -l | tr -d ' '
}
extract_status() {
local file_path="$1"
grep -m1 -E '(^[[:space:]-]*`?Statut`?[[:space:]]*:|^\*\*Statut[^*]*\*\*[[:space:]]*:)' "$file_path" 2>/dev/null \
| sed 's/[[:space:]]*$//' || true
}
pending_count() {
if [[ -f "$PENDING_FILE" ]]; then
wc -l < "$PENDING_FILE" | tr -d ' '
else
printf '0'
fi
}
write_pending_digest() {
[[ "$DRY_RUN" -eq 1 ]] && return 0
ensure_state_dir
local count
count="$(pending_count)"
{
printf '# Coordination unread digest\n\n'
printf -- '- `Updated`: %s\n' "$(date --iso-8601=seconds)"
printf -- '- `Pending`: %s\n\n' "$count"
if [[ "$count" == "0" || ! -s "$PENDING_FILE" ]]; then
printf 'No pending coordination messages.\n'
return 0
fi
printf '## Pending messages\n\n'
while IFS=$'\t' read -r ts dir_name file_name file_path _rest; do
[[ -z "${file_path:-}" ]] && continue
printf -- '- `%s` `%s` `%s`\n' "$ts" "$dir_name" "$file_name"
printf ' - path: `%s`\n' "$file_path"
if [[ -f "$file_path" ]]; then
local title
local status_line
title="$(sed -n '1p' "$file_path" | sed 's/[[:space:]]*$//')"
status_line="$(extract_status "$file_path")"
[[ -n "$title" ]] && printf ' - title: %s\n' "$title"
[[ -n "$status_line" ]] && printf ' - status: %s\n' "$status_line"
fi
done < "$PENDING_FILE"
printf '\n## Commands\n\n'
printf -- '- Read pending: `docs/coordination/coordination_loop.sh pending`\n'
printf -- '- Ack after processing: `docs/coordination/coordination_loop.sh ack`\n'
} > "$DIGEST_FILE"
}
safe_fragment() {
printf '%s' "$1" | tr -c 'A-Za-z0-9._=-' '_' | cut -c 1-180
}
record_message_event() {
local dir_name="$1"
local dir_path="$2"
local file_name="$3"
local status_line="$4"
[[ "$DRY_RUN" -eq 1 ]] && return 0
mkdir -p "$TRIGGER_DIR"
local ts_iso
local ts_file
local safe_file
local file_path
local trigger_file
local status_clean
ts_iso="$(date --iso-8601=seconds)"
ts_file="$(date '+%Y%m%dT%H%M%S')"
safe_file="$(safe_fragment "$file_name")"
file_path="$dir_path/$file_name"
trigger_file="$TRIGGER_DIR/${ts_file}_${dir_name}_${safe_file}.trigger"
status_clean="${status_line//$'\t'/ }"
status_clean="${status_clean//$'\n'/ }"
{
printf 'timestamp=%s\n' "$ts_iso"
printf 'dir=%s\n' "$dir_name"
printf 'file=%s\n' "$file_name"
printf 'path=%s\n' "$file_path"
printf 'status=%s\n' "$status_clean"
} > "$trigger_file"
cp "$trigger_file" "$LATEST_TRIGGER"
printf '%s\t%s\t%s\t%s\t%s\n' "$ts_iso" "$dir_name" "$file_name" "$file_path" "$status_clean" >> "$EVENTS_FILE"
printf '%s\t%s\t%s\t%s\n' "$ts_iso" "$dir_name" "$file_name" "$file_path" >> "$PENDING_FILE"
write_pending_digest
if [[ "$DESKTOP_NOTIFY" == "1" ]] && command -v notify-send >/dev/null 2>&1; then
notify-send "Coordination: nouveau message" "${dir_name}/${file_name}" >/dev/null 2>&1 || true
fi
if [[ -n "$TRIGGER_CMD" ]]; then
(
export COORD_MESSAGE_TIMESTAMP="$ts_iso"
export COORD_MESSAGE_DIR="$dir_name"
export COORD_MESSAGE_FILE="$file_name"
export COORD_MESSAGE_PATH="$file_path"
export COORD_MESSAGE_STATUS="$status_clean"
export COORD_TRIGGER_FILE="$trigger_file"
bash -lc "$TRIGGER_CMD"
) >> "$OUT_FILE" 2>&1 || true &
fi
}
write_summary() {
local tmp_summary="$STATE_DIR/inbox_baseline.tmp"
if [[ "$DRY_RUN" -eq 1 ]]; then
for dir_name in "${WATCH_DIRS[@]}"; do
printf '%s:%s\n' "$dir_name" "$(count_files "$dir_name")"
done
printf 'timestamp:%s\n' "$(timestamp_file)"
return
fi
local new_files
new_files=$(grep -Fxvf "$baseline_file" "$current_file" 2>/dev/null)
if [ -n "$new_files" ]; then
NEW_FOUND=1
local count
count=$(echo "$new_files" | wc -l)
echo "[$(date '+%Y-%m-%d %H:%M')] 📥 ${inbox_name}: +${count} nouveau(x) message(s)" >> "$LOG"
echo "$new_files" | while read -r f; do
echo "$f" >> "$LOG"
local statut
statut=$(grep -m1 'Statut' "${inbox_path}/${f}" 2>/dev/null || echo "")
if [ -n "$statut" ]; then
echo " ${statut}" >> "$LOG"
fi
done
echo "" >> "$LOG"
fi
cp "$current_file" "$baseline_file"
: > "$tmp_summary"
for dir_name in "${WATCH_DIRS[@]}"; do
printf '%s:%s\n' "$dir_name" "$(count_files "$dir_name")" >> "$tmp_summary"
done
printf 'timestamp:%s\n' "$(timestamp_file)" >> "$tmp_summary"
mv "$tmp_summary" "$SUMMARY"
}
check_inbox "inbox_qwen"
check_inbox "inbox_codex"
check_inbox "inbox_claude"
log_line() {
local line="$1"
if [[ "$DRY_RUN" -eq 1 ]]; then
printf '%s\n' "$line"
else
printf '%s\n' "$line" >> "$LOG"
fi
}
if [ "$NEW_FOUND" -eq 1 ]; then
echo "📥 Nouveau message coordination détecté — voir $LOG"
else
echo "❤️ loop OK $(date '+%H:%M')"
fi
reset_baseline() {
ensure_state_dir
local scan_lock_fd
exec {scan_lock_fd}>"$STATE_DIR/scan.lock"
flock "$scan_lock_fd"
for dir_name in "${WATCH_DIRS[@]}"; do
list_files "$dir_name" > "$(state_file_for "$dir_name")"
done
write_summary
write_pending_digest
log_line "[$(timestamp_human)] coordination loop baseline reset"
flock -u "$scan_lock_fd"
exec {scan_lock_fd}>&-
printf 'Baseline coordination initialisee: %s\n' "$SUMMARY"
}
scan_once() {
ensure_state_dir
local scan_lock_fd
exec {scan_lock_fd}>"$STATE_DIR/scan.lock"
flock "$scan_lock_fd"
local new_found=0
local initialized=0
for dir_name in "${WATCH_DIRS[@]}"; do
local dir_path="$COORD_DIR/$dir_name"
local baseline_file
local current_file
local temp_baseline=0
baseline_file="$(state_file_for "$dir_name")"
current_file="$(current_file_for "$dir_name")"
if [[ ! -d "$dir_path" ]]; then
continue
fi
if [[ "$DRY_RUN" -eq 1 ]]; then
current_file="$(mktemp)"
list_files "$dir_name" > "$current_file"
else
list_files "$dir_name" > "$current_file"
fi
if [[ ! -f "$baseline_file" ]]; then
if [[ "$DRY_RUN" -eq 1 ]]; then
baseline_file="$(mktemp)"
temp_baseline=1
if ! bootstrap_baseline_from_summary "$dir_name" "$baseline_file"; then
initialized=1
cp "$current_file" "$baseline_file"
fi
else
if ! bootstrap_baseline_from_summary "$dir_name" "$baseline_file"; then
initialized=1
cp "$current_file" "$baseline_file"
fi
fi
fi
LC_ALL=C sort -u "$baseline_file" -o "$baseline_file"
local new_files
new_files="$(LC_ALL=C comm -13 "$baseline_file" "$current_file" || true)"
if [[ -n "$new_files" ]]; then
new_found=1
local count
count="$(printf '%s\n' "$new_files" | wc -l | tr -d ' ')"
log_line "[$(timestamp_human)] 📥 ${dir_name}: +${count} nouveau(x) message(s)"
while IFS= read -r file_name; do
[[ -z "$file_name" ]] && continue
log_line "$file_name"
local status_line
status_line="$(extract_status "$dir_path/$file_name")"
if [[ -n "$status_line" ]]; then
log_line " ${status_line}"
fi
record_message_event "$dir_name" "$dir_path" "$file_name" "$status_line"
done <<< "$new_files"
log_line ""
fi
if [[ "$DRY_RUN" -eq 0 ]]; then
cp "$current_file" "$baseline_file"
else
rm -f "$current_file"
fi
[[ "$temp_baseline" -eq 1 ]] && rm -f "$baseline_file"
done
write_summary
local rc=0
if [[ "$new_found" -eq 1 ]]; then
printf 'Nouveau message coordination detecte - voir %s\n' "$LOG"
rc=2
elif [[ "$initialized" -eq 1 ]]; then
printf 'Baseline coordination initialisee - aucun ancien message rejoue\n'
else
printf 'loop OK %s\n' "$(date '+%H:%M')"
fi
flock -u "$scan_lock_fd"
exec {scan_lock_fd}>&-
return "$rc"
}
watch_loop() {
local interval="${1:-$DEFAULT_INTERVAL}"
ensure_state_dir
printf '%s\n' "$$" > "$PID_FILE"
trap 'if [[ -f "'"$PID_FILE"'" ]] && [[ "$(cat "'"$PID_FILE"'")" == "'"$$"'" ]]; then rm -f "'"$PID_FILE"'"; fi' EXIT INT TERM
log_line "=== Coordination loop started $(timestamp_human), interval=${interval}s ==="
while true; do
scan_once || true
sleep "$interval"
done
}
is_running() {
[[ -f "$PID_FILE" ]] && kill -0 "$(cat "$PID_FILE")" 2>/dev/null
}
start_loop() {
local interval="${1:-$DEFAULT_INTERVAL}"
ensure_state_dir
if is_running; then
printf 'Coordination loop deja actif: pid=%s\n' "$(cat "$PID_FILE")"
return 0
fi
rm -f "$PID_FILE"
if command -v setsid >/dev/null 2>&1; then
setsid bash -c '
pid_file="$1"
script_path="$2"
interval="$3"
out_file="$4"
printf "%s\n" "$$" > "$pid_file"
exec "$script_path" watch "$interval" >> "$out_file" 2>&1 < /dev/null
' _ "$PID_FILE" "$SCRIPT_PATH" "$interval" "$OUT_FILE" &
else
nohup bash -c '
pid_file="$1"
script_path="$2"
interval="$3"
out_file="$4"
printf "%s\n" "$$" > "$pid_file"
exec "$script_path" watch "$interval" >> "$out_file" 2>&1 < /dev/null
' _ "$PID_FILE" "$SCRIPT_PATH" "$interval" "$OUT_FILE" >/dev/null 2>&1 &
fi
local launcher_pid=$!
local pid=""
for _ in 1 2 3 4 5; do
if [[ -f "$PID_FILE" ]]; then
pid="$(cat "$PID_FILE")"
break
fi
sleep 0.1
done
if [[ -z "$pid" ]]; then
pid="$launcher_pid"
printf '%s\n' "$pid" > "$PID_FILE"
fi
printf 'Coordination loop demarre: pid=%s interval=%ss\n' "$pid" "$interval"
printf 'Log: %s\n' "$LOG"
}
ensure_loop() {
local interval="${1:-$DEFAULT_INTERVAL}"
if ! is_running; then
start_loop "$interval"
fi
scan_once || true
show_status
show_pending
}
stop_loop() {
if command -v systemctl >/dev/null 2>&1 \
&& systemctl --user is-active --quiet "$SYSTEMD_UNIT_NAME" 2>/dev/null; then
systemctl --user stop "$SYSTEMD_UNIT_NAME" || true
rm -f "$PID_FILE"
printf 'Service watcher arrete: %s\n' "$SYSTEMD_UNIT_NAME"
return 0
fi
if ! is_running; then
printf 'Coordination loop inactif\n'
rm -f "$PID_FILE"
return 0
fi
local pid
pid="$(cat "$PID_FILE")"
kill "$pid"
rm -f "$PID_FILE"
printf 'Coordination loop arrete: pid=%s\n' "$pid"
}
show_status() {
if is_running; then
printf 'Coordination loop: actif pid=%s\n' "$(cat "$PID_FILE")"
else
printf 'Coordination loop: inactif\n'
fi
printf 'Dirs: %s\n' "${WATCH_DIRS[*]}"
for dir_name in "${WATCH_DIRS[@]}"; do
printf '%s:%s\n' "$dir_name" "$(count_files "$dir_name")"
done
[[ -f "$SUMMARY" ]] && printf 'Baseline: %s\n' "$SUMMARY"
[[ -f "$LOG" ]] && printf 'Log: %s\n' "$LOG"
printf 'Unread trigger queue: %s (%s pending)\n' "$PENDING_FILE" "$(pending_count)"
printf 'Unread digest: %s\n' "$DIGEST_FILE"
[[ -f "$LATEST_TRIGGER" ]] && printf 'Latest trigger: %s\n' "$LATEST_TRIGGER"
[[ -n "$TRIGGER_CMD" ]] && printf 'Trigger cmd: configured\n'
return 0
}
show_pending() {
if [[ ! -s "$PENDING_FILE" ]]; then
printf 'Aucun message coordination en attente dans %s\n' "$PENDING_FILE"
return 0
fi
cat "$PENDING_FILE"
}
ack_pending() {
ensure_state_dir
local scan_lock_fd
exec {scan_lock_fd}>"$STATE_DIR/scan.lock"
flock "$scan_lock_fd"
: > "$PENDING_FILE"
write_pending_digest
log_line "[$(timestamp_human)] unread coordination trigger queue acked"
flock -u "$scan_lock_fd"
exec {scan_lock_fd}>&-
printf 'Messages coordination marques lus localement: %s\n' "$PENDING_FILE"
}
show_events() {
if [[ ! -s "$EVENTS_FILE" ]]; then
printf 'Aucun evenement coordination dans %s\n' "$EVENTS_FILE"
return 0
fi
tail -n "${1:-40}" "$EVENTS_FILE"
}
install_user_service() {
local user_dir="${XDG_CONFIG_HOME:-$HOME/.config}/systemd/user"
local unit_path="$user_dir/$SYSTEMD_UNIT_NAME"
local template_path="$COORD_DIR/systemd/$SYSTEMD_UNIT_NAME"
if [[ ! -f "$template_path" ]]; then
printf 'Template systemd introuvable: %s\n' "$template_path" >&2
return 1
fi
mkdir -p "$user_dir"
install -m 0644 "$template_path" "$unit_path"
systemctl --user daemon-reload
systemctl --user enable "$SYSTEMD_UNIT_NAME"
systemctl --user restart "$SYSTEMD_UNIT_NAME"
printf 'Service watcher installe/mis a jour et redemarre: %s\n' "$unit_path"
systemctl --user --no-pager --full status "$SYSTEMD_UNIT_NAME" || true
}
stop_user_service() {
systemctl --user disable --now "$SYSTEMD_UNIT_NAME" || true
rm -f "$PID_FILE"
printf 'Service watcher desactive et arrete: %s\n' "$SYSTEMD_UNIT_NAME"
}
show_service_status() {
systemctl --user --no-pager --full status "$SYSTEMD_UNIT_NAME" || true
}
cmd="${1:-once}"
case "$cmd" in
once) scan_once ;;
watch) watch_loop "${2:-$DEFAULT_INTERVAL}" ;;
start) start_loop "${2:-$DEFAULT_INTERVAL}" ;;
ensure) ensure_loop "${2:-$DEFAULT_INTERVAL}" ;;
stop) stop_loop ;;
status) show_status ;;
pending) show_pending ;;
ack) ack_pending ;;
events) show_events "${2:-40}" ;;
service-install) install_user_service ;;
service-stop) stop_user_service ;;
service-status) show_service_status ;;
baseline) reset_baseline ;;
tail) tail -n "${2:-80}" "$LOG" ;;
help) usage ;;
*)
usage >&2
exit 64
;;
esac
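
One portability point about the argument handling in the new script: it enables `set -u` and later expands `"${ARGS[@]}"`, and on bash releases older than 4.4 expanding an empty array under `nounset` fails with an unbound-variable error, which would affect a bare invocation with no arguments. The sketch below shows a common guard; `parse_args` and `resolve_command` are hypothetical names that restate the script's parsing loop for illustration and are not a patch from this commit.

```shell
#!/usr/bin/env bash
# Hypothetical restatement of the script's argument parsing (illustration only).
parse_args() {
  DRY_RUN=0
  ARGS=()
  local arg
  for arg in "$@"; do
    case "$arg" in
      --dry-run) DRY_RUN=1 ;;
      -h|--help) ARGS+=("help") ;;
      *) ARGS+=("$arg") ;;
    esac
  done
}

# resolve_command ARGS...
# Prints the effective command (default "once") and the dry-run flag.
resolve_command() {
  parse_args "$@"
  # ${ARGS[@]+...} expands to nothing when ARGS is empty, so this stays
  # safe under `set -u` even on bash versions older than 4.4.
  set -- ${ARGS[@]+"${ARGS[@]}"}
  printf '%s dry_run=%s\n' "${1:-once}" "$DRY_RUN"
}
```

On bash 4.4 and later the plain `"${ARGS[@]}"` form behaves the same, so the guard only matters where older shells are still in use.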