KAHN Scope: per-agent dashboard ships
Headline
/#/agents is no longer a row-per-run table. It’s a row-per-agent dashboard — busiest first, outcome histogram per row, p50/p95 duration, mean convergence, sparkline trend, last-seen relative timestamp. The operator’s eye lands on the right thing in one render.
What changed
The Phase A surface answered “what runs ran?” The Phase B dashboard answers questions Phase A couldn’t:
- Which agent is busiest? Rows ordered by run_count descending.
- Which agent is failing? Outcome histogram (5-segment stacked bar, scaled across rows so cross-row comparison is honest).
- How fast are agents converging? p50/p95 duration columns; linear-interpolation percentiles, byte-equivalent across FS+Postgres backends.
- Is convergence trending up? Sparkline column with fixed [0,1] y-axis (cross-row comparable), endpoint dot coloured by the convergence-badge bucket of the latest score.
- When was this agent last active? Relative-time last_seen.
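For readers unfamiliar with linear-interpolation percentiles, here is a minimal sketch of one common definition (the one NumPy uses by default, and the one Postgres percentile_cont follows). This post does not say which variant the dashboard uses, so treat the formula, the function name percentile, and the sample durations as assumptions for illustration, not the shipped implementation.

```python
import math
from typing import Sequence


def percentile(values: Sequence[float], p: float) -> float:
    """Linear-interpolation percentile of `values` for p in [0, 100].

    Hypothetical helper: the rank is p/100 * (n - 1) over the sorted data,
    and the result interpolates between the two neighbouring samples.
    """
    if not values:
        raise ValueError("percentile of an empty sequence is undefined")
    if not 0.0 <= p <= 100.0:
        raise ValueError("p must be between 0 and 100")

    ordered = sorted(values)
    rank = (p / 100.0) * (len(ordered) - 1)  # fractional index into ordered
    lo = math.floor(rank)
    hi = math.ceil(rank)
    frac = rank - lo  # how far between the two neighbours the rank falls
    return ordered[lo] + frac * (ordered[hi] - ordered[lo])


# Made-up run durations in seconds for a single agent.
durations = [10.0, 20.0, 30.0, 40.0]
p50 = percentile(durations, 50)
p95 = percentile(durations, 95)
```

Under this definition every backend that sorts the same durations and applies the same arithmetic arrives at the same p50 and p95, which is the property a cross-backend equivalence check relies on.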
The window selector
24h | 7d | all toggles the rollup window. Unknown values are rejected with HTTP 400 against a documented allow-list, so operators cannot typo new windows into existence.
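The allow-list check can be sketched as below. Only the three window values and the 400 status come from this post; the function name validate_window, the (status, body) return shape, and the error payload that echoes the allowed values are assumptions made for illustration.

```python
from typing import Any, Dict, Tuple

# The three windows named in this post; anything else is rejected.
ALLOWED_WINDOWS = ("24h", "7d", "all")


def validate_window(raw: str) -> Tuple[int, Dict[str, Any]]:
    """Return an (HTTP status, body) pair for a requested rollup window.

    Hypothetical helper: the body shapes are illustrative, not the real
    wire format of the aggregate endpoint.
    """
    if raw in ALLOWED_WINDOWS:
        return 200, {"window": raw}
    # Exact-match only: no trimming, case-folding, or duration parsing,
    # so a typo such as "30d" can never become a new window.
    return 400, {"error": "unknown window", "allowed": list(ALLOWED_WINDOWS)}


ok_status, ok_body = validate_window("7d")
bad_status, bad_body = validate_window("30d")
```

Matching against a fixed tuple rather than parsing a duration string is what keeps the set of windows closed.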
Why this matters for pilots
The kahn.host external-pilot conversation through Phase A was “we can show you a list of CI runs with agent labels.” Phase B converts it into “we can show you which of your agents is currently degrading, before your customers tell you.” That’s the kahn.tools pitch made concrete.
What’s next
The dashboard is the substrate for two follow-on surfaces: cross-run audit flakiness (/#/audits, also shipped Phase B) and a future per-agent reliability heatmap (north-star Phase C deliverable). The aggregation endpoint serving the dashboard — /api/agent-runs/aggregate — is wire-shape-pinned in kahn-hq/docs/spec/agent-runs-aggregate.md and ready for SDK consumption.
The landing-page rewrite at kahn.tools can now anchor on real screenshots of the dashboard, not placeholder copy. Specifically: a screenshot of the audit-runner row with its 6-event rollup (3 audits × pass+fail) is in tree as a check-in fixture and renders deterministically.