KAHN Cloud: per-tenant /api/self ships
Headline
/api/self/tenant is a new auth-scoped endpoint that answers two operator questions the public /api/self cannot: “How many agent runs does my tenant have in storage?” and “Is my producer being rate-limited?”
What changed
The trust boundary is now explicit:
GET /api/self (public, no auth). Process-global liveness signal: uptime, last-event ages, archived-runs count. Consumed by runbook smoke probes that intentionally curl without a JWT (post-deploy uptime checks, external monitors).
GET /api/self/tenant (auth-scoped: airlock JWT in cloud, localhost sentinel in OSS). Per-tenant data: agent_archived_runs (real count, not a placeholder) and a rate_limit snapshot.
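The split can be illustrated from the client side. The sketch below only builds the two requests and does not send them. The host name is hypothetical, and the Bearer scheme for the Authorization header is an assumption: the text above says only that the tenant endpoint expects an airlock JWT.

```python
import urllib.request

# Hypothetical host, used for illustration only.
BASE_URL = "https://kahn.example.com"


def build_self_request(base_url: str = BASE_URL) -> urllib.request.Request:
    """Public liveness probe: deliberately carries no credentials."""
    return urllib.request.Request(f"{base_url}/api/self", method="GET")


def build_tenant_request(jwt: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Tenant-scoped probe.

    Assumption: the JWT travels as a Bearer token in the Authorization
    header. The actual header scheme may differ.
    """
    if not jwt:
        raise ValueError("a JWT is required for /api/self/tenant")
    return urllib.request.Request(
        f"{base_url}/api/self/tenant",
        headers={"Authorization": f"Bearer {jwt}"},
        method="GET",
    )
```

Passing either request to urllib.request.urlopen() would perform the call. A smoke probe would use only the first builder, which is why it keeps working without any token provisioning.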
The placeholder agent_archived_runs: 0 on the public endpoint is gone. It was load-bearing dead weight that no SPA surface rendered.
The rate-limit snapshot
rate_limit carries:
capacity, refill_rate_per_sec: the bucket configuration
remaining_tokens: current bucket level (continuous-refill float)
recent_throttled_count: number of HTTP 429s this tenant received in the last 5 minutes
window_s: width, in seconds, of the throttled-event ring buffer
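For a consumer, the fields above map naturally onto a small typed record. The following is a minimal sketch, assuming the endpoint returns JSON with rate_limit as a nested object. The sample payload and all of its values are invented for illustration, and the numeric types are guesses based on the field descriptions.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class RateLimitSnapshot:
    capacity: float
    refill_rate_per_sec: float
    remaining_tokens: float       # continuous-refill level, so a float
    recent_throttled_count: int   # HTTP 429s seen inside the window
    window_s: float               # window width in seconds

    @property
    def throttled_recently(self) -> bool:
        """True if at least one 429 was recorded inside the window."""
        return self.recent_throttled_count > 0


def parse_rate_limit(body: str) -> RateLimitSnapshot:
    """Extract the rate_limit object from a response body (assumed JSON)."""
    rl = json.loads(body)["rate_limit"]
    return RateLimitSnapshot(
        capacity=float(rl["capacity"]),
        refill_rate_per_sec=float(rl["refill_rate_per_sec"]),
        remaining_tokens=float(rl["remaining_tokens"]),
        recent_throttled_count=int(rl["recent_throttled_count"]),
        window_s=float(rl["window_s"]),
    )


# Illustrative payload only; the values are made up.
SAMPLE = json.dumps({
    "agent_archived_runs": 42,
    "rate_limit": {
        "capacity": 100,
        "refill_rate_per_sec": 10,
        "remaining_tokens": 87.5,
        "recent_throttled_count": 0,
        "window_s": 300,
    },
})
```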
Why this matters
A producer asking “is my emitter throttled?” used to require either grepping logs or making the operator triage a 429 spike on backend metrics. Now it’s a JWT-authenticated curl that returns the answer: recent_throttled_count: 0 means the tenant is steady-state. A non-zero value means throttling is happening now or happened within the window.
The lazy-prune-on-read pattern drops expired throttle events when the snapshot is read rather than when a request is checked, so the hot allow() path stays O(1) regardless of throttle volume.
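One way to picture the lazy-prune-on-read idea is the sketch below. It is not the KAHN implementation: the class name, the injectable clock, and the use of a deque in place of a fixed-size ring buffer are all assumptions made for illustration. Only allow(), the field names, and the 5-minute window come from the text above.

```python
import time
from collections import deque
from typing import Callable, Deque, Dict


class TenantBucket:
    """Token bucket with continuous refill and a lazily pruned throttle log."""

    def __init__(self, capacity: float, refill_rate_per_sec: float,
                 window_s: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity
        self.refill_rate_per_sec = refill_rate_per_sec
        self.window_s = window_s
        self._clock = clock
        self._tokens = float(capacity)
        self._last = clock()
        self._throttled: Deque[float] = deque()  # timestamps of rejections

    def _refill(self, now: float) -> None:
        # Continuous refill: credit tokens for elapsed time, capped at capacity.
        gained = (now - self._last) * self.refill_rate_per_sec
        self._tokens = min(self.capacity, self._tokens + gained)
        self._last = now

    def allow(self) -> bool:
        """Hot path: constant work per call, never scans the throttle log."""
        now = self._clock()
        self._refill(now)
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True
        self._throttled.append(now)  # O(1) append, no pruning here
        return False

    def snapshot(self) -> Dict[str, float]:
        """Cold path: prune expired throttle events, then report."""
        now = self._clock()
        self._refill(now)
        cutoff = now - self.window_s
        while self._throttled and self._throttled[0] < cutoff:
            self._throttled.popleft()
        return {
            "capacity": self.capacity,
            "refill_rate_per_sec": self.refill_rate_per_sec,
            "remaining_tokens": self._tokens,
            "recent_throttled_count": len(self._throttled),
            "window_s": self.window_s,
        }
```

In this sketch the log grows until the next read. A bounded structure such as deque(maxlen=...) would cap memory at the cost of undercounting during very heavy throttling.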
Substrate work this enables
The dashboards on /#/agents and /#/audits consume real per-tenant data, not a placeholder. Building Phase B against the placeholder would have required a second pass; substrate-first sequencing prevented the rework.
agent_archived_runs is now wired through both storage backends with parity-tested counts (count_agent_runs() on FS + Postgres). The trust-boundary split (public vs. tenant-scoped) is a pattern any future per-tenant introspection endpoint can copy.
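A parity test of this kind might look roughly like the sketch below. Only the count_agent_runs() name comes from the text above. The directory layout, the table schema, the add_run() helper, and the use of SQLite as a runnable stand-in for Postgres are all assumptions.

```python
import sqlite3
from pathlib import Path
from typing import Iterable, Protocol


class RunStore(Protocol):
    def count_agent_runs(self, tenant_id: str) -> int: ...


class FsRunStore:
    """Hypothetical layout: <root>/<tenant_id>/<run_id>.json"""

    def __init__(self, root: Path) -> None:
        self.root = root

    def add_run(self, tenant_id: str, run_id: str) -> None:
        # Sketch only: real code would validate tenant_id before using it in a path.
        tenant_dir = self.root / tenant_id
        tenant_dir.mkdir(parents=True, exist_ok=True)
        (tenant_dir / f"{run_id}.json").write_text("{}")

    def count_agent_runs(self, tenant_id: str) -> int:
        tenant_dir = self.root / tenant_id
        if not tenant_dir.is_dir():
            return 0
        return sum(1 for _ in tenant_dir.glob("*.json"))


class SqlRunStore:
    """SQLite stand-in for the Postgres backend; the schema is hypothetical."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS agent_runs ("
            "tenant_id TEXT NOT NULL, run_id TEXT NOT NULL, "
            "PRIMARY KEY (tenant_id, run_id))"
        )

    def add_run(self, tenant_id: str, run_id: str) -> None:
        self.conn.execute(
            "INSERT INTO agent_runs (tenant_id, run_id) VALUES (?, ?)",
            (tenant_id, run_id),
        )

    def count_agent_runs(self, tenant_id: str) -> int:
        row = self.conn.execute(
            "SELECT COUNT(*) FROM agent_runs WHERE tenant_id = ?",
            (tenant_id,),
        ).fetchone()
        return int(row[0])


def counts_agree(stores: Iterable[RunStore], tenant_ids: Iterable[str]) -> bool:
    """True if every store reports the same count for every tenant."""
    store_list = list(stores)
    return all(
        len({s.count_agent_runs(t) for s in store_list}) == 1
        for t in tenant_ids
    )
```

Because both classes satisfy the same RunStore protocol, the endpoint handler can stay backend-agnostic, which is the property a future per-tenant introspection endpoint would reuse.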
What’s next
Per-tenant audit-policy webhooks (Phase E E1) consume this endpoint shape. Rate-limit visibility is a precondition for the per-(tenant_id, agent_id) bucket overlay (Phase D D4) — operators need to see the limit before they can argue for granularity.