KAHN Cloud: per-tenant /api/self ships

Headline

/api/self/tenant is a new auth-scoped endpoint that answers two operator questions the public /api/self cannot: “How many agent runs does my tenant have in storage?” and “Is my producer being rate-limited?”

What changed

The trust boundary is now explicit:

GET /api/self — public, no auth. Process-global liveness signal: uptime, last-event ages, archived-runs count. Consumed by runbook smoke probes that intentionally curl without a JWT (post-deploy uptime checks, external monitors).
GET /api/self/tenant — auth-scoped (airlock JWT in cloud, localhost sentinel in OSS). Per-tenant data: agent_archived_runs (real count, not a placeholder), rate_limit snapshot.
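
The split above can be pictured as a small routing sketch. This is an illustration under stated assumptions, not KAHN's handler code: `route`, `resolve_tenant`, `verify_jwt`, `cloud_mode`, and the `LOCALHOST_SENTINEL` value are all hypothetical names, and the response bodies are stand-ins for the real payloads.

```python
from typing import Callable, Mapping, Optional, Tuple

# Hypothetical tenant id for OSS mode; the real sentinel value is an assumption.
LOCALHOST_SENTINEL = "localhost"


def resolve_tenant(
    headers: Mapping[str, str],
    verify_jwt: Callable[[str], Optional[str]],
    cloud_mode: bool,
) -> Optional[str]:
    """Return the caller's tenant id, or None if the request is unauthenticated."""
    if not cloud_mode:
        # OSS: no JWT is required, every request maps to the sentinel tenant.
        return LOCALHOST_SENTINEL
    scheme, _, token = headers.get("Authorization", "").partition(" ")
    if scheme.lower() != "bearer" or not token:
        return None
    # verify_jwt is injected; it returns a tenant id or None for a bad token.
    return verify_jwt(token)


def route(path, headers, *, verify_jwt, cloud_mode=True) -> Tuple[int, dict]:
    """Dispatch the two introspection endpoints on either side of the boundary."""
    if path == "/api/self":
        # Public side: credentials are never inspected, only global data returned.
        return 200, {"scope": "public"}
    if path == "/api/self/tenant":
        tenant_id = resolve_tenant(headers, verify_jwt, cloud_mode)
        if tenant_id is None:
            return 401, {"error": "unauthorized"}
        # Tenant side: everything returned here is keyed by tenant_id.
        return 200, {"scope": "tenant", "tenant_id": tenant_id}
    return 404, {"error": "not found"}
```

The point of the sketch is that the public handler never touches credentials, so smoke probes keep working without a JWT, while the tenant handler cannot produce data without first resolving a tenant id.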

The prior placeholder agent_archived_runs: 0 on the public endpoint is gone. It was dead weight: no SPA surface rendered it.

The rate-limit snapshot

rate_limit carries:

capacity, refill_rate_per_sec — the bucket configuration
remaining_tokens — current bucket level (continuous-refill float)
recent_throttled_count — number of HTTP 429s this tenant received in the last 5 minutes
window_s — width of the throttled-event ring buffer

The snapshot is read-only. Repeated polls do NOT consume tokens. SPA dashboards refreshing every 5s would otherwise throttle the producer they’re monitoring; this property is a unit-tested invariant (“does not consume tokens under 50 polls”).
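
A minimal sketch of how a continuous-refill bucket can expose a non-consuming snapshot. The class name `TokenBucket`, its private fields, the one-token cost per request, and the injectable `now` argument are assumptions made for illustration; only the three field names in the returned dict come from the rate_limit description above.

```python
import time


class TokenBucket:
    """Hypothetical continuous-refill bucket with a read-only snapshot."""

    def __init__(self, capacity, refill_rate_per_sec, now=None):
        self.capacity = float(capacity)
        self.refill_rate_per_sec = float(refill_rate_per_sec)
        self._tokens = self.capacity  # bucket starts full
        self._updated_at = time.monotonic() if now is None else now

    def _level_at(self, now):
        # Pure computation: the level the bucket would have at `now`.
        elapsed = max(0.0, now - self._updated_at)
        return min(self.capacity, self._tokens + elapsed * self.refill_rate_per_sec)

    def allow(self, now=None):
        """Hot path: refill, then try to spend one token. Mutates state."""
        now = time.monotonic() if now is None else now
        level = self._level_at(now)
        self._updated_at = now
        if level >= 1.0:
            self._tokens = level - 1.0
            return True
        self._tokens = level
        return False

    def snapshot(self, now=None):
        """Read path: reports the current level without writing any state."""
        now = time.monotonic() if now is None else now
        return {
            "capacity": self.capacity,
            "refill_rate_per_sec": self.refill_rate_per_sec,
            "remaining_tokens": self._level_at(now),
        }
```

Because snapshot() only calls the pure _level_at() helper, any number of dashboard polls leaves the bucket exactly as allow() last left it, which is the property the invariant test checks.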

Why this matters

A producer asking “is my emitter throttled?” used to have two options: grep the logs, or ask the operator to triage a 429 spike in backend metrics. Now it’s a JWT-authenticated curl that returns the answer:

curl -H "Authorization: Bearer $KAHN_LIVE_SK" \
  https://api.kahn.host/api/self/tenant | jq .rate_limit

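recent_throttled_count: 0 means the tenant received no 429s within the reporting window, so it is in steady state. A non-zero value means throttling is happening now or happened earlier in that window. Expired throttle events are pruned lazily, when the snapshot is read rather than when a request is checked, so the hot allow() path stays O(1) regardless of throttle volume.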
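
The lazy pruning described above could look roughly like this. It is a sketch under assumptions: `ThrottleLog`, `record`, `recent_count`, and `max_events` are hypothetical names, a bounded deque stands in for the ring buffer, and the 300-second default simply mirrors the 5-minute window mentioned earlier.

```python
from collections import deque


class ThrottleLog:
    """Hypothetical per-tenant log of 429 timestamps, pruned only on read."""

    def __init__(self, window_s=300.0, max_events=4096):
        self.window_s = window_s
        # Bounded deque as a stand-in ring buffer; max_events is an assumption.
        self._events = deque(maxlen=max_events)

    def record(self, now):
        # Called from the hot path when a request is rejected.
        # A single O(1) append; no scanning and no pruning happens here.
        self._events.append(now)

    def recent_count(self, now):
        # Called from the snapshot path. Drops timestamps that have aged out
        # of the window, then reports how many remain.
        cutoff = now - self.window_s
        while self._events and self._events[0] <= cutoff:
            self._events.popleft()
        return len(self._events)
```

The cost of pruning is paid by the reader, so a burst of throttled requests adds appends on the write side but never a scan. With a bounded buffer, the reported count saturates at max_events during a very large burst.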

Substrate work this enables

The dashboards on /#/agents and /#/audits consume real per-tenant data, not a placeholder. Building Phase B against the placeholder would have required a second pass — substrate-first sequencing prevented the rework. agent_archived_runs is now wired through both storage backends with parity-tested counts (count_agent_runs() on FS + Postgres). The trust-boundary split (public vs. tenant-scoped) is a pattern any future per-tenant introspection endpoint can copy.

What’s next

Per-tenant audit-policy webhooks (Phase E E1) consume this endpoint shape. Rate-limit visibility is a precondition for the per-(tenant_id, agent_id) bucket overlay (Phase D D4) — operators need to see the limit before they can argue for granularity.
