
KAHN Cloud v0.3 — ingest + tenant isolation shipped

Shipped: v0.3.0-ingest tag on kahn-hq/kahn main; live at https://kahn.host behind airlock SSO.

What this unlocks

KAHN went from “airlock-gated dashboard with no data” to “producers can POST orchestrator transitions and the owning user sees their run in the history view within seconds”. The product shape moved from demo to usable. Tenant isolation on every multi-tenant surface on the KAHN side is now enforced at the database layer via Row-Level Security with a non-superuser app role. A code path that forgets to set tenant context returns zero rows, not someone else’s data.
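As a rough illustration of the "forgot tenant context returns zero rows" behaviour, here is a minimal sketch of setting tenant context per transaction from Python. It is not the KAHN implementation: the setting name app.tenant_id, the helper tenant_context, and the policy shown in the comment are all assumptions made for the example. Only Postgres's set_config(name, value, is_local) and current_setting(name, missing_ok) are real built-ins. The connection is any DB-API connection using %s placeholders, such as one from psycopg.

```python
from contextlib import contextmanager

# Hypothetical setting name; the real policies may key on something else.
TENANT_SETTING = "app.tenant_id"

# Assumed policy shape (tenant_id stored as text), shown for orientation only:
#   CREATE POLICY tenant_isolation ON runs
#     USING (tenant_id = current_setting('app.tenant_id', true));
# With missing_ok = true an unset setting yields NULL, so the predicate
# matches no rows instead of raising or matching another tenant.


@contextmanager
def tenant_context(conn, tenant_id: str):
    """Run a block in one transaction with the tenant setting applied."""
    cur = conn.cursor()
    try:
        # is_local = true scopes the value to the current transaction, so a
        # pooled connection does not carry one tenant's context to the next.
        cur.execute("SELECT set_config(%s, %s, true)", (TENANT_SETTING, tenant_id))
        yield cur
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    finally:
        cur.close()
```

Queries issued through the yielded cursor run under the policy for that tenant. A query issued outside the context manager runs with the setting unset and, under the assumed policy, sees nothing.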

The surface that’s now live

  • POST https://api.kahn.host/v1/ingest/transitions — NDJSON ingest, bearer-token-authenticated via kahn_live_sk_* keys.
  • scripts/provision-tenant.py — idempotent tenant + key provisioning against the admin DSN. --rotate for key rollover.
  • scripts/push-run.py — operator-side pusher, 1 MiB chunking, 429 exponential backoff, graph.json auto-embed.
  • kahn_app Postgres role with NOSUPERUSER NOBYPASSRLS so the RLS policies actually execute.
  • lookup_api_key() + lookup_tenant_by_org() SECURITY DEFINER helpers for pre-tenant-context lookups.
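The pusher behaviour listed above (1 MiB chunking, exponential backoff on 429) can be sketched roughly as follows. This is an illustration under assumptions, not the contents of scripts/push-run.py: the function names, the base delay, the cap, and the retry count are all hypothetical, and the HTTP call itself is left to a caller-supplied send function that returns a status code.

```python
import random
import time
from typing import Callable, Iterable, Iterator

MAX_CHUNK_BYTES = 1024 * 1024  # 1 MiB, the chunk size named in the list above


def chunk_ndjson(lines: Iterable[bytes], max_bytes: int = MAX_CHUNK_BYTES) -> Iterator[bytes]:
    """Group records into newline-delimited bodies of at most max_bytes each."""
    buf: list[bytes] = []
    size = 0
    for line in lines:
        record = line.rstrip(b"\n") + b"\n"  # exactly one trailing newline
        if len(record) > max_bytes:
            raise ValueError("single record exceeds chunk limit")
        if buf and size + len(record) > max_bytes:
            yield b"".join(buf)
            buf, size = [], 0
        buf.append(record)
        size += len(record)
    if buf:
        yield b"".join(buf)


def backoff_delays(retries: int, base: float = 0.5, cap: float = 30.0,
                   jitter: float = 0.0) -> list[float]:
    """Delay before each retry: base * 2**attempt, capped, plus optional jitter."""
    return [min(cap, base * (2 ** n)) + random.uniform(0, jitter) for n in range(retries)]


def send_with_retry(send: Callable[[bytes], int], body: bytes, retries: int = 5,
                    sleep: Callable[[float], None] = time.sleep) -> int:
    """Call send(body); on HTTP 429, wait and retry up to `retries` more times."""
    status = send(body)
    for delay in backoff_delays(retries):
        if status != 429:
            break
        sleep(delay)
        status = send(body)
    return status
```

Records are never split across chunks, so each request body stays valid NDJSON on its own. Passing sleep as a parameter keeps the retry loop testable without real waiting.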

Why it took a full session

Three root-cause bugs surfaced only under a real production probe, not in tests:
  1. Idempotent re-push doubled DB rows. Fixed by content-aligned append with a refuse-to-rewrite guard.
  2. RLS was inert at runtime. Railway’s default postgres role bypasses every policy. Fixed with migration + env rotation.
  3. Every airlock JWT 401’d. python-jose silently doesn’t support EdDSA. Fixed by swapping to PyJWT[crypto].
Unit tests would not have caught any of the three. All three were caught by cross-tenant probes and a debug-page JWT decoder that fit within the final smoke-check budget.
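One way to read the fix for the first bug, "content-aligned append with a refuse-to-rewrite guard", is sketched below. This is an interpretation, not the shipped code: it assumes a re-push resends the run from its first record, that stored rows can be compared by a per-record digest, and that any disagreement with stored rows should be rejected rather than overwritten. The names RewriteRefused, digest, and rows_to_append are hypothetical.

```python
import hashlib


class RewriteRefused(Exception):
    """Raised when a re-push disagrees with rows that are already stored."""


def digest(record: bytes) -> str:
    """Content digest used to align incoming records with stored rows."""
    return hashlib.sha256(record).hexdigest()


def rows_to_append(stored_digests: list[str], incoming: list[bytes]) -> list[bytes]:
    """Return only the records that extend the stored run.

    The overlapping prefix must match what is stored, position by position.
    An identical re-push therefore appends nothing, and a push that would
    change an existing row is refused.
    """
    incoming_digests = [digest(r) for r in incoming]
    overlap = min(len(stored_digests), len(incoming_digests))
    if incoming_digests[:overlap] != stored_digests[:overlap]:
        raise RewriteRefused("incoming records diverge from stored rows")
    return incoming[len(stored_digests):]
```

Under these assumptions, pushing the same run twice yields an empty append list the second time, which is the property the doubled-rows bug violated.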

Doctrine deltas captured

Four new doctrines, two prompt-ops, one campaign — all cross-linked from this learning. Any future cloud surface (choco, stratt, traceo) that goes behind airlock + RLS can now start from those rather than rediscover the landmines.

What’s next

  • Taskset 6d — CI bridge onboarding for choco/stratt/traceo. Reuses 100% of today’s ingest + provisioning code via the StorageBackend Protocol.
  • Airlock org_id claim — when available, KAHN migrates from user:<sub> tenancy to org:<id> multi-user tenancy with zero schema change (just an UPDATE on tenants.airlock_org_id).
  • Phase 3 billing — dormant; unlocked when three external prospects indicate paying intent. Until then KAHN Cloud runs as a single-plan beta with 90-day retention and a 100 tx/sec rate limit.

Business signal

KAHN Cloud is the first devarno-cloud product to ship a real multi-tenant SaaS topology with RLS-enforced isolation, not just middleware-level scoping. The playbook is now reusable — sister projects pick it up in days, not weeks.