Gated Tasksets Prevent Rollback Cascades — 12 Commits, Zero Rollbacks

What happened

The clari-tools → oompa migration on 17 April 2026 executed eight tasksets in one session, producing 12 commits across proto schemas, NATS streams, brand guardrails, observability (rules + alerts + dashboard), Python service scaffolds, docker-compose wiring, oompa-spec additions, editor component renderers, Gherkin acceptance tests, a full Next.js 16 landing page rebuild, and cross-cutting final validation. Every taskset ended at a named, binary gate:

Taskset	Gate
1 — proto + NATS	Proto structural validity + `yaml.safe_load_all` on streams.yaml
2 — guardrails + observability	`promtool check rules` + `jq -e` on dashboards + yaml parse
3 — service scaffolds + compose	`docker compose config -q` + duplicate-port scan returns empty
4 — oompa spec + editor blocks	`pnpm type-check` in cho-co-web + spec YAML validity
5 — gherkin tests	`pnpm test:dry --tags "@REQ-OMP-*"` reports zero ambiguous/undefined
6 — landing structure	`pnpm build` passes, zero `gsap` imports remain
7 — landing interactive	`pnpm build` plus live `curl -X POST /api/waitlist`
8 — final validation	All of the above, end-to-end, plus forbidden-word scan
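
The write-up does not show how the duplicate-port scan in taskset 3's gate was implemented. One way it could look is sketched below, assuming the compose file has already been loaded into a dict (for example from `docker compose config --format json`). The function name `duplicate_host_ports` and the sample data are hypothetical.

```python
from collections import defaultdict


def duplicate_host_ports(compose: dict) -> dict[str, list[str]]:
    """Return {host_port: [service, ...]} for host ports claimed more than once."""
    owners: dict[str, list[str]] = defaultdict(list)
    for name, service in compose.get("services", {}).items():
        for port in service.get("ports", []):
            if isinstance(port, dict):
                # Long syntax: {"published": "8080", "target": 80, ...}
                published = port.get("published")
            else:
                # Short syntax: "[IP:]HOST:CONTAINER[/proto]"; a bare "80" has no host port.
                parts = str(port).split("/")[0].split(":")
                published = parts[-2] if len(parts) >= 2 else None
            if published is not None:
                owners[str(published)].append(name)
    return {port: names for port, names in owners.items() if len(names) > 1}


# Hypothetical sample: services "a" and "b" both publish host port 8080.
sample = {
    "services": {
        "a": {"ports": ["8080:80"]},
        "b": {"ports": [{"published": "8080", "target": 8000}]},
        "c": {"ports": ["9090:9090/tcp", "3000"]},
    }
}
collisions = duplicate_host_ports(sample)
```

Under this sketch the gate is green only when the returned dict is empty, which matches the "returns empty" wording in the table.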

Zero commits were reverted. Zero fix commits were needed. The only “adjustment” commit (the data/ gitignore for the dev waitlist) was a small follow-on within the same taskset, not a rollback.

Why gates work

Each gate is a binary, machine-checked statement, not a judgment call. `promtool check rules` either passes or it doesn’t. `docker compose config -q` either parses or it doesn’t. The gate runs on the staged changes immediately before the commit; if it fails, the fix happens in place, before the code reaches the history. The alternative, “run all the tests at the end”, produces a compound failure mode in which taskset 7’s failure might be rooted in taskset 3’s silent break. Gates localize cost: a failing gate at taskset 5 is a local problem; a failing global check after taskset 8 is a bisect exercise.
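
The "pass or fail, before the commit" behaviour can be sketched as a small runner. This is an illustration under assumptions, not the tooling used in the session: `run_gate` is a hypothetical helper that treats a gate as a list of argv lists and trusts only their exit codes.

```python
import subprocess
import sys


def run_gate(commands: list[list[str]]) -> bool:
    """Run each gate command in order; stop at the first non-zero exit code."""
    for argv in commands:
        result = subprocess.run(argv)
        if result.returncode != 0:
            print(f"gate failed: {' '.join(argv)}", file=sys.stderr)
            return False
    return True
```

A caller would invoke something like `run_gate([["docker", "compose", "config", "-q"]])` and run `git commit` only when it returns `True`, so a red gate never reaches the history.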

The commercial impact

Customers and operators budget for “integration drift” when shipping migrations. That drift usually shows up as a multi-hour rollback and triage phase at the end. With gated tasksets the drift never accrues, because each break is caught inside the taskset that caused it. For teams shipping one or two migrations a week, the pattern compounds: over a year, 50-100 migrations × 1-2 hours of drift per migration = 50-200 hours of direct engineering time avoided. On top of that, customer-visible incidents caused by half-rolled migrations should largely disappear.
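
The range in that estimate is the low ends multiplied together and the high ends multiplied together. A quick check, using only the figures quoted above (the helper name `drift_hours` is made up for illustration):

```python
def drift_hours(migrations: tuple[int, int], hours_each: tuple[int, int]) -> tuple[int, int]:
    """Multiply (low, high) migration counts by (low, high) hours of drift per migration."""
    return migrations[0] * hours_each[0], migrations[1] * hours_each[1]


low, high = drift_hours((50, 100), (1, 2))
print(low, high)  # 50 200
```

A strictly weekly cadence is about 52 migrations a year, which sits at the low end of the range.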

Pattern to repeat

Name each taskset with a single clear goal.
Declare the gate as an executable command, not an intent.
Commit at gate-green, with conventional-commits scope matching the taskset.
If the gate fails, fix in place before committing. Never land a separate “fix” commit unless a real latent issue was found.
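
The four rules above can be captured in one small record per taskset. The sketch below is an assumption about how that might be modelled, not how the session tracked it. The `Taskset` class, the `compose` scope, and the `feat` commit type are illustrative names only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Taskset:
    number: int
    goal: str                          # one clear goal per taskset
    scope: str                         # conventional-commits scope for its commits
    gate: tuple[tuple[str, ...], ...]  # executable commands, not an intent

    def commit_message(self, commit_type: str = "feat") -> str:
        """Format a conventional-commits subject line: type(scope): description."""
        return f"{commit_type}({self.scope}): {self.goal}"


# Hypothetical declaration for taskset 3; the scope name is invented.
compose = Taskset(
    number=3,
    goal="add service scaffolds and compose wiring",
    scope="compose",
    gate=(("docker", "compose", "config", "-q"),),
)
```

Declaring the gate as data next to the goal makes it hard to start a taskset without knowing which command will close it.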

When NOT to apply

One-file fixes (a typo, a constant change) don’t need a taskset structure — the commit is the gate. The pattern earns its keep when a session produces ≥5 commits across ≥3 domains.
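
The rule of thumb above reduces to a single check. A minimal sketch, where `warrants_tasksets` is a hypothetical name and the thresholds are the ones stated in the text:

```python
def warrants_tasksets(commit_count: int, domain_count: int) -> bool:
    """True when a session is large enough for gated tasksets to pay off."""
    return commit_count >= 5 and domain_count >= 3
```

A one-commit typo fix returns `False`. A session with 12 commits spread over three or more domains returns `True`.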
