Skip to main content

What happened

A user-supplied 4-taskset plan (α Battle Room/Chamber editors, β Pipeline authoring, γ Intelligence recommendations, δ Gated Async + parity) was delivered internally consistent but resting on three assumptions that the repo exploration pass flagged as load-bearing:
  1. Taskset δ’s parity gate assumed real Blake3 hashing existed. Exploration showed the ledger wrote "blake3:pending" as a literal string. Parity tests would be cosmetic, not cryptographic. Three plausible paths: implement real hashing as part of δ, keep the placeholder and test around it, split hashing into a separate taskset.
  2. Taskset β’s PipelineBuilder assumed chain units could declare per-stage output types. Exploration showed ChainStepSchema had {id, unit, agent, gate, depends_on} — no output field. Three paths: extend the schema with optional output, infer output from each step’s referenced unit contract, or drop per-stage output from the plan entirely.
  3. Taskset δ’s audit deep link assumed one canonical audit signer. Exploration showed the repo had two mint paths (Meridian Astro route + orchestrator-http /audit/mint), each with its own keypair and kid. Three paths: make orchestrator-http canonical with Meridian proxying, keep both and document the divergence, or make Meridian canonical and drop the HTTP route.
Rather than assume, I surfaced all three as AskUserQuestion prompts with labelled options (each option tagged (Recommended) on the path I thought was strongest). User answered in under two minutes total. All three answers matched the recommendation. The plan file was then written with those three decisions baked in, ExitPlanMode was called, execution proceeded from a 26-subitem TaskList. No mid-execution pivots. No rework.

Why this is repeatable leverage

The shape of the decision capture matters:
  • Present the trade-off, not the recommendation alone. Three options per question, each with a concrete description of the trade-off. The recommended path is first and tagged — but the other paths are legitimate enough that “actually, let’s do path 2” is a one-click pivot, not a debate.
  • Anchor each question to a specific piece of evidence from exploration. “The blake3: field is currently the literal string blake3:pending — how should we treat this?” is answerable. “What’s your preferred Blake3 strategy?” is not.
  • Do all three (or four) questions in one AskUserQuestion call. Batch cost. Users answer them as a group, often in under a minute. Asking sequentially would have cost three context switches plus three decision-forming delays.

Why it beats “just ask in the chat”

Chat questions are unstructured text. The user has to generate an answer from scratch, remember context about the surrounding work, and commit to a position that might have been better presented with alternatives they hadn’t considered. AskUserQuestion with options turns it into a multiple-choice exam with a confident recommendation. The user can either agree (default) or diverge (explicit click). Both paths are fast. The decision is captured in structured form that propagates into the plan file as “three scope decisions baked in,” making it reviewable later.

What we’re reusing

  • Budget one AskUserQuestion block per plan-mode session for load-bearing assumptions. Not for every minor choice — for the 2-4 decisions that would cause rework if wrong. Exploration flags candidates; Q&A resolves them.
  • Include the paired prompt-op that shows the Explore-partition prompts. Without good exploration, the questions are speculative. The two techniques chain.
  • Label the recommended option, always first. Users with less context than you still get a working answer on “Enter”; users with more context can still pivot.

What we learned this session

In all three cases, the recommended answer was the one user picked — but in one case the delta between option 1 and option 2 was narrow enough that the option structure itself was valuable. Without the structured choice, we’d likely have implemented option 2 by default, shipped it, and found the divergence later. The decision capture was cheap insurance. Net: the three-decision pattern generalises beyond this session. Run it on every non-trivial plan.