Propose-Confirm-Apply: Safety Model for LLM Write Tools

Olly’s Phase 1 was stuck at “read-only sidekick” because write tools (file edits, command execution) need safety gates. The solution is a three-tier policy model: propose_patch carries only read-level permission, so the model can explore changes freely; apply_patch requires confirmation; and run_command is checked against an allowlist/denylist before it reaches the confirmation step.

The key architectural insight is that proposing a change is fundamentally safe, because it only generates a diff string. Separating proposal from application lets the model iterate on solutions without risk, while the user retains a single confirmation gate before any file is modified or any command is executed.

Key learning: For LLM agent tool systems, separate read-safe exploration from write-dangerous execution. Make proposals cheap (Allow), make application explicit (Confirm), and make destruction impossible (Deny). The model works better when it can propose freely, without friction.
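The three tiers described above can be sketched as a small policy function. This is a minimal illustration, not Olly’s actual implementation: the names Decision, evaluate, and gate, and the contents of both command lists, are hypothetical. It also assumes that a command on neither list is denied, which the note does not specify.

```python
import shlex
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"      # safe to run without asking
    CONFIRM = "confirm"  # needs explicit user approval
    DENY = "deny"        # never runs


# Hypothetical lists, for illustration only.
ALLOWED_COMMANDS = {"ls", "cat", "git", "pytest"}
DENIED_COMMANDS = {"rm", "sudo", "dd", "mkfs"}


def evaluate(tool: str, command: str = "") -> Decision:
    """Map a tool call to a policy decision."""
    if tool == "propose_patch":
        # Proposing only produces a diff string, so it is always allowed.
        return Decision.ALLOW
    if tool == "apply_patch":
        # Applying modifies files, so the user must confirm.
        return Decision.CONFIRM
    if tool == "run_command":
        try:
            argv = shlex.split(command)
        except ValueError:
            # Unparseable input (e.g. an unclosed quote) is rejected.
            return Decision.DENY
        if not argv:
            return Decision.DENY
        program = argv[0]
        if program in DENIED_COMMANDS:
            return Decision.DENY
        if program in ALLOWED_COMMANDS:
            # Passing the allowlist still leads to a confirmation prompt.
            return Decision.CONFIRM
        # Assumption: anything on neither list is denied.
        return Decision.DENY
    # Unknown tools are denied by default.
    return Decision.DENY


def gate(tool: str, command: str = "", confirm=lambda: False) -> bool:
    """Return True only if the call may proceed; confirm() asks the user."""
    decision = evaluate(tool, command)
    if decision is Decision.ALLOW:
        return True
    if decision is Decision.CONFIRM:
        return bool(confirm())
    return False
```

In this sketch, gate("propose_patch") proceeds without prompting, gate("apply_patch", confirm=ask_user) proceeds only if the hypothetical ask_user callback returns a truthy value, and a denied command never reaches the callback at all. Checking only the first token of the command is deliberately naive; a real checker would also need to account for shell operators, pipes, and subshells.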