FORGE + TOMMY Autonomous Execution System
What Is This System?
FORGE and TOMMY are two n8n workflows that together form an autonomous code execution pipeline. FORGE is the strategic planner: it reads a GitHub epic, resolves which block is unblocked and ready, fetches every task’s acceptance criteria, and assembles a fully self-contained implementation prompt. TOMMY is the tactical executor: it receives tasks, atomises them into individual task-lets, SSHes into a build server, and runs Claude Code CLI per task-let. On success, FORGE auto-closes the GitHub issues with summary comments. The system runs on a single n8n instance (hab.so1.io) and targets any GitHub org/repo with the right epic structure.
TMLI5: The Factory and the Robot
Imagine a to-do list (the GitHub epic) with chapters (blocks) and pages (tasks). FORGE is the project manager who reads the to-do list, figures out which chapter isn’t blocked by anything else, and writes detailed instructions for each page. TOMMY is the robot that reads those instructions, walks over to the computer, and types the code. When everything works, the project manager crosses the items off the list.
Quick Reference
| Property | FORGE | TOMMY |
|---|---|---|
| Workflow ID | mSJmBzpIcuCKz1WT | MasflGBKdowUZwpJ |
| Node count | 44 | 34 |
| Webhook path | /webhook/forge-resolve-next-ticket | /webhook/atomiser |
| n8n instance | hab.so1.io | hab.so1.io |
| Execution modes | manual, api, tommy | standalone, forge-delegated |
| LLM calls (orchestration) | 0 | 0–1 (Phase 2 only) |
| GitHub credential | FhglvXmXMeC2SQTl | tSg1ejqnd5r3clEg |
| Default timeout | 120s (manual/api), 600s (tommy) | 600s |
GitHub Issue Data Model
Technical Deep-Dive
FORGE operates on a three-tier GitHub issue hierarchy:
- Epic: The top-level issue containing a markdown table of blocks
- Block: A phase or milestone containing a markdown table of tasks
- Task: An individual unit of work with acceptance criteria and file targets
A block can be blocked_by other blocks. FORGE resolves these dependencies to find the next unblocked block.
Two table formats are supported in epic and block bodies:
| Format | Header Pattern | Used By |
|---|---|---|
| Format A | Issue \| Block \| Title \| Labels | Sparki |
| Format B | Block \| Issue \| Phase \| Domain \| Title \| Priority \| Status | Traceo |
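The header check that separates these formats from unrelated tables can be sketched in a few lines. This is a minimal Python illustration, not the code of the Parse Epic Body node (which is an n8n Code node); the function names split_row, is_separator, and parse_block_tables are hypothetical, and the sketch assumes every table row starts with a pipe character.

```python
def split_row(line: str) -> list[str]:
    """Split one markdown table row into trimmed cell strings."""
    return [cell.strip() for cell in line.strip().strip("|").split("|")]


def is_separator(line: str) -> bool:
    """True for the |---|---| line that follows a table header."""
    cells = split_row(line)
    return bool(cells) and all(c and set(c) <= set("-: ") for c in cells)


def parse_block_tables(markdown: str) -> list[dict[str, str]]:
    """Return rows only from tables whose header has 'block' and 'issue'."""
    lines = markdown.splitlines()
    rows: list[dict[str, str]] = []
    i = 0
    while i < len(lines) - 1:
        if lines[i].lstrip().startswith("|") and is_separator(lines[i + 1]):
            header = [h.lower() for h in split_row(lines[i])]
            wanted = "block" in header and "issue" in header
            i += 2  # skip header and separator
            while i < len(lines) and lines[i].lstrip().startswith("|"):
                if wanted:
                    rows.append(dict(zip(header, split_row(lines[i]))))
                i += 1
        else:
            i += 1
    return rows
```

Because rows are keyed by lower-cased header name, the same function accepts Format A and Format B without a separate branch; a table lacking either column is walked past without contributing rows.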
TMLI5: The To-Do List
An Epic is a big project, like “Build the website”. It has Blocks, which are chapters: “Set up the database”, “Build the API”, “Deploy”. Some chapters can’t start until earlier ones finish (that’s the blocked_by edge). Each block has Tasks: the individual pages you actually write code for.
FORGE: The Strategic Planner
How FORGE Works
FORGE is a 44-node n8n workflow that reads a GitHub epic and produces an implementation prompt. It has six logical phases:
- Configuration: Extract webhook parameters
- Epic Parsing: Fetch the epic issue, parse block table
- Block Resolution: Classify all blocks, select the next ready one
- Task Extraction: Parse the selected block for tasks, fetch each task issue
- Prompt Assembly: Build a self-contained implementation prompt
- Execution: Return prompt (manual), call Claude API (api), or delegate to TOMMY (tommy)
TMLI5: The Project Manager
FORGE is the project manager who:
- Opens the project board (fetches the epic)
- Reads all the chapters (parses blocks)
- Checks which chapters are done or blocked (classifies blocks)
- Picks the first unblocked chapter (resolves next block)
- Reads every task in that chapter (fetches task issues)
- Writes a detailed instruction sheet (assembles prompt)
- Either hands it to you, does it herself, or sends it to the robot
Key Code Nodes
Parse Epic Body: Handles two table formats by inspecting header columns. Only consumes tables with both “block” and “issue” columns to avoid pollution from unrelated tables (like A2A handoff maps) in the epic body. Extracts issue_number, block_number, title, labels, and blocked_by per block.
Classify Block Status: Vectorized with $input.all().map(). Receives all block issues from GitHub in one batch. Looks up the original block metadata from Split Blocks to Items via cross-node reference. Classifies each block as Ready, Blocked, InProgress, or Done based on issue state and dependency resolution.
Resolve & Select: Picks the first block classified as Ready. If no blocks are ready, returns a NO_READY_BLOCKS error item that propagates downstream.
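The classification and selection logic can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the workflow's own code: it assumes a closed issue means Done, that a block is Blocked while any of its blocked_by issues is still open, and that InProgress is signalled by a hypothetical in_progress flag (the source does not say how that state is detected).

```python
def classify_blocks(blocks: list[dict]) -> list[dict]:
    """Label every block Done, InProgress, Blocked, or Ready in one pass."""
    closed = {b["issue_number"] for b in blocks if b["state"] == "closed"}
    classified = []
    for b in blocks:
        if b["issue_number"] in closed:
            status = "Done"
        elif b.get("in_progress"):  # assumed flag, e.g. derived from a label
            status = "InProgress"
        elif any(dep not in closed for dep in b.get("blocked_by", [])):
            status = "Blocked"
        else:
            status = "Ready"
        classified.append({**b, "status": status})
    return classified


def select_next(classified: list[dict]) -> dict:
    """Return the first Ready block, or a NO_READY_BLOCKS error item."""
    for b in classified:
        if b["status"] == "Ready":
            return {"selected_block": b}
    return {"error": "NO_READY_BLOCKS"}
```

Processing the whole list in one pass mirrors the vectorized approach: every block sees the full set of closed issues, so dependency checks never depend on iteration order.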
Assemble FORGE Prompt — Builds a ~5,000–6,500 character implementation prompt containing:
- Epic context (title, number, description)
- Block context (title, number, dependency status)
- Board status (all blocks with their classifications)
- Task list with acceptance criteria and file targets
- Generated timestamp
Set Configuration (10 Fields)
| Field | Source | Default |
|---|---|---|
| gh_org | $json.body.gh_org | (required) |
| gh_repo | $json.body.gh_repo | (required) |
| epic_issue | $json.body.epic_issue_number | (required) |
| override_block | $json.body.block_issue_number | "" |
| exec_mode | $json.body.execution_mode | "manual" |
| target_repo | $json.body.target_repo_for_code | "" |
| api_base | (hardcoded) | https://api.github.com |
| claude_key | $json.body.claude_api_key | "" |
| ssh_host | $json.body.ssh_host | root@vps.devarno.cloud |
| project_path | $json.body.project_path | /root/projects/so1-platform |
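The mapping in this table can be expressed as a small function, which also makes the required-field contract explicit. The sketch below is a Python illustration only; build_config is a hypothetical name, and unlike the Set Configuration node as described here, it fails fast when a required field is missing instead of passing an undefined value downstream.

```python
# Webhook body field -> config field, taken from the table above.
REQUIRED = {
    "gh_org": "gh_org",
    "gh_repo": "gh_repo",
    "epic_issue": "epic_issue_number",
}
OPTIONAL = {
    "override_block": ("block_issue_number", ""),
    "exec_mode": ("execution_mode", "manual"),
    "target_repo": ("target_repo_for_code", ""),
    "claude_key": ("claude_api_key", ""),
    "ssh_host": ("ssh_host", "root@vps.devarno.cloud"),
    "project_path": ("project_path", "/root/projects/so1-platform"),
}


def build_config(body: dict) -> dict:
    """Resolve the 10 configuration fields from a webhook body."""
    missing = [src for src in REQUIRED.values() if src not in body]
    if missing:
        raise ValueError(f"missing required webhook fields: {missing}")
    config = {field: body[src] for field, src in REQUIRED.items()}
    for field, (src, default) in OPTIONAL.items():
        config[field] = body.get(src, default)
    config["api_base"] = "https://api.github.com"  # hardcoded, not from the body
    return config
```

A payload that sends epic_number instead of epic_issue_number is rejected here with a message naming the expected field, rather than surfacing later as a 404 from GitHub.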
TOMMY: The Tactical Executor
How TOMMY Works
TOMMY is a 34-node n8n workflow that decomposes work into atomic task-lets and executes each via SSH + Claude Code CLI. It has six phases, numbered 0–5:
- INIT: Merge webhook input with defaults, check for FORGE shortcut
- INSPECT: Fetch repo tree, pipeline.yaml, README from GitHub API
- IDENTIFY: Call Claude to identify tasksets (or parse existing pipeline.yaml)
- ATOMISE: Convert tasksets into individual task-let prompts
- EXECUTE: SplitInBatches loop: SSH → Claude CLI per task-let
- CONFIRM: Aggregate results, return ALL_PASS or HAS_FAILURES
When the webhook payload contains forge_tasks, TOMMY skips directly from INIT to ATOMISE (Path C).
TMLI5: The Robot Arm
TOMMY is like a smart robot arm on a factory floor:
- It checks if someone already gave it a parts list (FORGE shortcut); if yes, it skips straight to building
- Otherwise, it inspects the raw materials (repo tree) and asks a consultant what to build (Claude)
- It splits the work into individual steps (atomise)
- For each step, it walks to the workbench (SSH), picks up the tool (Claude CLI), and builds the part
- At the end, it checks: did everything pass? If yes, report success. If no, list what broke.
Three Atomisation Paths
| Path | Trigger | LLM Tokens | Phases Executed |
|---|---|---|---|
| A | pipeline.yaml exists in repo | 0 | 1 → 3 → 4 → 5 |
| B | No pipeline, no FORGE tasks | ~1,100 | 1 → 2 → 3 → 4 → 5 |
| C | forge_tasks[] in webhook payload | 0 | 0 → 3 → 4 → 5 |
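The choice between the three paths reduces to two checks. The following Python sketch is an illustration rather than the workflow's logic: choose_path and repo_has_pipeline are hypothetical names, and the precedence of forge_tasks over pipeline.yaml when both are present is an assumption based on the FORGE shortcut being checked during INIT.

```python
def choose_path(payload: dict, repo_has_pipeline: bool) -> dict:
    """Pick atomisation path A, B, or C and say whether an LLM call is needed."""
    if payload.get("forge_tasks"):
        # FORGE already supplied the tasks: skip INSPECT and IDENTIFY.
        return {"path": "C", "llm_call": False}
    if repo_has_pipeline:
        # pipeline.yaml already lists the tasksets: parse it, no LLM.
        return {"path": "A", "llm_call": False}
    # Nothing to go on: ask Claude to identify tasksets.
    return {"path": "B", "llm_call": True}
```

Only Path B spends orchestration tokens, which matches the 0–1 LLM calls listed for TOMMY in the quick reference.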
SSH Execution Command
| Flag | Purpose |
|---|---|
| --print | Non-interactive, stdout output |
| --output-format json | Structured output for result parsing |
| --max-turns 3 | Cap reasoning depth (token budget control) |
| --dangerously-skip-permissions | Automated execution, no confirmation prompts |
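To show how these flags fit together, here is a hedged Python sketch that assembles a per-task-let command. The exact command line TOMMY runs is not reproduced in this page, so the cd step, the flag order, and the function name build_ssh_command are assumptions; only the four flags come from the table above. The sketch builds the argument list and does not execute anything.

```python
import shlex


def build_ssh_command(ssh_host: str, project_path: str, prompt: str,
                      max_turns: int = 3) -> list[str]:
    """Return an argv list that runs Claude Code CLI once on the build server."""
    remote = " ".join([
        "cd", shlex.quote(project_path), "&&",
        "claude", "--print",
        "--output-format", "json",
        "--max-turns", str(max_turns),
        "--dangerously-skip-permissions",
        shlex.quote(prompt),  # quoting keeps ; and ' in the prompt literal
    ])
    return ["ssh", ssh_host, remote]
```

Quoting the prompt matters because task-let prompts are free text: an unquoted semicolon or apostrophe would be interpreted by the remote shell rather than passed to the CLI.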
Response Schemas
Success (ALL_PASS)
FORGE to TOMMY Delegation Handshake
Technical Deep-Dive
When FORGE’s Execution Mode switch routes to the tommy output, the Build TOMMY Payload node constructs a webhook body and Trigger TOMMY sends a synchronous POST to /webhook/atomiser with a 600-second timeout.
Webhook Payload Contract
FORGE sends TOMMY a payload carrying the repo, the SSH host, the epic and block context, and a forge_tasks array. TOMMY detects the forge_tasks array and takes Path C, skipping Phases 1–2 entirely.
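A payload of this kind could be assembled roughly as below. This Python sketch is illustrative: forge_tasks, ssh_host, project_path, gh_org, and gh_repo are names that appear elsewhere on this page, while epic_title, block_title, and the per-task keys are hypothetical stand-ins for whatever the Build TOMMY Payload node actually emits.

```python
def build_tommy_payload(config: dict, epic: dict, block: dict,
                        tasks: list[dict]) -> dict:
    """Assemble the webhook body FORGE would POST to /webhook/atomiser."""
    return {
        "gh_org": config["gh_org"],
        "gh_repo": config["gh_repo"],
        "ssh_host": config["ssh_host"],
        "project_path": config["project_path"],
        "epic_title": epic["title"],    # hypothetical key name
        "block_title": block["title"],  # hypothetical key name
        "forge_tasks": [
            {
                "title": t["title"],
                "description": t.get("description", ""),
                "file_targets": t.get("file_targets", []),
            }
            for t in tasks
        ],
    }
```

A non-empty forge_tasks list is what lets the receiver take Path C, so an empty task list should be caught on the FORGE side before delegation.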
TMLI5: The Letter
FORGE writes a letter to TOMMY:
“Dear TOMMY, here’s the repo and the SSH address. Here are 3 tasks to do, each with a title, a description, and the files to change. The project is about Production Readiness, and we’re working on the Deployment chapter. Please do the tasks and write back.”
TOMMY reads the letter, does the tasks one by one, and writes back either “All done!” or “2 out of 3 worked, here’s what broke.”
Three Execution Modes
| Mode | What Happens | Human Role | Timeout | Issue Closure |
|---|---|---|---|---|
| manual | Prompt returned via webhook response | Copy-paste into Claude | 120s | None |
| api | Claude API called from n8n directly | Review after completion | 120s | Auto (on success) |
| tommy | TOMMY executes via SSH + Claude CLI | Monitor; intervene on failure | 600s | Auto (on ALL_PASS) |
Timeout Chain
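If an outer layer times out before the workflow finishes, the execution result can still be retrieved from hab.so1.io/api/v1/executions/{id}.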
TMLI5: Three Ways to Cook
- Manual: “Here’s the recipe card. You cook it.”
- API: “I’ll cook it right now in my kitchen, and clean up after.”
- TOMMY: “I’ll send the recipe to the robot chef. If the robot says ‘done’, I’ll clean up. If it says ‘burnt the soufflé’, I’ll tell you what went wrong.”
Combined Lifecycle: Epic to Issue Closure
Once a block is closed, trigger FORGE again and it selects the next unblocked block. This loop can run unattended because each invocation is stateless and reads the current board state.
Data Shapes at Each Step
| Step | Shape | Example |
|---|---|---|
| After Parse Epic Body | {blocks: [{issue_number, block_number, title}]} | 22 blocks |
| After Classify Block Status | [{issue_number, status, blocked_by}] | 22 classified |
| After Resolve & Select | {selected_block: {issue_number, title}} | 1 block |
| After Aggregate Tasks | {tasks: [{title, file_targets, criteria}]} | 1–5 tasks |
| After Assemble Prompt | {prompt: "# FORGE...", block, task_count} | ~5,200 chars |
| TOMMY Response | {status: "SUCCESS", tasklets: [...]} | Per task-let |
TMLI5: The Full Journey
- Read the project board (22 chapters listed)
- Ask GitHub about each chapter (22 API calls)
- Figure out which ones are done, blocked, or ready
- Pick the first ready chapter
- Read all the tasks in that chapter
- Write a detailed instruction sheet
- Hand it to the robot → robot SSHes in → runs Claude → reports back
- If everything passed: cross off the tasks and the chapter
- Trigger again for the next chapter
Optimizations Applied
Vectorization: SplitInBatches Removal
The original workflow used SplitInBatches feedback loops to process blocks one at a time. This caused two problems:
- options.reset: true infinite loop. Each loop-back item replaced the batch queue, causing block[0] to process forever
- $('NodeName').all() limitation. Inside a SplitInBatches loop, cross-node references only return the current iteration’s items, not accumulated results
The loops were replaced with vectorized Code nodes that process every item in one pass with $input.all().map(). The Iterate Blocks and Iterate Tasks SplitInBatches nodes still exist in the workflow JSON (set to v3 with empty options) but are disconnected from the execution path. They serve as visual markers in the n8n editor.
Field Name Fix
Root cause of executions 125–129 failing: Set Configuration reads $json.body.epic_issue_number but test payloads sent epic_number. The GitHub URL became .../issues/NaN → 404.
Duplicate Node Cleanup
FORGE historically contained 77 nodes, 34 of which were WIP backup copies positioned at negative x-coordinates. It was stripped to 44 clean nodes before deployment.
Table Format Discrimination
Parse Epic Body now inspects table header columns before consuming a table. Only tables with both “block” and “issue” columns are processed. This prevents pollution from unrelated tables in epic bodies.
Segmentation & Refactoring Opportunities
1. Parse Epic Body → Sub-Workflow
Current state: A single 7,347-character Code node handling two table formats (Sparki + Traceo) with inline regex, column detection, and markdown stripping.
Proposed: Extract into an n8n sub-workflow with three nodes:
- Detect Format: Inspect headers, emit format identifier
- Parse Table: Format-specific parser (one per format)
- Normalize: Output consistent {issue_number, block_number, title, blocked_by} array
2. Duplicated Issue Closure Chain
Current state: Both API mode (nodes 22–28) and TOMMY mode (nodes 37–43) have near-identical closure logic: Comment on Task → Close Task → Comment on Block → Close Block → Return Result.
Proposed: Extract into a shared “Close Issues” sub-workflow. Both modes feed into it after their respective success checks.
Benefit: Bug fixes to closure logic only need to happen once. Currently a fix to the API closure chain must be manually replicated to the TOMMY chain.
3. TOMMY Atomise Node → Separate Code Nodes
Current state: A single 5,708-character Code node with three internal paths (A: pipeline.yaml, B: LLM-identified, C: FORGE-delegated).
Proposed: Split into three Code nodes behind a Switch:
- Atomise from Pipeline (Path A)
- Atomise from LLM (Path B)
- Atomise from FORGE (Path C)
4. Configurable Table Format Detection
Current state: Format A (Sparki) and Format B (Traceo) are hardcoded in the parser.
Proposed: Accept a table_format hint via webhook payload or epic labels. Fall back to auto-detection if not provided.
5. Shared Error Handling
Current state: Error propagation uses a NO_READY_BLOCKS or NO_TASKS_FOUND sentinel item that flows downstream. Each Code node must check for it.
Proposed: Extract into a shared error-handler sub-workflow with a standard error envelope. Use n8n’s native error trigger for crash recovery.
Expansion & Scalability Opportunities
1. Multi-Repo Execution
FORGE already accepts target_repo_for_code in the webhook payload. The intended use case: issues tracked in repo A, code executed in repo B. This is currently untested end-to-end, but the plumbing exists.
2. Parallel Block Execution
Currently FORGE processes one block per invocation. Independent blocks (no blocked_by edges between them) could be executed in parallel:
- FORGE identifies all ready blocks instead of just the first
- Spawns a TOMMY instance per block
- Aggregates results before closure
3. Pipeline.yaml as Persistent Execution State
TOMMY Path A reads pipeline.yaml but never writes back. Completed task-lets should update the pipeline state, enabling:
- Idempotent re-runs — Skip already-completed task-lets
- Progress tracking — See what’s done without checking GitHub
- Resume after failure — Re-trigger and only execute remaining tasks
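The three benefits above follow from one small piece of state: a record of which task-lets have finished. A minimal Python sketch of that idea is shown here, using a plain dictionary in place of the YAML file; the function names, the id key, and the "done" and "failed" values are hypothetical, since the page does not define a state schema for pipeline.yaml.

```python
def record_result(state: dict[str, str], tasklet_id: str, passed: bool) -> dict[str, str]:
    """Return a new state with the task-let marked done or failed."""
    return {**state, tasklet_id: "done" if passed else "failed"}


def remaining_tasklets(tasklets: list[dict], state: dict[str, str]) -> list[dict]:
    """Keep only task-lets that have not completed successfully."""
    return [t for t in tasklets if state.get(t["id"]) != "done"]
```

Failed and never-attempted task-lets are treated the same way on the next run, so re-triggering after a partial failure executes exactly the work that is still outstanding.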
4. ZOID Council Integration
The planned 8-agent orchestration system adds human-in-the-loop gates at strategic points. Three ZOID agents participate:
- Atlas (Chief of Staff): Decides block priority and execution order
- Anvil (CodeSmith): Reviews code-level decisions at the block gate
- Sentinel (Ops): Validates deployment safety at the closure gate
5. Cost Tracking & Budget Enforcement
TOMMY already tracks token_budget in its response. Expansion:
- Set per-block and per-epic budget limits in the webhook payload
- TOMMY enforces budget mid-execution: if tokens exceed limit, stop and report
- Integration with Anthropic usage API for accurate spend tracking
- Aggregate cost reporting across blocks and epics
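Mid-execution enforcement could look like the loop below. This is a Python sketch of the proposal, not existing TOMMY behaviour: run_with_budget, the execute callback (which is assumed to return the tokens a task-let consumed), and the BUDGET_EXCEEDED and COMPLETED status strings are all hypothetical.

```python
from typing import Callable


def run_with_budget(tasklets: list[dict], budget_tokens: int,
                    execute: Callable[[dict], int]) -> dict:
    """Run task-lets in order, stopping once the token budget is used up."""
    spent = 0
    completed: list[str] = []
    for index, tasklet in enumerate(tasklets):
        if spent >= budget_tokens:
            # Budget is checked between task-lets, never mid-task-let.
            skipped = [t["id"] for t in tasklets[index:]]
            return {"status": "BUDGET_EXCEEDED", "tokens_spent": spent,
                    "completed": completed, "skipped": skipped}
        spent += execute(tasklet)
        completed.append(tasklet["id"])
    return {"status": "COMPLETED", "tokens_spent": spent,
            "completed": completed, "skipped": []}
```

Because the check happens before each task-let starts, the last one to run can overshoot the limit; a stricter scheme would need an estimate of each task-let's cost up front.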
6. Observability Dashboard
n8n execution history provides raw data. Integration with Grafana/Prometheus would surface:
- Execution duration per phase
- Success/failure rate per repo
- Token spend per block
- SSH command latency distribution
- GitHub API call volume
7. A2A (Agent-to-Agent) Protocol
The current FORGE→TOMMY handshake uses a custom webhook JSON contract. Adopting a formal A2A protocol would enable:
- Third-party agents to participate in the pipeline
- Standard capability negotiation
- Structured task delegation with SLA contracts
- Cross-platform interoperability
The Autonomy Ladder
| Level | Description | Human Role | Status |
|---|---|---|---|
| L1 | Manual mode — prompt returned | Write the code | Live |
| L2 | API mode — Claude API executes | Review results | Live |
| L3 | TOMMY mode — SSH + CLI executes | Monitor failures | Live |
| L4 | ZOID gates — Discord emoji approval | Approve at gates | Planned |
| L5 | Full loop — re-trigger on success | Sleep | Future |
Trigger Interface
Via Rover UI
The Rover console at rover.so1.io provides dedicated forms:
- ForgeForm — auto-detected when webhook path contains “forge”
- TommyForm — auto-detected when webhook path contains “atomiser”
Via curl
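Manual mode (get the prompt back): POST a JSON body containing gh_org, gh_repo, and epic_issue_number, with execution_mode set to manual, to /webhook/forge-resolve-next-ticket on hab.so1.io.
Troubleshooting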
Fetch Epic Issue returns 404
Most common cause: Field name mismatch. The webhook payload must use epic_issue_number (not epic_number, not epic_issue). Check Set Configuration outputs in the execution trace: if epic_issue is NaN or undefined, the field name is wrong.
Credentials detach after API update
n8n’s PUT API strips credential bindings from all nodes. After any programmatic workflow update:
- Open the workflow in n8n editor at hab.so1.io
- Re-select the credential on every HTTP Request and SSH node
- Save the workflow via the editor (not the API)
TOMMY timeout
The timeout chain has multiple layers:
| Layer | Default | Configurable? |
|---|---|---|
| Rover UI | 600s (tommy), 120s (manual/api) | In ForgeForm component |
| Vercel serverless | 300s (maxDuration) | In route.ts |
| BFF proxy | 120s default | Per-request timeoutMs |
| n8n internal | 600s | Workflow settings |
| SSH per task-let | Unlimited | --max-turns on Claude CLI |
If an outer layer times out before the workflow finishes, check hab.so1.io/api/v1/executions/{id}?includeData=true for the actual result.
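When a request passes through several layers, the shortest timeout is the one that fires first. The Python sketch below works that out from the defaults in the table, under the assumption that a TOMMY-mode request started from the Rover UI traverses every listed layer; binding_layer and TOMMY_CHAIN are hypothetical names, and the unlimited SSH layer is left out because it cannot be the limiting one.

```python
def binding_layer(layers: dict[str, int]) -> tuple[str, int]:
    """Return the (name, seconds) of the layer with the shortest timeout."""
    name = min(layers, key=layers.get)
    return name, layers[name]


# Default timeouts in seconds, copied from the table above (tommy mode).
TOMMY_CHAIN = {
    "Rover UI": 600,
    "Vercel serverless": 300,
    "BFF proxy": 120,
    "n8n internal": 600,
}

print(binding_layer(TOMMY_CHAIN))  # ('BFF proxy', 120)
```

With the defaults, the BFF proxy gives up long before n8n does, so a caller can see a timeout while the workflow keeps running. Raising the per-request timeoutMs to 600 moves the limit to the Vercel layer at 300 seconds.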
No ready blocks found
FORGE returns NO_READY_BLOCKS when all blocks are Done, Blocked, or InProgress. Check:
- Are all blocks closed? (The epic may be complete)
- Are dependency edges (blocked_by) correct in the block issues?
- Is there a circular dependency? (Block A blocked by B, B blocked by A)
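The circular-dependency check in the last item can be automated with a depth-first search over the blocked_by edges. This is a minimal Python sketch that assumes the edges have already been collected into a dictionary mapping each block's issue number to the issue numbers it is blocked by; find_cycle is a hypothetical helper, not part of FORGE.

```python
from typing import Optional


def find_cycle(blocked_by: dict[int, list[int]]) -> Optional[list[int]]:
    """Return one dependency cycle as a list of issue numbers, or None."""
    UNSEEN, ACTIVE, FINISHED = 0, 1, 2
    state: dict[int, int] = {}
    path: list[int] = []

    def visit(node: int) -> Optional[list[int]]:
        state[node] = ACTIVE
        path.append(node)
        for dep in blocked_by.get(node, []):
            if state.get(dep, UNSEEN) == ACTIVE:
                # dep is already on the current path: the cycle closes here.
                return path[path.index(dep):] + [dep]
            if state.get(dep, UNSEEN) == UNSEEN:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        state[node] = FINISHED
        return None

    for node in blocked_by:
        if state.get(node, UNSEEN) == UNSEEN:
            found = visit(node)
            if found:
                return found
    return None
```

For the example in the list, find_cycle({1: [2], 2: [1]}) returns [1, 2, 1], which names both blocks involved so the offending edge can be removed by hand.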
TOMMY returns PARTIAL_FAILURE
Check the failures array in the response for per-task-let errors:
- SSH connection refused: Verify ssh_host is reachable and SSH key is configured in n8n
- Claude CLI not found: Ensure Claude Code is installed on the target server at the expected path
- project_path doesn’t exist: Verify the repo is cloned at the specified path on the SSH host
- Exit code 1: Claude CLI encountered an error; check the error field for the Claude output
Use retry_command to re-execute only the failed task-lets.
SplitInBatches infinite loop
If you see execution times growing unboundedly or node execution counts in the hundreds, check for options.reset: true on any SplitInBatches node. With reset: true, each loop-back item replaces the batch queue, causing the first item to reprocess forever. Set reset: false or remove the node from the execution path.
Companion Artifacts
| Artifact | Type | ID / Link |
|---|---|---|
| FORGE Workflow JSON | Local file | /FORGE Resolve-Next-Ticket.json (untracked) |
| TOMMY Workflow JSON | Local file | /TOMMY.json (untracked) |
| FORGE Skill File | Skill | .opencode/skills/FORGE.skill.md |
| TOMMY Skill File | Skill | .opencode/skills/TOMMY.skill.md |
| Resolve-next-ticket workflow gist | ctx:proto | a30390ef |
| Taskset Template Engine | Gist | c30ea47a |
| FORGE A2A Handbook | Gist | dc28f6be |
| OpenClaw + FORGE Integration | Docs | standalone/zoid/docs/system/OPENCLAW-FORGE-INTEGRATION.md |
Related Pages
- FORGE Stages — The 6-stage FORGE transformation pipeline (creation side)
- Autonomous Execution — Unattended execution model
- Archive System v2 — How session knowledge flows to repos after execution
- Veritas Integration — Prompt library backing FORGE templates