
FORGE + TOMMY Autonomous Execution System

What Is This System?

FORGE and TOMMY are two n8n workflows that together form an autonomous code execution pipeline. FORGE is the strategic planner: it reads a GitHub epic, resolves which block is unblocked and ready, fetches every task’s acceptance criteria, and assembles a fully self-contained implementation prompt. TOMMY is the tactical executor: it receives tasks, atomises them into individual task-lets, SSHes into a build server, and runs Claude Code CLI per task-let. On success, FORGE auto-closes the GitHub issues with summary comments. The system runs on a single n8n instance (hab.so1.io) and targets any GitHub org/repo with the right epic structure.

TMLI5: The Factory and the Robot

Imagine a to-do list (the GitHub epic) with chapters (blocks) and pages (tasks). FORGE is the project manager who reads the to-do list, figures out which chapter isn’t blocked by anything else, and writes detailed instructions for each page. TOMMY is the robot that reads those instructions, walks over to the computer, and types the code. When everything works, the project manager crosses the items off the list.

Quick Reference

| Property | FORGE | TOMMY |
| --- | --- | --- |
| Workflow ID | mSJmBzpIcuCKz1WT | MasflGBKdowUZwpJ |
| Node count | 44 | 34 |
| Webhook path | /webhook/forge-resolve-next-ticket | /webhook/atomiser |
| n8n instance | hab.so1.io | hab.so1.io |
| Execution modes | manual, api, tommy | standalone, forge-delegated |
| LLM calls (orchestration) | 0 | 0–1 (Phase 2 only) |
| GitHub credential | FhglvXmXMeC2SQTl | tSg1ejqnd5r3clEg |
| Default timeout | 120s (manual/api), 600s (tommy) | 600s |

GitHub Issue Data Model

Technical Deep-Dive

FORGE operates on a three-tier GitHub issue hierarchy:
  • Epic — The top-level issue containing a markdown table of blocks
  • Block — A phase or milestone containing a markdown table of tasks
  • Task — An individual unit of work with acceptance criteria and file targets
Blocks have dependency edges: a block can be blocked_by other blocks. FORGE resolves these dependencies to find the next unblocked block. Two table formats are supported in epic and block bodies:
  • Format A — header columns Issue | Block | Title | Labels — used by Sparki
  • Format B — header columns Block | Issue | Phase | Domain | Title | Priority | Status — used by Traceo
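The dependency resolution described above can be sketched as a small function. This is an illustrative sketch, not the actual node code; the field names (state, blocked_by) are assumptions about the parsed block shape:

```javascript
// Sketch of FORGE's block-resolution rule: a block is Ready when its
// issue is open and every blocked_by dependency's issue is closed.
function resolveNextBlock(blocks) {
  const closed = new Set(
    blocks.filter(b => b.state === 'closed').map(b => b.issue_number)
  );
  const ready = blocks
    .filter(b => b.state === 'open')
    .filter(b => (b.blocked_by || []).every(dep => closed.has(dep)))
    .sort((a, b) => a.block_number - b.block_number);
  // Mirror the NO_READY_BLOCKS sentinel that propagates downstream
  return ready[0] || { error: 'NO_READY_BLOCKS' };
}

const blocks = [
  { issue_number: 23, block_number: 1, state: 'closed', blocked_by: [] },
  { issue_number: 24, block_number: 2, state: 'open', blocked_by: [23] },
  { issue_number: 25, block_number: 3, state: 'open', blocked_by: [24] },
];
console.log(resolveNextBlock(blocks).issue_number); // 24
```

With block 1 closed, block 2 becomes the first Ready block; block 3 stays Blocked until block 2 closes.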

TMLI5: The To-Do List

An Epic is a big project — like “Build the website”. It has Blocks which are chapters — “Set up the database”, “Build the API”, “Deploy”. Some chapters can’t start until earlier ones finish (that’s the blocked_by edge). Each block has Tasks — the individual pages you actually write code for.

FORGE: The Strategic Planner

How FORGE Works

FORGE is a 44-node n8n workflow that reads a GitHub epic and produces an implementation prompt. It has six logical phases:
  1. Configuration — Extract webhook parameters
  2. Epic Parsing — Fetch the epic issue, parse block table
  3. Block Resolution — Classify all blocks, select the next ready one
  4. Task Extraction — Parse the selected block for tasks, fetch each task issue
  5. Prompt Assembly — Build a self-contained implementation prompt
  6. Execution — Return prompt (manual), call Claude API (api), or delegate to TOMMY (tommy)

TMLI5: The Project Manager

FORGE is the project manager who:
  1. Opens the project board (fetches the epic)
  2. Reads all the chapters (parses blocks)
  3. Checks which chapters are done or blocked (classifies blocks)
  4. Picks the first unblocked chapter (resolves next block)
  5. Reads every task in that chapter (fetches task issues)
  6. Writes a detailed instruction sheet (assembles prompt)
  7. Either hands it to you, does it herself, or sends it to the robot

Key Code Nodes

Parse Epic Body — Handles two table formats by inspecting header columns. Only consumes tables with both “block” and “issue” columns to avoid pollution from unrelated tables (like A2A handoff maps) in the epic body. Extracts issue_number, block_number, title, labels, and blocked_by per block.

Classify Block Status — Vectorized with $input.all().map(). Receives all block issues from GitHub in one batch. Looks up the original block metadata from Split Blocks to Items via cross-node reference. Classifies each block as Ready, Blocked, InProgress, or Done based on issue state and dependency resolution.
// Vectorized classification — processes all blocks in one pass
const allOriginalBlocks = $('Split Blocks to Items').all().map(i => i.json);
return $input.all().map(inputItem => {
  const block = allOriginalBlocks.find(b => b.issue_number === inputItem.json.number) || {};
  const issueState = inputItem.json.state;
  // ... dependency resolution, status determination
  return { json: { issue_number, block_number, status, blocked_by, tasks, epic, config } };
});
Resolve & Select Next Block — Receives all classified blocks. Sorts by block number. Returns the first block with status Ready. If no blocks are ready, returns a NO_READY_BLOCKS error item that propagates downstream.

Assemble FORGE Prompt — Builds a ~5,000–6,500 character implementation prompt containing:
  • Epic context (title, number, description)
  • Block context (title, number, dependency status)
  • Board status (all blocks with their classifications)
  • Task list with acceptance criteria and file targets
  • Generated timestamp
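The assembly step is essentially a template fold over those five ingredients. A minimal sketch, with illustrative section headers (the real prompt layout is not specified here):

```javascript
// Hypothetical sketch of Assemble FORGE Prompt: concatenate epic context,
// block context, board status, and the task list into one self-contained prompt.
function assemblePrompt(epic, block, boardStatus, tasks) {
  const taskSection = tasks
    .map((t, i) =>
      `## Task ${i + 1}: ${t.title}\n` +
      `Files: ${t.file_targets.join(', ')}\n` +
      `Acceptance criteria: ${t.acceptance_criteria}`
    )
    .join('\n\n');
  return [
    `# FORGE Implementation Prompt`,
    `Epic: ${epic.title} (#${epic.number})`,
    `Block: ${block.title} (#${block.number})`,
    `Board: ${boardStatus.map(b => `${b.block_number}:${b.status}`).join(' ')}`,
    taskSection,
    `Generated: ${new Date().toISOString()}`,
  ].join('\n\n');
}
```

Because everything the executor needs is inlined, the resulting prompt is self-contained: TOMMY (or a human) never has to fetch GitHub state again.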

Set Configuration (10 Fields)

| Field | Source | Default |
| --- | --- | --- |
| gh_org | $json.body.gh_org | (required) |
| gh_repo | $json.body.gh_repo | (required) |
| epic_issue | $json.body.epic_issue_number | (required) |
| override_block | $json.body.block_issue_number | "" |
| exec_mode | $json.body.execution_mode | "manual" |
| target_repo | $json.body.target_repo_for_code | "" |
| api_base | (hardcoded) | https://api.github.com |
| claude_key | $json.body.claude_api_key | "" |
| ssh_host | $json.body.ssh_host | root@vps.devarno.cloud |
| project_path | $json.body.project_path | /root/projects/so1-platform |
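The table above implies a straightforward merge of required fields and defaults. A minimal sketch of that logic (the function name is illustrative; the actual node is an n8n Set/Code node):

```javascript
// Sketch of the Set Configuration step: the three required fields throw
// when absent, the rest fall back to the defaults in the table above.
function setConfiguration(body) {
  for (const field of ['gh_org', 'gh_repo', 'epic_issue_number']) {
    if (body[field] === undefined) {
      throw new Error(`Missing required field: ${field}`);
    }
  }
  return {
    gh_org: body.gh_org,
    gh_repo: body.gh_repo,
    epic_issue: body.epic_issue_number, // NOT epic_number (see Troubleshooting)
    override_block: body.block_issue_number || '',
    exec_mode: body.execution_mode || 'manual',
    target_repo: body.target_repo_for_code || '',
    api_base: 'https://api.github.com', // hardcoded
    claude_key: body.claude_api_key || '',
    ssh_host: body.ssh_host || 'root@vps.devarno.cloud',
    project_path: body.project_path || '/root/projects/so1-platform',
  };
}
```

Note that the source key is epic_issue_number; sending epic_number instead is the classic 404 failure mode described in Troubleshooting.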

TOMMY: The Tactical Executor

How TOMMY Works

TOMMY is a 34-node n8n workflow that decomposes work into atomic task-lets and executes each via SSH + Claude Code CLI. Six phases:
  1. INIT — Merge webhook input with defaults, check for FORGE shortcut
  2. INSPECT — Fetch repo tree, pipeline.yaml, README from GitHub API
  3. IDENTIFY — Call Claude to identify tasksets (or parse existing pipeline.yaml)
  4. ATOMISE — Convert tasksets into individual task-let prompts
  5. EXECUTE — SplitInBatches loop: SSH → Claude CLI per task-let
  6. CONFIRM — Aggregate results, return ALL_PASS or HAS_FAILURES
When FORGE provides forge_tasks, TOMMY skips directly from INIT to ATOMISE (Path C).

TMLI5: The Robot Arm

TOMMY is like a smart robot arm on a factory floor:
  1. It checks if someone already gave it a parts list (FORGE shortcut) — if yes, skip straight to building
  2. Otherwise, it inspects the raw materials (repo tree) and asks a consultant what to build (Claude)
  3. It splits the work into individual steps (atomise)
  4. For each step, it walks to the workbench (SSH), picks up the tool (Claude CLI), and builds the part
  5. At the end, it checks: did everything pass? If yes, report success. If no, list what broke.

Three Atomisation Paths

| Path | Trigger | LLM Tokens | Phases Executed |
| --- | --- | --- | --- |
| A | pipeline.yaml exists in repo | 0 | 1 → 3 → 4 → 5 |
| B | No pipeline, no FORGE tasks | ~1,100 | 1 → 2 → 3 → 4 → 5 |
| C | forge_tasks[] in webhook payload | 0 | 0 → 3 → 4 → 5 |
Path C is the FORGE+TOMMY sweet spot: FORGE already did the strategic thinking, so TOMMY uses zero orchestration tokens. The only LLM calls happen inside Claude Code CLI on the build server.
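The path choice reduces to two checks, in priority order. A minimal sketch of that decision (function and parameter names are illustrative):

```javascript
// Sketch of TOMMY's atomisation-path selection, per the table above.
// forge_tasks wins over everything; otherwise pipeline.yaml presence decides.
function selectPath(payload, repoHasPipelineYaml) {
  if (Array.isArray(payload.forge_tasks) && payload.forge_tasks.length > 0) {
    return 'C'; // FORGE-delegated: zero orchestration tokens
  }
  if (repoHasPipelineYaml) {
    return 'A'; // parse the existing pipeline.yaml, no LLM call
  }
  return 'B'; // ask Claude to identify tasksets (~1,100 tokens)
}
```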

SSH Execution Command

cd <project_path> && claude --print --output-format json --max-turns 3 \
  --dangerously-skip-permissions -p '<escaped_prompt>'
| Flag | Purpose |
| --- | --- |
| --print | Non-interactive, stdout output |
| --output-format json | Structured output for result parsing |
| --max-turns 3 | Cap reasoning depth (token budget control) |
| --dangerously-skip-permissions | Automated execution, no confirmation prompts |
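Since the prompt is passed inside single quotes, it must be escaped before interpolation. One plausible implementation of that escaping, using the standard close-quote/escaped-quote/reopen idiom (the helper names are assumptions, not the actual node code):

```javascript
// Escape a prompt for embedding in a single-quoted shell argument:
// each ' becomes '\'' (close quote, literal quote, reopen quote).
function escapeForSingleQuotes(prompt) {
  return prompt.replace(/'/g, `'\\''`);
}

// Assemble the SSH command line shown above from its parts.
function buildSshCommand(projectPath, prompt) {
  return `cd ${projectPath} && claude --print --output-format json ` +
    `--max-turns 3 --dangerously-skip-permissions ` +
    `-p '${escapeForSingleQuotes(prompt)}'`;
}
```

Without this step, any apostrophe in a task description would terminate the -p argument early and break the remote command.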

Response Schemas

Success (ALL_PASS)
{
  "status": "SUCCESS",
  "message": "Block complete. 3/3 task-lets executed successfully.",
  "is_dry_run": false,
  "token_budget": { "phase_2_identify": 0, "phase_4_execute": 4500, "total_estimated": 4500 },
  "run_id": "run_1711234567890_abc123",
  "tasklets": [{ "id": "TS-1", "title": "...", "status": "completed" }]
}
Failure (HAS_FAILURES)
{
  "status": "PARTIAL_FAILURE",
  "message": "Block incomplete. 2/3 passed, 1 failed.",
  "failures": [{ "id": "TS-2", "title": "...", "error": "...", "exit_code": 1 }],
  "retry_command": "curl -X POST .../webhook/atomiser -d '{\"taskset_filter\": [2]}'"
}
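The CONFIRM phase folds per-task-let results into one of these two envelopes. A simplified sketch (the real node also computes token_budget, run_id, and retry_command, omitted here):

```javascript
// Sketch of the CONFIRM aggregation: ALL_PASS when every task-let
// completed, HAS_FAILURES (PARTIAL_FAILURE status) otherwise.
function aggregateResults(tasklets) {
  const failures = tasklets.filter(t => t.status !== 'completed');
  const passed = tasklets.length - failures.length;
  if (failures.length === 0) {
    return {
      status: 'SUCCESS',
      message: `Block complete. ${passed}/${tasklets.length} task-lets executed successfully.`,
      tasklets,
    };
  }
  return {
    status: 'PARTIAL_FAILURE',
    message: `Block incomplete. ${passed}/${tasklets.length} passed, ${failures.length} failed.`,
    failures: failures.map(t => ({
      id: t.id, title: t.title, error: t.error, exit_code: t.exit_code,
    })),
  };
}
```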

FORGE to TOMMY Delegation Handshake

Technical Deep-Dive

When FORGE’s Execution Mode switch routes to the tommy output, the Build TOMMY Payload node constructs a webhook body and Trigger TOMMY sends a synchronous POST to /webhook/atomiser with a 600-second timeout.

Webhook Payload Contract

FORGE sends this payload to TOMMY:
{
  "repo": "traceo-ai/traceo-mcp-server",
  "branch": "main",
  "project_path": "/root/projects/so1-platform",
  "ssh_host": "root@vps.devarno.cloud",
  "dry_run": false,
  "max_concurrency": 1,
  "delay_between_ms": 5000,
  "forge_tasks": [
    {
      "title": "Implement health check endpoint",
      "description": "Add GET /health that returns 200 with version...",
      "issue_number": 42,
      "html_url": "https://github.com/traceo-ai/traceo-mcp-server/issues/42",
      "file_targets": ["src/routes/health.ts", "src/index.ts"],
      "acceptance_criteria": "GET /health returns 200 with {status, version}"
    }
  ],
  "forge_context": {
    "epic_title": "Production Readiness",
    "epic_number": 22,
    "block_title": "BLOCK 5: Deployment & Go-Live",
    "block_number": 27
  }
}
TOMMY recognizes the forge_tasks array and takes Path C — skipping Phases 1-2 entirely.

TMLI5: The Letter

FORGE writes a letter to TOMMY:
“Dear TOMMY, here’s the repo and the SSH address. Here are 3 tasks to do — each one has a title, description, and which files to change. The project is about Production Readiness, and we’re working on the Deployment chapter. Please do the tasks and write back.”
TOMMY reads the letter, does the tasks one by one, and writes back either “All done!” or “2 out of 3 worked, here’s what broke.”

Three Execution Modes

| Mode | What Happens | Human Role | Timeout | Issue Closure |
| --- | --- | --- | --- | --- |
| manual | Prompt returned via webhook response | Copy-paste into Claude | 120s | None |
| api | Claude API called from n8n directly | Review after completion | 120s | Auto (on success) |
| tommy | TOMMY executes via SSH + Claude CLI | Monitor; intervene on failure | 600s | Auto (on ALL_PASS) |

Timeout Chain

Rover UI (600s) → Vercel Proxy (300s maxDuration) → BFF (configurable) → n8n Webhook (600s) → TOMMY SSH (per task-let)
The weakest link is the Vercel serverless function limit at 300s (Pro plan). For tommy-mode executions exceeding 5 minutes, the Rover UI will time out — but the n8n execution continues regardless. Check execution status via hab.so1.io/api/v1/executions/{id}.
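A client that hits the Vercel timeout can recover by polling the n8n executions endpoint until the run finishes. A hedged sketch, assuming the standard n8n public API key header; the endpoint shape follows the URL in the text:

```javascript
// Build the status URL for an n8n execution (endpoint from the text above).
function executionUrl(id) {
  return `https://hab.so1.io/api/v1/executions/${id}?includeData=true`;
}

// Poll until the execution reports finished. Assumes the n8n public API,
// authenticated via the X-N8N-API-KEY header.
async function waitForExecution(id, apiKey, intervalMs = 5000) {
  for (;;) {
    const res = await fetch(executionUrl(id), {
      headers: { 'X-N8N-API-KEY': apiKey },
    });
    const exec = await res.json();
    if (exec.finished) return exec; // completed runs carry finished: true
    await new Promise(r => setTimeout(r, intervalMs));
  }
}
```

This is why a tommy-mode UI timeout is cosmetic: the execution keeps running server-side, and its result remains retrievable.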

TMLI5: Three Ways to Cook

  • Manual: “Here’s the recipe card. You cook it.”
  • API: “I’ll cook it right now in my kitchen, and clean up after.”
  • TOMMY: “I’ll send the recipe to the robot chef. If the robot says ‘done’, I’ll clean up. If it says ‘burnt the soufflé’, I’ll tell you what went wrong.”

Combined Lifecycle: Epic to Issue Closure

The key concept: once a block is closed, triggering FORGE again selects the next unblocked block. This loop can run unattended — each invocation is stateless and reads the current board state.

Data Shapes at Each Step

| Step | Shape | Example |
| --- | --- | --- |
| After Parse Epic Body | {blocks: [{issue_num, block_number, title}]} | 22 blocks |
| After Classify Block Status | [{issue_number, status, blocked_by}] | 22 classified |
| After Resolve & Select | {selected_block: {issue_number, title}} | 1 block |
| After Aggregate Tasks | {tasks: [{title, file_targets, criteria}]} | 1–5 tasks |
| After Assemble Prompt | {prompt: "# FORGE...", block, task_count} | ~5,200 chars |
| TOMMY Response | {status: "SUCCESS", tasklets: [...]} | Per task-let |

TMLI5: The Full Journey

  1. Read the project board (22 chapters listed)
  2. Ask GitHub about each chapter (22 API calls)
  3. Figure out which ones are done, blocked, or ready
  4. Pick the first ready chapter
  5. Read all the tasks in that chapter
  6. Write a detailed instruction sheet
  7. Hand it to the robot → robot SSHes in → runs Claude → reports back
  8. If everything passed: cross off the tasks and the chapter
  9. Trigger again for the next chapter

Optimizations Applied

Vectorization: SplitInBatches Removal

The original workflow used SplitInBatches feedback loops to process blocks one at a time. This caused two problems:
  1. options.reset: true infinite loop — Each loop-back item replaced the batch queue, causing block[0] to process forever
  2. $('NodeName').all() limitation — Inside a SplitInBatches loop, cross-node references only return the current iteration’s items, not accumulated results
Fix: Removed SplitInBatches from the execution path. n8n natively passes all items through connected nodes:
Split Blocks to Items (22 items)
  → Fetch Block Issue (22 HTTP calls, automatic)
    → Classify Block Status ($input.all().map() — processes all 22)
      → Resolve & Select Next Block (receives all 22)
The Iterate Blocks and Iterate Tasks SplitInBatches nodes still exist in the workflow JSON (set to v3 with empty options) but are disconnected from the execution path. They serve as visual markers in the n8n editor.

Field Name Fix

Root cause of executions 125–129 failing: Set Configuration reads $json.body.epic_issue_number but test payloads sent epic_number. The GitHub URL became .../issues/NaN → 404.

Duplicate Node Cleanup

FORGE historically contained 77 nodes — 33 were WIP backup copies positioned at negative x-coordinates. Stripped to 44 clean nodes before deployment.

Table Format Discrimination

Parse Epic Body now inspects table header columns before consuming a table. Only tables with both “block” and “issue” columns are processed. This prevents pollution from unrelated tables in epic bodies.
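The header check is simple column-name matching. A minimal sketch of the discrimination rule (the function name is illustrative):

```javascript
// Consume a markdown table only if its header row contains both a
// "block" and an "issue" column; skip everything else (e.g. A2A maps).
function isBlockTable(headerRow) {
  const cols = headerRow
    .split('|')
    .map(c => c.trim().toLowerCase())
    .filter(Boolean);
  return cols.includes('block') && cols.includes('issue');
}

isBlockTable('| Issue | Block | Title | Labels |'); // Format A: consumed
isBlockTable('| Agent | Handoff | Target |');       // unrelated table: skipped
```

Both Format A and Format B headers pass this check, so a single predicate gates the parser for either format.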

Segmentation & Refactoring Opportunities

1. Parse Epic Body → Sub-Workflow

Current state: A single 7,347-character Code node handling two table formats (Sparki + Traceo) with inline regex, column detection, and markdown stripping. Proposed: Extract into an n8n sub-workflow with three nodes:
  • Detect Format — Inspect headers, emit format identifier
  • Parse Table — Format-specific parser (one per format)
  • Normalize — Output consistent {issue_number, block_number, title, blocked_by} array
Benefit: New table formats can be added without modifying the monolithic parser.

2. Duplicated Issue Closure Chain

Current state: Both API mode (nodes 22–28) and TOMMY mode (nodes 37–43) have near-identical closure logic: Comment on Task → Close Task → Comment on Block → Close Block → Return Result. Proposed: Extract into a shared “Close Issues” sub-workflow. Both modes feed into it after their respective success checks. Benefit: Bug fixes to closure logic only need to happen once. Currently a fix to the API closure chain must be manually replicated to the TOMMY chain.

3. TOMMY Atomise Node → Separate Code Nodes

Current state: A single 5,708-character Code node with three internal paths (A: pipeline.yaml, B: LLM-identified, C: FORGE-delegated). Proposed: Split into three Code nodes behind a Switch:
  • Atomise from Pipeline (Path A)
  • Atomise from LLM (Path B)
  • Atomise from FORGE (Path C)
Benefit: Each path can be tested and modified independently.

4. Configurable Table Format Detection

Current state: Format A (Sparki) and Format B (Traceo) are hardcoded in the parser. Proposed: Accept a table_format hint via webhook payload or epic labels. Fall back to auto-detection if not provided.

5. Shared Error Handling

Current state: Error propagation uses a NO_READY_BLOCKS or NO_TASKS_FOUND sentinel item that flows downstream. Each Code node must check for it. Proposed: Extract into a shared error-handler sub-workflow with a standard error envelope. Use n8n’s native error trigger for crash recovery.

Expansion & Scalability Opportunities

1. Multi-Repo Execution

FORGE already accepts target_repo_for_code in the webhook payload. The intended use case: issues tracked in repo A, code executed in repo B. Currently untested end-to-end but the plumbing exists.

2. Parallel Block Execution

Currently FORGE processes one block per invocation. Independent blocks (no blocked_by edges between them) could be executed in parallel:
  • FORGE identifies all ready blocks instead of just the first
  • Spawns a TOMMY instance per block
  • Aggregates results before closure
Requires either n8n’s parallel execution support or an external scheduler.

3. Pipeline.yaml as Persistent Execution State

TOMMY Path A reads pipeline.yaml but never writes back. Completed task-lets should update the pipeline state, enabling:
  • Idempotent re-runs — Skip already-completed task-lets
  • Progress tracking — See what’s done without checking GitHub
  • Resume after failure — Re-trigger and only execute remaining tasks
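The write-back idea can be sketched with two small helpers. The pipeline shape here is an assumption (a tasklets array with per-item status), not the actual pipeline.yaml schema:

```javascript
// Sketch of pipeline.yaml as persistent state: filter out completed
// task-lets on re-run, and record completion after each success.
function remainingTasklets(pipeline) {
  return pipeline.tasklets.filter(t => t.status !== 'completed');
}

// Return a new pipeline object with one task-let marked completed
// (immutable update, so the previous state stays intact).
function markCompleted(pipeline, taskletId) {
  return {
    ...pipeline,
    tasklets: pipeline.tasklets.map(t =>
      t.id === taskletId ? { ...t, status: 'completed' } : t
    ),
  };
}
```

Re-triggering TOMMY would then execute remainingTasklets(pipeline) instead of the full list, making re-runs idempotent.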

4. ZOID Council Integration

The planned 8-agent orchestration system adds human-in-the-loop gates at strategic points. Three ZOID agents participate:
  • Atlas (Chief of Staff) — Decides block priority and execution order
  • Anvil (CodeSmith) — Reviews code-level decisions at the block gate
  • Sentinel (Ops) — Validates deployment safety at the closure gate
Communication via Discord emoji reactions — no typing required. Aligns with ADHD-optimised interaction patterns.

5. Cost Tracking & Budget Enforcement

TOMMY already tracks token_budget in its response. Expansion:
  • Set per-block and per-epic budget limits in the webhook payload
  • TOMMY enforces budget mid-execution: if tokens exceed limit, stop and report
  • Integration with Anthropic usage API for accurate spend tracking
  • Aggregate cost reporting across blocks and epics
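Mid-execution enforcement amounts to a budget check before each task-let. A hedged sketch of what that loop could look like (names and the BUDGET_EXCEEDED status are assumptions, not existing TOMMY fields):

```javascript
// Sketch of per-block budget enforcement: stop before any task-let
// that would run after the token limit is already exceeded.
function enforceBudget(tasklets, runTasklet, budgetLimit) {
  let spent = 0;
  const executed = [];
  for (const t of tasklets) {
    if (spent >= budgetLimit) {
      return { status: 'BUDGET_EXCEEDED', spent, executed, stoppedAt: t.id };
    }
    const result = runTasklet(t); // assumed to return { tokens, ... }
    spent += result.tokens;
    executed.push(t.id);
  }
  return { status: 'WITHIN_BUDGET', spent, executed };
}
```

A BUDGET_EXCEEDED result would report exactly which task-lets ran, so the existing retry_command pattern could resume from the stopping point.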

6. Observability Dashboard

n8n execution history provides raw data. Integration with Grafana/Prometheus:
  • Execution duration per phase
  • Success/failure rate per repo
  • Token spend per block
  • SSH command latency distribution
  • GitHub API call volume

7. A2A (Agent-to-Agent) Protocol

The current FORGE→TOMMY handshake uses a custom webhook JSON contract. Adopting a formal A2A protocol would enable:
  • Third-party agents to participate in the pipeline
  • Standard capability negotiation
  • Structured task delegation with SLA contracts
  • Cross-platform interoperability

The Autonomy Ladder

| Level | Description | Human Role | Status |
| --- | --- | --- | --- |
| L1 | Manual mode — prompt returned | Write the code | Live |
| L2 | API mode — Claude API executes | Review results | Live |
| L3 | TOMMY mode — SSH + CLI executes | Monitor failures | Live |
| L4 | ZOID gates — Discord emoji approval | Approve at gates | Planned |
| L5 | Full loop — re-trigger on success | Sleep | Future |

Trigger Interface

Via Rover UI

The Rover console at rover.so1.io provides dedicated forms:
  • ForgeForm — auto-detected when webhook path contains “forge”
  • TommyForm — auto-detected when webhook path contains “atomiser”
FORGE fields: GitHub Org, Repository, Epic Issue #, Block Override (optional), Execution Mode (manual/api/tommy), SSH Host (conditional), Project Path (conditional).

Via curl

Manual mode (get the prompt back):
curl -X POST https://hab.so1.io/webhook/forge-resolve-next-ticket \
  -H "Content-Type: application/json" \
  -d '{
    "gh_org": "traceo-ai",
    "gh_repo": "traceo-mcp-server",
    "epic_issue_number": 22,
    "execution_mode": "manual"
  }'
TOMMY mode (full autonomous execution):
curl -X POST https://hab.so1.io/webhook/forge-resolve-next-ticket \
  -H "Content-Type: application/json" \
  --max-time 600 \
  -d '{
    "gh_org": "traceo-ai",
    "gh_repo": "traceo-mcp-server",
    "epic_issue_number": 22,
    "execution_mode": "tommy",
    "ssh_host": "root@vps.devarno.cloud",
    "project_path": "/root/projects/traceo-mcp-server"
  }'
TOMMY standalone (direct, skip FORGE):
curl -X POST https://hab.so1.io/webhook/atomiser \
  -H "Content-Type: application/json" \
  -d '{
    "repo": "traceo-ai/traceo-mcp-server",
    "branch": "main",
    "dry_run": true,
    "project_path": "/root/projects/traceo-mcp-server",
    "ssh_host": "root@vps.devarno.cloud"
  }'

Troubleshooting

Fetch Epic Issue returns 404

Most common cause: Field name mismatch. The webhook payload must use epic_issue_number (not epic_number, not epic_issue). Check Set Configuration outputs in the execution trace — if epic_issue is NaN or undefined, the field name is wrong.
# Verify the field name:
curl -s -X POST https://hab.so1.io/webhook/forge-resolve-next-ticket \
  -H "Content-Type: application/json" \
  -d '{"gh_org":"traceo-ai","gh_repo":"traceo-mcp-server","epic_issue_number":22,"execution_mode":"manual"}'

Credentials detach after API update

n8n’s PUT API strips credential bindings from all nodes. After any programmatic workflow update:
  1. Open the workflow in n8n editor at hab.so1.io
  2. Re-select the credential on every HTTP Request and SSH node
  3. Save the workflow via the editor (not the API)
See n8n Credential Rebinding Protocol for the full procedure.

TOMMY timeout

The timeout chain has multiple layers:
| Layer | Default | Configurable? |
| --- | --- | --- |
| Rover UI | 600s (tommy), 120s (manual/api) | In ForgeForm component |
| Vercel serverless | 300s (maxDuration) | In route.ts |
| BFF proxy | 120s default | Per-request timeoutMs |
| n8n internal | 600s | Workflow settings |
| SSH per task-let | Unlimited | --max-turns on Claude CLI |
If the Vercel layer times out, the n8n execution continues. Check hab.so1.io/api/v1/executions/{id}?includeData=true for the actual result.

No ready blocks found

FORGE returns NO_READY_BLOCKS when all blocks are Done, Blocked, or InProgress. Check:
  • Are all blocks closed? (The epic may be complete)
  • Are dependency edges (blocked_by) correct in the block issues?
  • Is there a circular dependency? (Block A blocked by B, B blocked by A)
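Circular dependencies can be detected mechanically before triggering FORGE. A sketch using depth-first search over the blocked_by edges (helper name and block shape are illustrative):

```javascript
// Detect a cycle in the blocked_by graph via DFS with three node states:
// unvisited, in-stack (1), and done (2). A back-edge to an in-stack
// node means a cycle (e.g. Block A blocked by B, B blocked by A).
function findCycle(blocks) {
  const edges = new Map(blocks.map(b => [b.issue_number, b.blocked_by || []]));
  const state = new Map();
  const visit = n => {
    if (state.get(n) === 1) return true;               // back-edge: cycle
    if (state.get(n) === 2 || !edges.has(n)) return false;
    state.set(n, 1);
    const hit = edges.get(n).some(visit);
    state.set(n, 2);
    return hit;
  };
  return blocks.some(b => visit(b.issue_number));
}
```

Running this check on the parsed block list would turn a silent NO_READY_BLOCKS dead-end into an explicit "circular dependency" diagnosis.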

TOMMY returns PARTIAL_FAILURE

Check the failures array in the response for per-task-let errors:
  • SSH connection refused: Verify ssh_host is reachable and SSH key is configured in n8n
  • Claude CLI not found: Ensure Claude Code is installed on the target server at the expected path
  • project_path doesn’t exist: Verify the repo is cloned at the specified path on the SSH host
  • Exit code 1: Claude CLI encountered an error — check the error field for the Claude output
Use the provided retry_command to re-execute only the failed task-lets.

SplitInBatches infinite loop

If you see execution times growing unboundedly or node execution counts in the hundreds, check for options.reset: true on any SplitInBatches node. With reset: true, each loop-back item replaces the batch queue — causing the first item to reprocess forever. Set reset: false or remove from the execution path.

Companion Artifacts

| Artifact | Type | ID / Link |
| --- | --- | --- |
| FORGE Workflow JSON | Local file | /FORGE Resolve-Next-Ticket.json (untracked) |
| TOMMY Workflow JSON | Local file | /TOMMY.json (untracked) |
| FORGE Skill File | Skill | .opencode/skills/FORGE.skill.md |
| TOMMY Skill File | Skill | .opencode/skills/TOMMY.skill.md |
| Resolve-next-ticket workflow gist | ctx:proto | a30390ef |
| Taskset Template Engine | Gist | c30ea47a |
| FORGE A2A Handbook | Gist | dc28f6be |
| OpenClaw + FORGE Integration | Docs | standalone/zoid/docs/system/OPENCLAW-FORGE-INTEGRATION.md |