Context

The Branding Assistant pipeline relies on multiple specialized agents working in sequence to transform a YAML project definition into a complete brand kit. Rather than building a monolithic system, we decomposed the pipeline into discrete agents — each owning a single stage of the branding workflow. This document captures the operational learnings from designing, implementing, and running that multi-agent orchestration. The core challenge: how do you get five agents to cooperate reliably on a shared creative task without introducing tight coupling, losing context, or creating untraceable failures?

Agent Roles

Each agent in the pipeline has a single, well-defined responsibility. No agent knows how other agents accomplish their work — only what they receive and what they must produce.

Brand Analyst

Ingests the YAML project definition and produces the Brand Anchor — a structured, model-agnostic representation of brand meaning. Resolves personality traits, audience context, and visual direction into normalized semantics.
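To make the input concrete, here is a hypothetical project definition. The field names and values are illustrative assumptions, not the pipeline's actual schema:

```yaml
# Hypothetical project definition -- field names are illustrative,
# not the pipeline's actual input schema.
project: acme-launch
brand:
  personality: [bold, approachable, precise]
  audience: "early-stage SaaS founders"
  visual_direction:
    palette: warm-neutral
    mood: confident
outputs:
  - type: logo
    quality_tier: premium
  - type: social-banner
    quality_tier: standard
```

The Brand Analyst's job is to resolve a definition like this into normalized semantics before any prompt or model concerns enter the picture.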

Prompt Engineer

Consumes the Brand Anchor and selects, interpolates, and assembles prompt modules for each required output type. Produces model-ready prompts without knowledge of which model will execute them.

Model Router

Routes each prompt to the appropriate generative model based on output type, quality tier, cost constraints, and provider availability. Manages fallback chains and retry logic.

Quality Reviewer

Evaluates raw model outputs against the Brand Anchor. Checks color fidelity, stylistic consistency, accessibility compliance, and format correctness. Gates outputs before they proceed to packaging.

Output Formatter

Post-processes approved outputs into the final brand kit. Handles format conversion, asset optimization, metadata tagging, and kit assembly. Produces the deliverable artifact.

Handoff Protocol

Agents communicate via structured message passing. Every handoff follows the same contract:
1. Structured Envelope

Each inter-agent message is wrapped in a typed envelope containing: sender ID, receiver ID, timestamp, correlation ID (tracing the original request), and the payload conforming to the receiver’s input schema.
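A minimal sketch of such an envelope, assuming Python dataclasses (the field names follow the description above; the concrete types are assumptions):

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any
import uuid

@dataclass
class Envelope:
    """Typed wrapper for every inter-agent message."""
    sender: str              # agent ID of the producer
    receiver: str            # agent ID of the consumer
    payload: dict[str, Any]  # must conform to the receiver's input schema
    correlation_id: str      # traces back to the original request
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

# A request entering the pipeline mints one correlation ID; every
# subsequent envelope for that request carries the same ID.
request_id = str(uuid.uuid4())
msg = Envelope(
    sender="brand-analyst",
    receiver="prompt-engineer",
    payload={"anchor_version": 1},
    correlation_id=request_id,
)
```

Because every envelope shares the correlation ID, a single request can be traced across all five agents after the fact.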
2. Context Preservation

The Brand Anchor is attached to every message as immutable context. No agent modifies the Brand Anchor — it flows through the pipeline unchanged. This ensures all agents make decisions against the same brand semantics.
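One way to enforce that immutability at the type level is a frozen dataclass; this is a sketch with assumed field names, not the actual Brand Anchor schema:

```python
from dataclasses import dataclass, FrozenInstanceError

@dataclass(frozen=True)
class BrandAnchor:
    """Immutable brand semantics shared by every agent (fields are illustrative)."""
    personality: tuple[str, ...]   # tuples, not lists, so values are immutable too
    audience: str
    palette: tuple[str, ...]

anchor = BrandAnchor(
    personality=("bold", "approachable"),
    audience="early-stage SaaS founders",
    palette=("#1A1A1A", "#F5E9DA"),
)

# Any attempt to mutate the anchor fails loudly instead of silently
# drifting the pipeline's shared context.
try:
    anchor.audience = "enterprise buyers"
except FrozenInstanceError:
    mutation_blocked = True
```

Failing loudly on mutation is what makes the "same input, same Brand Anchor" guarantee discussed later enforceable rather than a convention.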
3. Schema Validation at Boundaries

Every receiving agent validates the incoming message against its expected input schema before processing. Malformed messages are rejected with structured error responses — they never cause silent failures downstream.
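A boundary check of this shape can be sketched as follows. A real pipeline would likely use a schema library (e.g. jsonschema or pydantic); this stdlib version only shows the reject-with-structured-errors pattern:

```python
# Required envelope fields and their expected types (illustrative).
REQUIRED_FIELDS = {"sender": str, "receiver": str, "correlation_id": str, "payload": dict}

def validate_envelope(message: dict) -> list[str]:
    """Return a list of violations; an empty list means the message passes the gate."""
    errors = []
    for name, expected_type in REQUIRED_FIELDS.items():
        if name not in message:
            errors.append(f"missing field: {name}")
        elif not isinstance(message[name], expected_type):
            errors.append(f"wrong type for {name}: expected {expected_type.__name__}")
    return errors

def receive(message: dict) -> dict:
    """Reject malformed messages with a structured error instead of processing them."""
    errors = validate_envelope(message)
    if errors:
        return {"status": "rejected", "errors": errors}
    return {"status": "accepted"}
```

The key property is that a malformed message produces an explicit rejection at the boundary, never a partially processed payload downstream.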
4. Acknowledgment & Receipt

Each agent acknowledges receipt before beginning work. The orchestrator tracks acknowledgments to detect stalled or unresponsive agents within timeout windows.

Error Handling

Failures in a multi-agent pipeline require clear escalation paths. Every agent follows the same error taxonomy:
| Error Class | Behavior | Example |
|---|---|---|
| Recoverable | Agent retries internally (up to 3 attempts with exponential backoff) | Model API timeout, transient network failure |
| Degraded | Agent produces partial output and flags it for review | Model returns lower-resolution image than requested |
| Fatal | Agent halts and escalates to the orchestrator with full error context | Schema validation failure, authentication error |
| Poisoned | Orchestrator quarantines the request and alerts operators | Repeated fatal errors on the same input across retries |
Retry semantics are agent-local — the orchestrator never retries on behalf of an agent. This keeps retry logic close to the failure domain (e.g., the Model Router understands rate limits, the Brand Analyst understands schema errors).
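The agent-local retry policy (3 attempts, exponential backoff) can be sketched as a small helper; the demo `flaky` operation and delay values are illustrative:

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

class RecoverableError(Exception):
    """Transient failure the agent may retry (e.g. a model API timeout)."""

def with_retries(op: Callable[[], T], attempts: int = 3, base_delay_s: float = 0.5) -> T:
    """Agent-local retry: up to `attempts` tries with exponential backoff."""
    for attempt in range(attempts):
        try:
            return op()
        except RecoverableError:
            if attempt == attempts - 1:
                raise  # retries exhausted: escalate to the orchestrator as fatal
            time.sleep(base_delay_s * (2 ** attempt))  # 0.5s, 1s, 2s, ...
    raise AssertionError("unreachable")

# Demo: an operation that fails twice, then succeeds.
calls = {"n": 0}

def flaky() -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise RecoverableError("timeout")
    return "ok"

result = with_retries(flaky, base_delay_s=0.0)  # succeeds on the third attempt
```

Because the helper lives inside the agent, each agent can tune `attempts` and the backoff to its own failure domain, exactly as the text prescribes.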

Fallback Strategies

  • Model Router: Maintains a ranked list of providers per output type. If the primary provider fails after retries, the router falls through to the next provider in the chain.
  • Quality Reviewer: If an output fails quality checks, the reviewer sends it back to the Prompt Engineer with structured feedback for prompt refinement — up to 2 revision cycles before escalating.
  • Output Formatter: If format conversion fails for a specific asset, the formatter packages remaining assets and flags the failed item — partial delivery over total failure.
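The Model Router's ranked-provider fallthrough can be sketched like this; provider names and the stand-in model call are assumptions:

```python
from typing import Callable

# Hypothetical ranked provider chain for one output type.
PROVIDERS = {"logo": ["provider-a", "provider-b", "provider-c"]}

class ProviderError(Exception):
    """Raised when a provider fails even after its local retries."""

def route(output_type: str, call: Callable[[str], str]) -> str:
    """Try each provider in ranked order; fall through on failure."""
    last_error = None
    for provider in PROVIDERS[output_type]:
        try:
            return call(provider)
        except ProviderError as exc:
            last_error = exc  # fall through to the next provider in the chain
    raise RuntimeError(f"all providers failed for {output_type}") from last_error

def call_provider(provider: str) -> str:
    # Stand-in for a real model call: the primary provider is unavailable.
    if provider == "provider-a":
        raise ProviderError("rate limited")
    return f"asset-from-{provider}"

result = route("logo", call_provider)  # falls through to provider-b
```

Only when the entire chain is exhausted does the failure escalate, which keeps provider outages invisible to upstream agents.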

Key Learnings

Single Responsibility

When an agent does one thing, it fails in one way. When it does three things, it fails in nine ways — and you can’t tell which one.
The strongest architectural decision was enforcing strict single responsibility at the agent level. Early prototypes combined the Brand Analyst and Prompt Engineer into a single “Brand Interpreter” agent. This created problems immediately:
  • Debugging prompt issues required understanding brand normalization logic
  • Testing required mocking both schema validation and prompt assembly
  • Performance bottlenecks in prompt generation blocked brand analysis for queued requests
Splitting them apart eliminated all three problems. Each agent has its own test suite, its own failure modes, and its own scaling characteristics.

Context Passing

The Brand Anchor is the universal context object — an immutable, structured representation of brand meaning that every agent can read but none can modify.
Early designs allowed agents to enrich the Brand Anchor as it moved through the pipeline. The Prompt Engineer would add “resolved prompt hints,” the Model Router would add “selected model metadata.” This seemed efficient but created two serious problems:
  1. Non-determinism: The same project definition could produce different Brand Anchors depending on which agents had processed it, making debugging nearly impossible.
  2. Coupling: Downstream agents started depending on enrichments from upstream agents, creating invisible dependencies across the pipeline.
Making the Brand Anchor immutable forced each agent to carry its own state separately. The pipeline became deterministic: same input, same Brand Anchor, same outputs.

Validation Gates

Every agent handoff passes through a validation gate. No output moves downstream without being checked against the expected schema and quality criteria.
Validation gates serve three purposes:
  1. Correctness: Malformed data is caught at the boundary, not three agents later when it causes a cryptic failure
  2. Traceability: When a defect is found in the final output, the validation logs pinpoint exactly which gate passed it through
  3. Confidence: Downstream agents can trust their inputs without defensive programming — the gate already verified them
The Quality Reviewer is the most critical gate. It evaluates model outputs against the Brand Anchor using both programmatic checks (color distance, format compliance) and heuristic scoring (style coherence, aesthetic consistency).
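As an illustration of the programmatic side, here is a minimal color-fidelity check. It uses plain Euclidean RGB distance with an assumed threshold; a production reviewer would more likely use a perceptual metric such as CIEDE2000:

```python
def hex_to_rgb(color: str) -> tuple[int, int, int]:
    """Parse a '#RRGGBB' hex string into an (r, g, b) tuple."""
    color = color.lstrip("#")
    return tuple(int(color[i:i + 2], 16) for i in range(0, 6, 2))

def color_distance(a: str, b: str) -> float:
    """Euclidean distance in RGB space -- a crude but cheap fidelity check."""
    ra, ga, ba = hex_to_rgb(a)
    rb, gb, bb = hex_to_rgb(b)
    return ((ra - rb) ** 2 + (ga - gb) ** 2 + (ba - bb) ** 2) ** 0.5

# Gate: reject an output whose dominant color drifts too far from the
# Brand Anchor palette. The threshold is an illustrative assumption.
MAX_DRIFT = 60.0

def passes_color_check(anchor_color: str, output_color: str) -> bool:
    return color_distance(anchor_color, output_color) <= MAX_DRIFT
```

Checks like this are what give the gate an objective pass/fail signal to log alongside the heuristic scores.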

Operational Outcomes

| Outcome | Detail |
|---|---|
| Reliable pipeline | Structured error handling and validation gates mean failures are caught early and reported clearly — no silent corruption |
| Traceable outputs | Every brand kit can be traced back through the full agent chain: which Brand Anchor was used, which prompts were generated, which models were called, what the reviewer scored |
| Auditable decisions | The immutable Brand Anchor plus logged inter-agent messages create a complete audit trail — critical for understanding why a specific brand kit looks the way it does |
| Independent scaling | The Model Router (the most resource-intensive agent) can be horizontally scaled without affecting upstream agents |
| Modular evolution | Adding new output types (motion, audio branding) requires adding a Prompt Module and potentially a new quality check — the orchestration layer remains unchanged |

Campaign Trails

  • Observability dashboard: Build a real-time view of agent health, handoff latency, and quality gate pass rates across the pipeline.
  • Agent versioning: Implement semantic versioning for agent contracts so agents can be upgraded independently without breaking the pipeline.
  • Parallel execution: The Prompt Engineer currently generates prompts sequentially. For brand kits with many output types, parallel prompt generation and model routing could significantly reduce end-to-end latency.
  • Human-in-the-loop: Add an optional manual review gate after the Quality Reviewer for high-stakes brand kits (e.g., product launches) where automated quality checks alone are insufficient.