Claude Code Can Spawn Sub-Agents. Here's How the Coordinator Pattern Actually Works.

#ai #agents #architecture #programming

Most people think Claude Code is a single AI that reads your command and responds. It's not. Under the hood, it can spawn independent sub-Agents that work in parallel, each with its own context window, tools, and lifecycle.

I discovered this while reverse-engineering the v2.1.88 source code. The multi-Agent system is one of the most sophisticated parts of the architecture, and it solves problems that every AI agent builder will eventually face.

The Three Walls of a Single Agent

Before we get to the solution, let's understand the problem. A single Agent hits three walls when tasks get complex:

Wall 1: Context window ceiling. Even with the compression system I covered in my previous article, the context window is finite. When one Agent juggles research, coding, AND testing, all that intermediate output competes for limited attention.

Wall 2: Serial execution. One Agent can only do one thing at a time. "Research SSR approaches for React, Vue, and Svelte simultaneously" means three sequential investigations, multiplying wait time.

Wall 3: Cognitive interference. When an Agent handles multiple tasks in the same context, mental residue from task A degrades reasoning quality on task B. It's like getting pulled into a meeting mid-coding — you need time to "get back in the zone."

The Coordinator Pattern

Claude Code solves this with a Coordinator pattern. The main Agent can switch into Coordinator mode, where it stops doing work itself and starts directing others.

                 ┌──────────────┐
                 │  Coordinator │  ← Decomposes tasks, assigns work
                 │  (Main Agent) │
                 └──────┬───────┘
                        │
          ┌─────────────┼─────────────┐
          │             │             │
    ┌─────┴─────┐ ┌────┴─────┐ ┌────┴──────┐
    │  Worker A  │ │ Worker B │ │ Worker C  │
    │ (Research) │ │(Implement)│ │ (Verify)  │
    └───────────┘ └──────────┘ └───────────┘
          ↓             ↓             ↓
    Independent    Independent    Independent
      Context        Context        Context

Each Worker gets its own context window (no competition), can run in parallel (no queuing), and focuses on a single objective (no interference).
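To make the isolation concrete, here is a minimal sketch of Workers that each own a private context and hand back only a summary. The `WorkerAgent` class, `WorkerResult` type, and method names are hypothetical and chosen for illustration; they are not taken from Claude Code's source.

```typescript
// Hypothetical types for illustration; none of these names come from Claude Code.
interface WorkerResult {
  workerId: string;
  summary: string;
}

class WorkerAgent {
  readonly id: string;
  readonly objective: string;
  // Each Worker owns its context; nothing is shared with sibling Workers.
  private readonly context: string[] = [];

  constructor(id: string, objective: string) {
    this.id = id;
    this.objective = objective;
    this.context.push(`objective: ${objective}`);
  }

  // Intermediate output accumulates only in this Worker's context.
  record(note: string): void {
    this.context.push(note);
  }

  contextSize(): number {
    return this.context.length;
  }

  // Only a compact summary crosses back to the Coordinator.
  finish(summary: string): WorkerResult {
    return { workerId: this.id, summary };
  }
}

const researchWorker = new WorkerAgent("worker-a", "Research SSR approaches");
const implementWorker = new WorkerAgent("worker-b", "Implement the chosen approach");

researchWorker.record("read framework documentation");
researchWorker.record("compared hydration strategies");

const researchResult = researchWorker.finish("Found 3 viable approaches");
console.log(researchWorker.contextSize());  // 3
console.log(implementWorker.contextSize()); // 1
console.log(researchResult.summary);
```

The research notes never reach the implementation Worker's context. The Coordinator sees only the summary, which is what keeps its own window small.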

The Four-Phase Workflow

The Coordinator follows a strict four-phase protocol:

Phase 1: Research. Launch multiple Workers in parallel for independent investigation. Three Workers can simultaneously research three approaches without stepping on each other.

Phase 2: Synthesis. The Coordinator reads ALL research results, deeply understands them, and produces a specification. It can't be lazy and pass raw research to the implementer — that would create a telephone game of information loss.

Phase 3: Implementation. One Worker at a time. Multiple Workers modifying code concurrently would cause merge conflicts. This is a pragmatic trade-off: research parallelizes well, implementation doesn't.

Phase 4: Verification. A brand-new, independent Worker. Not a continuation of the implementation Worker. Like code review — a fresh pair of eyes catches what the author misses.
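The four phases can be sketched as a plan expressed as data. This is an illustration under assumptions: the `PlanStep` shape, the `planWorkflow` function, and the worker id scheme are all hypothetical, not Claude Code's actual representation.

```typescript
// Hypothetical plan shape for illustration; not taken from Claude Code's source.
type Phase = "research" | "synthesis" | "implementation" | "verification";

interface PlanStep {
  phase: Phase;
  actor: "coordinator" | "worker";
  // Worker ids launched together in this step (empty when the Coordinator acts).
  workerIds: string[];
}

function planWorkflow(topics: string[]): PlanStep[] {
  // Phase 1: one research Worker per topic, all launched in the same step.
  const researchers = topics.map((_topic, i) => `research-${i + 1}`);
  return [
    { phase: "research", actor: "worker", workerIds: researchers },
    // Phase 2: the Coordinator itself turns the findings into a specification.
    { phase: "synthesis", actor: "coordinator", workerIds: [] },
    // Phase 3: exactly one Worker edits code, so there are no concurrent writes.
    { phase: "implementation", actor: "worker", workerIds: ["implement-1"] },
    // Phase 4: a new Worker id, never the implementer continuing its own work.
    { phase: "verification", actor: "worker", workerIds: ["verify-1"] },
  ];
}

const workflowPlan = planWorkflow(["React", "Vue", "Svelte"]);
for (const step of workflowPlan) {
  console.log(step.phase, step.actor, step.workerIds.join(","));
}
```

Only the research step fans out. Synthesis has no Workers at all, and implementation and verification each get a single, distinct Worker.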

The Toolbox Constraint That Makes It Work

Here's the most elegant design decision. In Coordinator mode, the main Agent's available tools are:

const COORDINATOR_MODE_ALLOWED_TOOLS = new Set([
  'Agent',           // Spawn Workers
  'TaskStop',        // Stop a running Worker
  'SendMessage',     // Message a specific Worker
  'SyntheticOutput', // Structured output
])

Notice what's missing: Read, Write, Edit, Bash — all the hands-on tools. The Coordinator literally cannot read files, write code, or execute commands. It can only work through its Workers.

This is a forcing function. By removing the option to "just do it myself," the system ensures the Coordinator stays in its lane as a manager. Like a good engineering manager who doesn't write the code themselves.
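One way such a constraint could be enforced is to filter the tool list by mode and reject anything else at call time. The sketch below reuses the four tool names listed above. The `AgentMode` type, both function names, and the assumption that normal mode exposes every registered tool are mine, not confirmed details of Claude Code.

```typescript
// The four tool names come from the list above; everything else is hypothetical.
const COORDINATOR_TOOLS: ReadonlySet<string> = new Set([
  "Agent",
  "TaskStop",
  "SendMessage",
  "SyntheticOutput",
]);

type AgentMode = "normal" | "coordinator";

// Returns the tools an agent may call in the given mode.
// Assumption: normal mode exposes the full registry unchanged.
function availableTools(mode: AgentMode, allTools: string[]): string[] {
  if (mode === "normal") return allTools;
  return allTools.filter((tool) => COORDINATOR_TOOLS.has(tool));
}

// Rejects a call up front instead of trusting the model to stay in its lane.
function assertToolAllowed(mode: AgentMode, tool: string): void {
  if (mode === "coordinator" && !COORDINATOR_TOOLS.has(tool)) {
    throw new Error(`Tool "${tool}" is not available in coordinator mode`);
  }
}

const toolRegistry = [
  "Read", "Write", "Edit", "Bash",
  "Agent", "TaskStop", "SendMessage", "SyntheticOutput",
];

console.log(availableTools("coordinator", toolRegistry));
// [ 'Agent', 'TaskStop', 'SendMessage', 'SyntheticOutput' ]
```

Because the hands-on tools are never offered in coordinator mode, the model has nothing to reach for except delegation.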

The Iron Rules

The Coordinator's system prompt includes rules that prevent common LLM failure modes in multi-Agent scenarios:

Never thank a Worker. Completion notifications are system signals, not conversations. "Thanks for your research" wastes tokens.
Never predict a Worker's results. Before a Worker finishes, don't guess its output.
Never use one Worker to check on another. Workers send automatic completion notifications — no need to "check in."
Use SendMessage to continue, Agent to create. Don't keep spawning new Workers for the same task.

These seem obvious, but they address real failure modes. LLMs naturally want to be polite (wasting tokens on thank-yous), anticipate outcomes (hallucinating results), and create redundant work. The rules counteract these tendencies.
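The last rule, SendMessage to continue and Agent to create, amounts to a small routing decision. Here is a hedged sketch of it: the `ToolCall` shape, the `dispatch` function, and the idea of keying Workers by a task string are assumptions for illustration only.

```typescript
// Hypothetical shapes; only the tool names "Agent" and "SendMessage" come from the rules above.
interface ToolCall {
  tool: "Agent" | "SendMessage";
  taskKey: string;
  body: string;
}

// Tracks which tasks already have a live Worker.
const liveWorkerTasks = new Set<string>();

function dispatch(taskKey: string, body: string): ToolCall {
  if (liveWorkerTasks.has(taskKey)) {
    // Same task, existing Worker: continue the conversation instead of respawning.
    return { tool: "SendMessage", taskKey, body };
  }
  // New task: create a Worker and remember that it exists.
  liveWorkerTasks.add(taskKey);
  return { tool: "Agent", taskKey, body };
}

const firstCall = dispatch("ssr-research", "Research SSR approaches for React");
const followUp = dispatch("ssr-research", "Also cover streaming SSR");

console.log(firstCall.tool); // Agent
console.log(followUp.tool);  // SendMessage
```

A follow-up on the same task reuses the Worker that already holds the relevant context, so nothing has to be rediscovered.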

Three Execution Paths

When the Coordinator spawns a Worker, there are three paths:

Synchronous (default): The parent blocks and waits. Simple, for quick subtasks. But it has a clever progressive backgrounding design — after 2 seconds, it shows a "can be backgrounded" hint. After 2 minutes, it auto-backgrounds. No decision paralysis about whether a task will be quick or slow.
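The progressive backgrounding described here boils down to two thresholds on elapsed time. A minimal sketch, using the 2-second and 2-minute values from the text: the state names, constant names, and the choice to treat each threshold as inclusive are assumptions.

```typescript
// State and constant names are hypothetical; only the two durations come from the text.
type BackgroundingState = "blocking" | "hint-shown" | "auto-backgrounded";

const HINT_AFTER_MS = 2000;                 // 2 seconds
const AUTO_BACKGROUND_AFTER_MS = 2 * 60 * 1000; // 2 minutes

// Maps how long a synchronous Worker has been running to what the parent does.
// Assumption: each threshold is inclusive.
function backgroundingState(elapsedMs: number): BackgroundingState {
  if (elapsedMs >= AUTO_BACKGROUND_AFTER_MS) return "auto-backgrounded";
  if (elapsedMs >= HINT_AFTER_MS) return "hint-shown";
  return "blocking";
}

console.log(backgroundingState(500));     // blocking
console.log(backgroundingState(30000));   // hint-shown
console.log(backgroundingState(180000));  // auto-backgrounded
```

The caller never has to predict the duration up front. The task starts synchronous and is demoted only if it turns out to be slow.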

Asynchronous: Fire and forget. The Worker runs independently, and when done, sends a structured notification back to the Coordinator:

<task-notification>
  <task-id>a8f4x2p1</task-id>
  <status>completed</status>
  <summary>Research complete, found 3 viable approaches</summary>
  <usage>
    <total_tokens>15234</total_tokens>
    <duration_ms>45000</duration_ms>
  </usage>
</task-notification>

Remote: The Worker runs in the cloud, not locally. Solves resource constraints but adds latency. Always async.
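The task-notification format shown above for asynchronous Workers is easy to turn into a typed object on the receiving side. The sketch below is an assumption about how a consumer might do that, not how Claude Code does it. The `TaskNotification` interface and both functions are hypothetical, and the regex extraction is a shortcut that ignores XML entities and attributes.

```typescript
// Hypothetical consumer-side type; field names mirror the tags in the example notification.
interface TaskNotification {
  taskId: string;
  status: string;
  summary: string;
  totalTokens: number;
  durationMs: number;
}

// Extracts the text of the first <tag>...</tag> pair, or throws if it is absent.
// A regex is enough for this flat format; it is not a general XML parser.
function tagText(xml: string, tag: string): string {
  const match = new RegExp(`<${tag}>([\\s\\S]*?)</${tag}>`).exec(xml);
  const text = match === null ? undefined : match[1];
  if (text === undefined) throw new Error(`Missing <${tag}> in notification`);
  return text.trim();
}

function parseTaskNotification(xml: string): TaskNotification {
  return {
    taskId: tagText(xml, "task-id"),
    status: tagText(xml, "status"),
    summary: tagText(xml, "summary"),
    totalTokens: Number(tagText(xml, "total_tokens")),
    durationMs: Number(tagText(xml, "duration_ms")),
  };
}

const sampleNotification = `
<task-notification>
  <task-id>a8f4x2p1</task-id>
  <status>completed</status>
  <summary>Research complete, found 3 viable approaches</summary>
  <usage>
    <total_tokens>15234</total_tokens>
    <duration_ms>45000</duration_ms>
  </usage>
</task-notification>`;

const parsedNotification = parseTaskNotification(sampleNotification);
console.log(parsedNotification.taskId, parsedNotification.totalTokens);
// a8f4x2p1 15234
```

A structured notification like this lets the Coordinator react to completion and track token cost per Worker without any conversational back-and-forth.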

Why This Matters for Agent Builders

If you're building AI agents, the Coordinator pattern is worth studying closely:

Context isolation is a feature, not a bug. Giving each Worker its own window prevents cross-task interference.
Constrain the manager. Removing direct tools from the Coordinator forces proper delegation.
Research parallelizes, implementation doesn't. Don't try to parallelize everything.
Fresh eyes for verification. Never let the builder verify their own work.

These aren't Claude-specific patterns. They apply to any multi-Agent system.

These patterns come from my deep-dive into Claude Code's actual source code (v2.1.88). I wrote 12 chapters covering the complete architecture — from the core loop to multi-agent coordination.

📖 Read Chapter 1 free — "What Is an AI Agent? From ChatBot to Claude Code"

If you like it, the full book is available with 50% off for early readers:

📘 Claude Code from the Inside Out (English) — use code LAUNCH50 for $4.99
📕 深入浅出 Claude Code (中文) — use code LAUNCH50CN for $4.99

Previously in this series: