Anthropic's Claude Code has 58 tools, but the one that matters most is the one that spawns copies of itself.
On March 31, the full source leaked via npm source maps. I spent the last two days reading the multi-agent architecture. Here is what I found.
AgentTool: The Tool That Spawns Agents
Every subagent in Claude Code is created through a single tool. The input schema tells you everything about how Anthropic thinks about agent orchestration:
const baseInputSchema = z.object({
  description: z.string().describe('A short (3-5 word) description'),
  prompt: z.string().describe('The task for the agent to perform'),
  subagent_type: z.string().optional(),
  model: z.enum(['sonnet', 'opus', 'haiku']).optional(),
  run_in_background: z.boolean().optional(),
})
The parent agent picks the model tier per task. Search gets Haiku. Complex reasoning gets Opus. Everything else gets Sonnet. This is not automatic routing — the parent makes an explicit choice every time it spawns a child.
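A minimal sketch of what that explicit choice could look like from the parent's side; the task categories and the `pickModel` helper are illustrative stand-ins, not names from the leaked source:

```typescript
// Hypothetical routing helper: the parent classifies the task itself,
// then picks a model tier at spawn time. Nothing here is automatic.
type ModelTier = 'haiku' | 'sonnet' | 'opus';
type TaskKind = 'search' | 'complex-reasoning' | 'general';

function pickModel(kind: TaskKind): ModelTier {
  switch (kind) {
    case 'search':
      return 'haiku';            // cheap, fast lookups
    case 'complex-reasoning':
      return 'opus';             // maximum quality, highest cost
    default:
      return 'sonnet';           // balanced default for everything else
  }
}
```

The value returned here is exactly what would fill the `model` field of the spawn schema above.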
One-Shot vs Persistent Agents
The source defines two categories:
export const ONE_SHOT_BUILTIN_AGENT_TYPES: ReadonlySet<string> = new Set([
  'Explore',
  'Plan',
])
One-shot agents run a task and return a report. The parent never sends follow-up messages. This saves tokens — no agent ID, no SendMessage trailer, no usage block. At 34 million Explore runs per week, those 135 characters per run add up.
Every other agent type is persistent. The parent can continue the conversation using SendMessage with the agent's ID. This is how Claude Code runs parallel research tasks while you wait.
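The split can be sketched like this. `spawnSubagent` and its return shape are my own stand-ins for illustration; only the `ONE_SHOT_BUILTIN_AGENT_TYPES` set comes from the source:

```typescript
import { randomUUID } from 'node:crypto';

const ONE_SHOT_BUILTIN_AGENT_TYPES: ReadonlySet<string> = new Set(['Explore', 'Plan']);

interface SpawnResult {
  report: string;
  agentId?: string; // absent for one-shot agents: the parent cannot follow up
}

// Hypothetical spawn wrapper showing the one-shot/persistent branch.
function spawnSubagent(subagentType: string, prompt: string): SpawnResult {
  const report = `[${subagentType}] report for: ${prompt}`; // stand-in for real agent work
  if (ONE_SHOT_BUILTIN_AGENT_TYPES.has(subagentType)) {
    return { report }; // no agent ID, no SendMessage trailer: the cheaper path
  }
  return { report, agentId: randomUUID() }; // persistent: parent can SendMessage later
}
```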
Team Spawning: tmux Panes, Not API Calls
The most surprising discovery: teammates are not spawned via API. They are spawned as separate Claude Code processes in tmux panes.
async function handleSpawnSplitPane(input, context) {
  const { name, team_name: teamName } = input
  const model = resolveTeammateModel(input.model, getAppState().mainLoopModel)
  const uniqueName = await generateUniqueTeammateName(name, teamName)
  const { paneId } = await createTeammatePaneInSwarmView(...)
  const spawnCommand = `cd ${workingDir} && env ${envStr} ${binaryPath} ${args}`
  await sendCommandToPane(paneId, spawnCommand, ...)
  // Communication via filesystem mailbox
  await writeToMailbox(sanitizedName, { from: 'TEAM_LEAD', text: prompt }, teamName)
}
Each teammate gets its own tmux pane, its own process, its own context window. Communication happens through a filesystem-based mailbox — not shared memory, not API calls. The team lead writes a message to ~/.claude/teams/{team}/mailbox/{agent}.json. The teammate reads it on its next loop iteration.
This is the simplest possible multi-agent communication protocol. No message broker. No WebSocket. No shared state. Just files on disk.
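A minimal sketch of such a mailbox, assuming the file layout described above; the helper names, the `root` parameter, and the message shape are inferred from the article, not copied from the leaked `writeToMailbox`:

```typescript
import * as fs from 'node:fs';
import * as path from 'node:path';

interface MailboxMessage {
  from: string;
  text: string;
}

// Hypothetical path helper matching the described layout:
// {root}/teams/{team}/mailbox/{agent}.json
const mailboxPath = (root: string, team: string, agent: string) =>
  path.join(root, 'teams', team, 'mailbox', `${agent}.json`);

function writeToMailbox(root: string, team: string, agent: string, msg: MailboxMessage): void {
  const file = mailboxPath(root, team, agent);
  fs.mkdirSync(path.dirname(file), { recursive: true });
  // Plain JSON on disk: you can cat it, tail it, and debug it with no broker.
  fs.writeFileSync(file, JSON.stringify(msg, null, 2));
}

function readMailbox(root: string, team: string, agent: string): MailboxMessage | null {
  const file = mailboxPath(root, team, agent);
  if (!fs.existsSync(file)) return null; // nothing waiting this loop iteration
  return JSON.parse(fs.readFileSync(file, 'utf8')) as MailboxMessage;
}
```

To match the layout in the article, `root` would be something like `~/.claude`; the teammate polls `readMailbox` on each loop iteration.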
KAIROS: The Autonomous Daemon
Behind a feature flag called KAIROS, there is an unreleased autonomous mode. The agent runs as a persistent daemon that:
- Monitors GitHub webhooks for new issues and PRs
- Reads a channel-based task queue
- Executes tasks without human prompting
- Reports results back through the same mailbox system
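A rough sketch of what one iteration of such a daemon loop might do; the article does not show the KAIROS implementation, so every name here (`Task`, `pollOnce`, the handler map) is hypothetical:

```typescript
// Hypothetical single tick of an autonomous daemon: drain one task from a
// channel-based queue, execute it without human prompting, report the result.
interface Task {
  channel: string;  // e.g. a GitHub webhook channel
  payload: string;
}

type Handler = (t: Task) => string;

function pollOnce(queue: Task[], handlers: Map<string, Handler>, mailbox: string[]): void {
  const task = queue.shift();              // read the channel-based task queue
  if (!task) return;                       // nothing to do this tick
  const handler = handlers.get(task.channel);
  if (!handler) return;                    // no handler registered for this channel
  const result = handler(task);            // execute the task autonomously
  mailbox.push(result);                    // report back through the mailbox system
}
```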
const fullInputSchema = baseInputSchema.merge(z.object({
  name: z.string().optional(),
  team_name: z.string().optional(),
  mode: permissionModeSchema().optional(),
  isolation: z.enum(['worktree', 'remote']).optional(), // KAIROS feature
  cwd: z.string().optional(), // KAIROS feature
}))
export const inputSchema = feature('KAIROS') ? fullInputSchema : fullInputSchema.omit({ cwd: true })
When KAIROS is enabled, agents can specify their own working directory and run in isolated git worktrees. Without it, those fields are stripped from the schema entirely — the model never sees them.
44 Feature Flags Control Everything
The entire system is gated behind feature flags. I counted 44 in the buildable fork. Here are ten of the most interesting:
- KAIROS: autonomous daemon mode
- PROACTIVE: agent initiates without prompting
- COORDINATOR_MODE: multi-agent swarm orchestration
- BUDDY: Tamagotchi companion system
- VOICE_MODE: voice interaction
- BRIDGE_MODE: IDE integration with JWT auth
- CHICAGO_MCP: Computer Use (screen control)
- ULTRAPLAN: enhanced planning mode
- TEAMMEM: team memory sharing
- EXTRACT_MEMORIES: automatic memory extraction
Each flag is checked with a feature() function that conditionally includes code, schemas, and even entire tool definitions. Dead code elimination means if a flag is off, the model literally cannot see or call the gated functionality.
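A sketch of that pattern, assuming a simple `feature()` lookup; the flag store and the tool names here are invented for illustration and are not the real registration code:

```typescript
// Illustrative flag store: the real system reads flags from configuration.
const ENABLED_FLAGS = new Set<string>(['TEAMMEM']);

function feature(name: string): boolean {
  return ENABLED_FLAGS.has(name);
}

interface ToolDef {
  name: string;
}

// Tools behind an off flag are never registered at all, so the model
// cannot see them, let alone call them.
const tools: ToolDef[] = [
  { name: 'AgentTool' },
  ...(feature('KAIROS') ? [{ name: 'IsolationTool' }] : []),
  ...(feature('TEAMMEM') ? [{ name: 'TeamMemoryTool' }] : []),
];
```

The same conditional-spread trick gates schema fields, which is exactly what the `fullInputSchema.omit({ cwd: true })` line above achieves for KAIROS.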
What This Architecture Teaches
Three things stood out to me after reading the full multi-agent system:
1. Filesystem beats message brokers for local agents. When all agents run on the same machine, JSON files on disk are simpler, more debuggable, and more reliable than any message queue. You can cat the mailbox. You can tail -f the team log. No infrastructure to maintain.
2. Model routing should be explicit, not automatic. The parent agent chooses Haiku, Sonnet, or Opus for each child. This is a deliberate cost-quality tradeoff made at spawn time, not a system-level optimization. The agent that understands the task picks the model for the task.
3. Feature flags are the real architecture. The 44 flags mean Claude Code is not one product. It is dozens of products sharing a codebase, each activated by a boolean. KAIROS-mode Claude Code is a fundamentally different system from default Claude Code — and the flag system lets Anthropic test both in production simultaneously.
The source was not supposed to be public. But now that it is, it is the most detailed reference for production multi-agent architecture I have read. Every decision is visible in the code.
Follow @klement_gunndu for more AI engineering breakdowns. We are building in public.
Top comments (2)
This is a fantastic deep dive. I have been using Claude Code daily to build an Electron app and I can confirm the sub-agent architecture shows up in practice in ways that are not obvious.
The biggest practical insight: the CLAUDE.md system is how you steer the entire multi-agent tree. When I set instructions in CLAUDE.md, every sub-agent spawned by AgentTool inherits that context. So if you write something like "always use parameterized SQL queries" in your project CLAUDE.md, even the Explore and Plan sub-agents respect it. The parent agent does not have to re-explain your conventions to each child.
One thing the source does not capture: the practical compute tradeoff of sub-agents. Each Haiku sub-agent is cheap individually, but when the orchestrator spawns 4-5 in parallel for research, the context window consumption is real. I run Claude Code on the $20/mo Pro plan and the difference between a session that uses sub-agents wisely versus one that spawns them for everything is night and day.
The boomerang pattern you describe (parent delegates, child returns summary) is also exactly how persistent memory works across sessions. The parent reads MEMORY.md, delegates research to sub-agents, then synthesizes their findings into the next action. It is agents all the way down.
Spot on about CLAUDE.md steering the whole agent tree — once you realize every sub-agent inherits those instructions, it completely changes how you structure your project rules.