~K¹yle Million

Posted on Apr 10

Claude Code Sub-Agents: The Parallel Execution Pattern That Scales

#claudecode #devtools #aiagents #productivity

If you've used Claude Code for more than a week, you've hit the ceiling: the task is too big, the context is too cluttered, and sequential execution means every step waits for the last one.

The solution isn't a bigger model. It's sub-agents.

Claude Code has a native Agent tool that spawns isolated sub-agents — each with their own context window, their own tool access, their own execution scope. Used correctly, this changes what's buildable. Used incorrectly, it just creates more things to debug.

This is the pattern that works.

What Sub-Agents Actually Are

When Claude Code calls the Agent tool, it launches a second Claude instance as a subprocess. That instance:

Gets its own context window — it doesn't inherit the parent's conversation history
Has its own tool permissions — you specify what it can and can't call
Runs synchronously or in background — you control whether the parent waits
Returns a single result — a text response that the parent can use

The parent-child relationship is one-directional. The sub-agent can't directly modify the parent's state. Everything goes through the returned response.

This is the key architectural insight: sub-agents aren't extensions of the parent. They're contractors you hire to produce a deliverable.

The Four Patterns

1. Parallel Research

The most common use case: you need information from multiple places simultaneously.

Parent: "I need to understand the auth system, the billing flow, and the API rate limits"

Sub-agent A → read all auth files → return summary
Sub-agent B → read billing module → return summary  
Sub-agent C → grep for rate limit logic → return summary

Parent receives all three summaries → synthesizes answer

Without sub-agents: three sequential read passes, each burning parent context with raw file content.

With sub-agents: the parent sends out three specialists, receives back three clean summaries, synthesizes. The raw file content never enters the parent context.

Context protection is the actual benefit. Not just parallelism.

When you read 20 files directly in the parent, all that file content accumulates. Context compaction kicks in. Earlier context gets summarized or dropped. By the time you're reasoning about the problem, the details that mattered most might not be available.

Sub-agents read the raw content in their isolated context, summarize it, and hand back only what the parent needs. The parent stays clean.

2. Worktree Isolation

For code changes that might conflict — or that you want to validate independently — the worktree isolation parameter creates a separate git worktree:

Agent(
    description="Implement auth refactor",
    isolation="worktree",  
    prompt="Refactor the auth middleware..."
)

The sub-agent works on a clean copy of the repo. If it succeeds, you get back the branch name. If it fails, the worktree is cleaned up automatically.

This is how you run experimental changes without polluting your working directory.

Two sub-agents can work on different features simultaneously, on separate branches, without stepping on each other. The parent waits for both results, then decides what to merge.

3. Independent Validation

One of the most underused patterns: after a build agent produces a result, a separate validation agent verifies it.

Build agent → implements feature → returns diff

Validation agent (no access to build agent's context):
- reads the original requirement
- reads the produced code
- verifies correctness independently
- returns: PASS with notes / FAIL with specific issues

The validation agent hasn't seen the build agent's reasoning. It can't rationalize bad output. It either sees correct code or it doesn't.

This is a second opinion architecture. The absence of shared context is the point.

4. Specialized Tool Subsets

Sub-agents can be restricted to specific tools:

Agent(
    description="Database schema analysis",
    prompt="Analyze the schema and report normalization issues",
    # agent only gets Read — no Bash, no Edit
)

An agent that can only read can't accidentally modify something. An agent that can only write outputs to a specific directory can't touch production files.

Least-privilege execution for every sub-task. You don't give a read-only audit agent shell access.

What Goes Wrong

Briefing too loosely

Sub-agents don't inherit parent context. If the parent knows something critical — a constraint, a prior decision, the reason something was built a certain way — it must include that in the prompt.

A sub-agent briefed with "fix the auth bug" will make different decisions than one briefed with "fix the auth bug — we cannot change the JWT payload structure because it's consumed by three downstream services we don't control."

The quality of the brief determines the quality of the result. Brief sub-agents like you're handing work to a colleague who just walked into the room and knows nothing about the conversation so far.

Delegating synthesis

The sub-agent produces a deliverable. The parent synthesizes.

Don't say "based on your findings, make the decision" in the sub-agent prompt. That pushes judgment to an agent that has incomplete context. The parent has the full picture. Synthesis stays with the parent.

Sub-agents: gather, build, verify specific things.

Parent: decide, coordinate, synthesize.

Spawning too many

Each sub-agent invocation has startup overhead. Spawning 10 sub-agents to each read one file is worse than reading those files directly. The pattern earns its overhead when:

The work is genuinely parallelizable
Each sub-agent has substantial isolated work to do
Context protection from accumulated results is worth it

For quick lookups: use the tool directly. For substantial independent tasks: spawn the specialist.

Not monitoring background agents

Background sub-agents (run_in_background: true) continue while the parent does other work. You'll be notified when they complete. The failure mode: assuming they succeeded and moving on before checking the result.

Background makes sense when you have genuinely independent parallel work. If the next step depends on the result, run it synchronous.

A Real Architecture: The Inbox Processor

Here's the pattern running in production:

Inbox check fires (cron, every 10 minutes)
↓
Parent: scans coordination directories
↓
Task file found → spawn Task Processor sub-agent
  - Brief: full task context + all relevant constraints
  - Tools: Bash, Read, Write
  - Output: result file path
↓
Parent receives result path
Parent writes to outputs/, moves processed file, updates SHARED_MIND
↓
If independent secondary tasks: spawn background sub-agents for each
↓
Parent logs completion, exits

The parent stays orchestration-only. It routes, tracks, finalizes. Sub-agents do the domain work.

This structure survives context accumulation across many tasks in a session. Each task's execution lives in a sub-agent context. The parent's context stays bounded.

The Context Budget Frame

Think of your parent context as a budget. Every tool call result you read, every file you analyze, every intermediate state — it all spends that budget.

Sub-agents let you offload expensive work to separate budgets that don't affect the parent's available context. The sub-agent spends its budget doing detailed work. The parent receives a clean summary.

For long sessions — or for autonomous headless runs that process many tasks — this is the difference between a session that degrades over time and one that stays sharp.

The architecture question is always: what needs to be in the parent context to synthesize the final result, and what can live in a sub-agent context and come back as a summary?

Everything that can be a summary should be a summary.

Getting Started

If you haven't used the Agent tool in Claude Code yet, the entry point is simple:

Identify a task you'd normally do with multiple sequential read passes
Break it into parallel research questions
Brief each as a separate sub-agent, receive their summaries
Synthesize from summaries in the parent

That single pattern — parallel research with context protection — is where most of the value lives for most use cases.

The worktree isolation and validation patterns are for when you're building multi-step autonomous pipelines where correctness matters more than speed.

Start with parallel research. Add isolation when you need it.

The full architecture skills for multi-agent Claude Code systems — including coordination patterns, lock file protocols, and the inbox-processor pattern above — are packaged at shopclawmart.com/@thebrierfox.

W. Kyle Million (~K¹) / IntuiTek¹

Built by Aegis, IntuiTek¹

DEV Community