After I published the Tool Use guide, a comment came in fairly quickly: "I get single agents now, but how do I run a code reviewer, security scanner, and doc writer at the same time?" I was actually mid-experiment at that point myself.
Installing claude-agent-sdk 0.2.82 directly, I found the answer. One AgentDefinition dataclass and the ClaudeAgentOptions.agents dict is all it takes. I created the objects and explored the type structure hands-on. No API key meant I couldn't run actual queries, but the code structure and type system were fully testable.
This post is that exploration.
Where Single Agents Hit a Wall
The Tool Use loop is powerful. But there are three situations where it shows limits.
Context contamination. When a single agent handles code quality, security vulnerabilities, and test coverage in one PR review, the context window fills with intermediate results from all three tasks mixed together. The agent sees its earlier reasoning while forming later judgments — the fact that it spotted a code smell early can subtly shade the security analysis.
No parallelism. Code review takes 30 seconds, security scan 20 seconds, doc generation 25 seconds. Single agent: 75 seconds. Three concurrent agents: 30 seconds. There's no reason to run independent tasks serially.
Role bleed. An agent that "thinks like a reviewer then thinks like a security expert" does both jobs worse than dedicated specialists. This is true for human teams too.
The subagent pattern solves these three problems structurally.
Installing claude-agent-sdk 0.2.82 — SDK Structure I Verified Directly
pip install claude-agent-sdk
Version confirmed after install:
Successfully installed claude-agent-sdk-0.2.82
Running dir(claude_agent_sdk) in a temporary sandbox, the subagent-relevant classes that stood out:
import claude_agent_sdk as sdk
sdk.AgentDefinition # Subagent configuration dataclass
sdk.ClaudeAgentOptions # Full options including agents dict
sdk.TaskBudget # Token budget control
sdk.SubagentStartHookInput # Hook for subagent start events
sdk.SubagentStopHookInput # Hook for subagent stop events
sdk.list_subagents # List subagents in a session
sdk.get_subagent_messages # Retrieve a specific subagent's messages
I read the AgentDefinition source directly with inspect.getsource(). This is the actual dataclass in 0.2.82:
@dataclass
class AgentDefinition:
description: str # How the orchestrator identifies this agent
prompt: str # Subagent system prompt
tools: list[str] | None = None
disallowedTools: list[str] | None = None
model: str | None = None # "sonnet", "opus", "haiku", "inherit", or full model ID
skills: list[str] | None = None
memory: Literal["user", "project", "local"] | None = None
mcpServers: list[str | dict[str, Any]] | None = None
initialPrompt: str | None = None
maxTurns: int | None = None # Max loop count for this subagent
background: bool | None = None
effort: EffortLevel | int | None = None
permissionMode: PermissionMode | None = None
One thing I noticed in the tools field comment: "Deprecated: passing 'Skill' here is deprecated; use skills instead." I hadn't seen that in the documentation. The separate skills field is the right place now.
Defining Subagents with AgentDefinition — A PR Review Pipeline
Here's the actual code. A PR auto-review pipeline needs three roles:
import asyncio
import claude_agent_sdk as sdk
# Define each role as a specialized subagent
code_reviewer = sdk.AgentDefinition(
description="Python code quality and design review specialist",
prompt=(
"You're a Python senior engineer with 10 years of experience. "
"Review code quality, readability, and design patterns. "
"Provide concrete improvement suggestions in markdown format."
),
tools=["Read", "Grep"],
model="sonnet",
maxTurns=8,
)
security_scanner = sdk.AgentDefinition(
description="Security vulnerability scanner — injection risks, exposed secrets, unsafe operations",
prompt=(
"You're a security engineer. "
"Find SQL injection risks, hardcoded secrets, unsafe eval/exec, "
"and permission issues. Report each with severity level."
),
tools=["Read", "Grep", "Bash"],
model="sonnet",
maxTurns=6,
)
doc_writer = sdk.AgentDefinition(
description="Docstring and README writer — reads code and generates clear documentation",
prompt=(
"You're a technical writer. "
"Write Google Style docstrings for functions and classes, "
"and create usage examples for the README."
),
tools=["Read", "Write", "Edit"],
model="haiku", # haiku is sufficient for docs and cheaper
maxTurns=5,
)
# Orchestrator options
opts = sdk.ClaudeAgentOptions(
system_prompt=(
"You're a PR review orchestrator. "
"Call code-reviewer, security-scanner, and doc-writer in parallel "
"and compile a comprehensive review report from all results."
),
allowed_tools=["Agent", "Read"], # Agent tool is how subagents are invoked
agents={
"code-reviewer": code_reviewer,
"security-scanner": security_scanner,
"doc-writer": doc_writer,
},
permission_mode="bypassPermissions",
)
The dict keys in ClaudeAgentOptions.agents are the names the orchestrator uses when calling subagents. When the system prompt says "call code-reviewer," Claude invokes that agent via the Agent tool.
Parallel Execution Pattern — Running Three Agents Simultaneously
The most important line in the SDK documentation:
"Multiple subagents can run concurrently. When Claude identifies independent subtasks, it spawns multiple agents simultaneously using multiple Task tool calls in a single message."
When the orchestrator calls multiple Agent tools in a single message, they run in parallel. You don't write asyncio.gather() yourself. Tell the orchestrator to "call these agents in parallel" and the SDK handles it.
Actual query flow:
async def review_pr(pr_diff: str):
results = []
async for message in sdk.query(
prompt=(
f"Review this PR diff:\n\n{pr_diff}\n\n"
"Run code-reviewer, security-scanner, and doc-writer simultaneously "
"to analyze each domain in parallel, then compile a unified review report."
),
options=opts,
):
if isinstance(message, sdk.AssistantMessage):
for block in message.content:
if hasattr(block, "text"):
results.append(block.text)
elif isinstance(message, sdk.ResultMessage):
print(f"Total cost: ${message.total_cost_usd:.4f}")
print(f"Duration: {message.duration_ms}ms")
break
return "\n".join(results)
Each subagent's context window starts fresh. From the official docs:
"A subagent's context window starts fresh, and the only channel from parent to subagent is the Agent tool's prompt string."
The orchestrator only sees the final result, not the intermediate reasoning. That's what prevents context contamination.
Controlling Costs with TaskBudget
Running three subagents concurrently doesn't just triple costs — it can amplify them unpredictably. Each agent might make redundant tool calls trying to do thorough work.
TaskBudget is the API-level fix:
opts = sdk.ClaudeAgentOptions(
# ... same as above ...
task_budget=sdk.TaskBudget(total=50000), # 50K token budget
)
Actual class structure from inspect.getsource(sdk.TaskBudget):
class TaskBudget(TypedDict):
"""API-side task budget in tokens.
When set, the model is made aware of its remaining token budget so it can
pace tool use and wrap up before the limit. Sent as
output_config.task_budget with the task-budgets-2026-03-13 beta header.
"""
total: int
The task-budgets-2026-03-13 beta header is attached automatically. The agent becomes aware of its remaining budget and decides internally when to pace down and wrap up. Much cleaner than an external timeout that forces mid-task termination.
Combine with AgentDefinition.maxTurns for a two-tier safety net:
security_scanner = sdk.AgentDefinition(
# ...
maxTurns=6, # Subagent level: max 6 tool calls
)
opts = sdk.ClaudeAgentOptions(
# ...
task_budget=sdk.TaskBudget(total=100000), # Global level: 100K token ceiling
)
Subagent Hooks — Tracking Start and Stop Events
SubagentStartHookInput and SubagentStopHookInput let you detect exactly when each subagent starts and finishes:
import time
agent_timings: dict[str, float] = {}
def on_agent_start(hook_input: sdk.SubagentStartHookInput) -> None:
agent_timings[hook_input.agent_id] = time.time()
print(f"▶ {hook_input.agent_type} started (id: {hook_input.agent_id[:8]})")
def on_agent_stop(hook_input: sdk.SubagentStopHookInput) -> None:
start = agent_timings.get(hook_input.agent_id, time.time())
elapsed = time.time() - start
print(f"■ {hook_input.agent_type} done ({elapsed:.1f}s)")
# hook_input.agent_transcript_path has the full subagent conversation log
opts = sdk.ClaudeAgentOptions(
# ...
hooks={
"SubagentStart": [sdk.HookMatcher(hook_callback=on_agent_start)],
"SubagentStop": [sdk.HookMatcher(hook_callback=on_agent_stop)],
},
)
agent_transcript_path from SubagentStopHookInput is invaluable for production debugging. If a subagent produces unexpected output, that's where you look first.
Also worth knowing: multiple hook matchers on the same event run concurrently, not sequentially. The docs explicitly state this. Design each hook to be independent.
When to Use Subagents (and When Not To)
I want to be direct here: subagents aren't always the right choice.
Use them when:
- You have 3+ independent tasks, each taking 10+ seconds
- Different tasks need different tool access (security scanner doesn't need Write)
- You've verified that context contamination actually hurts result quality
They're overkill when:
- You have 2 tasks where task B depends on task A's output
- Total runtime would be under 5 seconds (spawn overhead exceeds benefit)
- It's a simple question-answer pattern
The same tradeoff comes up in the A2A + MCP hybrid architecture post: multi-agent structure adds complexity. More failure points, harder debugging, less predictable costs. Don't add subagents to a problem that a single agent can handle.
My personal threshold: "three or more independent tasks, each likely to consume 10K+ tokens with Opus." Below that, I stick with a single agent.
What I Couldn't Test
No API key meant I couldn't capture actual execution logs showing the three agents running in parallel. The object construction and type validation worked, but "what does the console output look like when three agents actually spawn concurrently" — I can't show you that from this run.
The fork_session function also caught my attention but didn't fit in this post. fork_session(session_id, up_to_message_id) lets you branch a session from a specific point. Useful when subagents want to try different strategies from the same base context without repeating earlier work.
Summary
The core of subagent orchestration in claude-agent-sdk 0.2.82:
-
AgentDefinition: Separate role, prompt, tools, and model per subagent -
ClaudeAgentOptions.agents: Register subagent names for the orchestrator to call -
Agenttool + parallel prompt instruction: Orchestrator spawns multiple subagents at once
Add TaskBudget and SubagentStartHookInput/SubagentStopHookInput for cost control and execution tracking.
Start with a single agent. Move to subagents when your task fits "independent, parallelizable, three or more."
References:
- Subagents in the SDK — Claude API official docs
- Building agents with the Claude Agent SDK — Anthropic engineering blog
-
claude-agent-sdk==0.2.82PyPI package — direct install and source inspection (2026-05-18)
Top comments (0)