What if instead of one AI agent working sequentially, you had five agents working in parallel — one building features, one writing tests, one generating docs, one researching solutions, and one deploying infrastructure? That is not hypothetical. Here is the architecture I use daily to run multi-agent parallel execution.
The Problem with Sequential AI
Most developers use AI assistants sequentially:
Task 1 (feature) -> Task 2 (tests) -> Task 3 (docs) -> Task 4 (deploy)
Total time: 4 hours
But most of these tasks are independent. Tests can be written from a spec while the feature is being built. Docs can be generated from the design doc. Deployment config can be prepared in parallel. The dependency graph looks like this:
+-- Agent 1: Build feature ------+
| |
Design Spec --------+-- Agent 2: Write tests --------+-- Integration
| |
+-- Agent 3: Generate docs ------+
|
+-- Agent 4: Research edge cases
|
+-- Agent 5: Prepare deployment
Total time: ~1 hour (longest single agent path)
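The arithmetic behind that estimate is just a critical-path calculation; the per-task durations below are illustrative, not measurements:

```python
# Hypothetical per-task durations in minutes. With no dependencies between
# tasks, parallel wall time is the longest single task, while sequential
# time is the sum of all of them.
durations = {"feature": 60, "tests": 45, "docs": 20,
             "research": 15, "deploy": 25}

sequential = sum(durations.values())  # 165 minutes
parallel = max(durations.values())    # 60 minutes (the longest path)
print(f"speedup: {sequential / parallel:.2f}x")
```

Once tasks depend on each other, the parallel figure becomes the longest chain through the dependency graph rather than the single longest task.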
Architecture Overview
The multi-agent system has four components:
- Orchestrator — Decomposes tasks and manages the agent lifecycle
- Agent Spawner — Creates isolated agent instances with scoped contexts
- Result Collector — Gathers outputs and resolves conflicts
- Integration Engine — Merges parallel work into a coherent result
from dataclasses import dataclass, field
from enum import Enum
import asyncio
import uuid
import time
class AgentRole(Enum):
BUILDER = "builder" # Writes implementation code
TESTER = "tester" # Writes test suites
DOCUMENTER = "documenter" # Generates documentation
RESEARCHER = "researcher" # Investigates solutions and edge cases
DEPLOYER = "deployer" # Prepares infrastructure and CI/CD
@dataclass
class AgentTask:
task_id: str
role: AgentRole
prompt: str
context: dict
dependencies: list[str] = field(default_factory=list)
timeout_seconds: int = 300
@dataclass
class AgentResult:
task_id: str
role: AgentRole
output: str
artifacts: list[dict] # files, configs, etc.
duration_seconds: float
success: bool
error: str | None = None
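One detail in `AgentTask` worth calling out: a mutable default like `dependencies` must use `field(default_factory=list)`, or every task would share one list. A standalone illustration:

```python
from dataclasses import dataclass, field

@dataclass
class Task:
    # default_factory gives each instance its own fresh list
    deps: list[str] = field(default_factory=list)

a, b = Task(), Task()
a.deps.append("builder")
print(b.deps)  # [] (instances do not share the list)
```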
The Orchestrator
The orchestrator takes a high-level goal and decomposes it into parallel agent tasks:
import anthropic
import json
class Orchestrator:
def __init__(self, api_key: str):
self.client = anthropic.Anthropic(api_key=api_key)
def decompose(self, goal: str, codebase_context: str) -> list[AgentTask]:
"""Break a goal into parallel agent tasks."""
response = self.client.messages.create(
model="claude-sonnet-4-20250514",
max_tokens=4096,
messages=[{
"role": "user",
"content": f"""You are a task decomposition engine. Given a development goal,
break it into parallel tasks for these agent roles:
- BUILDER: Implements the feature code
- TESTER: Writes comprehensive tests
- DOCUMENTER: Creates documentation
- RESEARCHER: Investigates edge cases and best practices
- DEPLOYER: Prepares deployment configuration
For each task, specify:
- role: The agent role
- prompt: Detailed instructions for the agent
- dependencies: List of task IDs this depends on (empty = can run immediately)
Goal: {goal}
Codebase context:
{codebase_context[:4000]}
Return a JSON array of task objects."""
}]
)
        raw = response.content[0].text.strip()
        if raw.startswith("`"):
            # The model may wrap the JSON in a markdown fence; strip it first
            raw = raw.split("\n", 1)[1].rsplit("`" * 3, 1)[0]
        tasks_data = json.loads(raw)
return [
AgentTask(
task_id=f"{t['role']}-{uuid.uuid4().hex[:8]}",
role=AgentRole(t["role"]),
prompt=t["prompt"],
context={"goal": goal, "codebase": codebase_context},
dependencies=t.get("dependencies", [])
)
for t in tasks_data
]
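For reference, `decompose` expects the model to return something shaped like the sample below. This JSON is hand-written for illustration, not real model output:

```python
import json
import uuid

# Hand-written stand-in for the model's JSON reply
reply = """[
  {"role": "builder", "prompt": "Implement the login endpoint", "dependencies": []},
  {"role": "tester", "prompt": "Write pytest tests for login", "dependencies": []}
]"""

tasks_data = json.loads(reply)
# Same ID scheme as decompose(): role plus a short random suffix
task_ids = [f"{t['role']}-{uuid.uuid4().hex[:8]}" for t in tasks_data]
print([tid.split("-")[0] for tid in task_ids])  # ['builder', 'tester']
```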
The Agent Spawner
Each agent runs in isolation with its own context window:
class AgentSpawner:
def __init__(self, api_key: str):
self.client = anthropic.Anthropic(api_key=api_key)
self.system_prompts = {
AgentRole.BUILDER: (
"You are a senior software engineer. "
"Write clean, production-ready code. "
"Include error handling and type hints. "
"Follow existing codebase patterns."
),
AgentRole.TESTER: (
"You are a QA engineer. "
"Write comprehensive tests covering happy path, edge cases, "
"and error conditions. Use pytest conventions. "
"Aim for >90% coverage of the feature."
),
AgentRole.DOCUMENTER: (
"You are a technical writer. "
"Generate clear API documentation, usage examples, "
"and architecture decision records. "
"Write for developers, not end users."
),
AgentRole.RESEARCHER: (
"You are a senior architect. "
"Research best practices, identify potential issues, "
"find edge cases, and recommend solutions. "
"Be thorough and cite specific concerns."
),
AgentRole.DEPLOYER: (
"You are a DevOps engineer. "
"Prepare deployment configs, CI/CD pipelines, "
"environment variables, and monitoring setup. "
"Prioritize reliability and rollback capability."
),
}
async def spawn(self, task: AgentTask) -> AgentResult:
"""Spawn a single agent and await its result."""
start = time.time()
try:
            # The synchronous client would serialize these calls; run each one
            # in a worker thread so asyncio.gather achieves real parallelism,
            # and enforce the task's timeout budget.
            response = await asyncio.wait_for(
                asyncio.to_thread(
                    self.client.messages.create,
                    model="claude-sonnet-4-20250514",
                    max_tokens=8192,
                    system=self.system_prompts[task.role],
                    messages=[{
                        "role": "user",
                        "content": (
                            f"{task.prompt}\n\nContext:\n"
                            f"{json.dumps(task.context, indent=2)[:6000]}"
                        ),
                    }],
                ),
                timeout=task.timeout_seconds,
            )
output = response.content[0].text
artifacts = self._extract_artifacts(output)
return AgentResult(
task_id=task.task_id,
role=task.role,
output=output,
artifacts=artifacts,
duration_seconds=time.time() - start,
success=True
)
except Exception as e:
return AgentResult(
task_id=task.task_id,
role=task.role,
output="",
artifacts=[],
duration_seconds=time.time() - start,
success=False,
error=str(e)
)
    def _extract_artifacts(self, output: str) -> list[dict]:
        """Extract code blocks and file references from agent output."""
        import re
        artifacts = []
        code_blocks = re.findall(
            r"```(\w+)?\n(.+?)```", output, re.DOTALL
        )
for lang, code in code_blocks:
artifacts.append({
"type": "code",
"language": lang or "text",
"content": code.strip()
})
return artifacts
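The extraction regex can be sanity-checked standalone. The sample reply below is made up, and the fence string is built indirectly so the snippet itself stays fence-safe:

```python
import re

FENCE = "`" * 3  # a literal triple backtick, written indirectly
sample = f"Here is the code:\n{FENCE}python\nprint('hi')\n{FENCE}\nDone."

# Same pattern as _extract_artifacts: optional language tag, lazy body
blocks = re.findall(rf"{FENCE}(\w+)?\n(.+?){FENCE}", sample, re.DOTALL)
artifacts = [
    {"type": "code", "language": lang or "text", "content": code.strip()}
    for lang, code in blocks
]
print(artifacts[0]["language"], artifacts[0]["content"])  # python print('hi')
```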
Parallel Execution Engine
The core of the system — running agents in parallel with dependency resolution:
class ParallelExecutor:
def __init__(self, spawner: AgentSpawner):
self.spawner = spawner
self.results: dict[str, AgentResult] = {}
async def execute(self, tasks: list[AgentTask]) -> list[AgentResult]:
"""Execute tasks in parallel, respecting dependencies."""
pending = {t.task_id: t for t in tasks}
completed: set[str] = set()
while pending:
ready = [
t for t in pending.values()
if all(dep in completed for dep in t.dependencies)
]
            if not ready:
                raise RuntimeError(
                    f"Circular or unresolvable dependency detected. "
                    f"Pending: {list(pending.keys())}, "
                    f"Completed: {completed}"
                )
print(f"Launching {len(ready)} agents in parallel: "
f"{[t.role.value for t in ready]}")
agent_coroutines = [
self.spawner.spawn(task) for task in ready
]
batch_results = await asyncio.gather(*agent_coroutines)
for result in batch_results:
self.results[result.task_id] = result
completed.add(result.task_id)
del pending[result.task_id]
status = "completed" if result.success else "FAILED"
print(
f" Agent {result.role.value} {status} "
f"in {result.duration_seconds:.1f}s"
)
return list(self.results.values())
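The wave-based scheduling loop can be exercised without any API calls by substituting stub coroutines; the task names below are hypothetical:

```python
import asyncio

async def run_waves(tasks: dict[str, list[str]]) -> list[set[str]]:
    """tasks maps task_id -> dependency ids; returns the waves executed."""
    pending = dict(tasks)
    completed: set[str] = set()
    waves = []
    while pending:
        ready = {t for t, deps in pending.items()
                 if all(d in completed for d in deps)}
        if not ready:
            raise RuntimeError("circular or unresolvable dependency")
        # Stand-in for spawning the ready agents concurrently
        await asyncio.gather(*(asyncio.sleep(0) for _ in ready))
        completed |= ready
        for t in ready:
            del pending[t]
        waves.append(ready)
    return waves

waves = asyncio.run(run_waves({
    "builder": [], "researcher": [],
    "tester": ["builder"], "deployer": ["builder", "tester"],
}))
# Three waves: {builder, researcher}, then {tester}, then {deployer}
print(waves)
```

Note that a task only becomes ready in the wave after its last dependency finishes, so a long dependency chain degrades back toward sequential execution.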
The Integration Engine
After all agents finish, we merge their outputs:
class IntegrationEngine:
def __init__(self, api_key: str):
self.client = anthropic.Anthropic(api_key=api_key)
def integrate(self, results: list[AgentResult]) -> dict:
"""Merge parallel agent outputs into a coherent deliverable."""
outputs_summary = "\n\n".join(
f"=== {r.role.value.upper()} AGENT ===\n{r.output[:3000]}"
for r in results if r.success
)
response = self.client.messages.create(
model="claude-sonnet-4-20250514",
max_tokens=8192,
messages=[{
"role": "user",
"content": f"""You are an integration engineer. Multiple agents worked
in parallel on the same feature. Merge their outputs into a single
coherent deliverable.
Resolve any conflicts by preferring:
1. The builder agent for implementation details
2. The tester agent for test coverage requirements
3. The researcher agent for architectural decisions
Agent outputs:
{outputs_summary}
Produce a final merged output with:
- Implementation code (merged and conflict-free)
- Test suite
- Documentation
- Deployment notes
- Any unresolved issues or manual steps needed"""
}]
)
return {
"merged_output": response.content[0].text,
"agent_results": [
{
"role": r.role.value,
"success": r.success,
"duration": r.duration_seconds,
"artifact_count": len(r.artifacts)
}
for r in results
]
}
Running the Full Pipeline
async def run_multi_agent(
goal: str,
codebase_context: str,
api_key: str
) -> dict:
"""Execute the full multi-agent pipeline."""
start = time.time()
# 1. Decompose
orchestrator = Orchestrator(api_key)
tasks = orchestrator.decompose(goal, codebase_context)
print(f"Decomposed into {len(tasks)} agent tasks")
# 2. Execute in parallel
spawner = AgentSpawner(api_key)
executor = ParallelExecutor(spawner)
results = await executor.execute(tasks)
# 3. Integrate
engine = IntegrationEngine(api_key)
final = engine.integrate(results)
total_time = time.time() - start
sequential_time = sum(r.duration_seconds for r in results)
print(f"\nCompleted in {total_time:.1f}s "
f"(sequential would have been {sequential_time:.1f}s)")
print(f"Speedup: {sequential_time / total_time:.1f}x")
return final
# Usage
if __name__ == "__main__":
result = asyncio.run(run_multi_agent(
goal="Add user authentication with JWT tokens, "
"including login, register, and token refresh endpoints",
codebase_context="FastAPI app, SQLAlchemy ORM, PostgreSQL...",
api_key="your-api-key"
))
Real Performance Numbers
Here is what parallel execution looks like on a real feature (adding a notification system to a SaaS app):
Decomposed into 5 agent tasks
Launching 5 agents in parallel: [builder, tester, documenter, researcher, deployer]
Agent researcher completed in 18.3s
Agent documenter completed in 22.1s
Agent deployer completed in 24.7s
Agent tester completed in 31.2s
Agent builder completed in 38.5s
Completed in 41.2s (sequential would have been 134.8s)
Speedup: 3.3x
On a typical feature I see a 3-5x speedup depending on task complexity. The researcher and documenter finish fastest because their outputs are short prose; the builder takes longest because it generates the most code.
When NOT to Parallelize
Parallel agents are not always the right approach:
- Tightly coupled changes: If the tests fundamentally depend on implementation details, run builder first, then tester
- Small tasks: Overhead of decomposition and integration is not worth it for simple changes
- Exploratory work: When you do not know the approach yet, sequential reasoning is better
- Refactoring: One agent needs to see the full picture
The rule of thumb: if you can describe the deliverable for each agent independently, parallelize.
Cost Considerations
Running 5 agents in parallel means roughly 5x the API calls, plus the orchestrator's decomposition call and the integration pass. For the example above:
- Sequential (1 long conversation): ~$0.15
- Parallel (5 agents + integration): ~$0.45
The 3x cost premium is almost always worth the 3-5x time savings. Your time is more expensive than API calls.
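A back-of-the-envelope way to estimate the parallel run's cost; the per-token prices and token counts below are illustrative assumptions, not quoted rates:

```python
# Hypothetical per-token prices (dollars per token, assumed for illustration)
INPUT_PRICE = 3.00 / 1_000_000
OUTPUT_PRICE = 15.00 / 1_000_000

def call_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost of one API call at the assumed rates."""
    return input_tokens * INPUT_PRICE + output_tokens * OUTPUT_PRICE

# 5 agents plus one integration call, each ~8k tokens in / ~3k out (assumed)
parallel_cost = sum(call_cost(8_000, 3_000) for _ in range(6))
print(f"~${parallel_cost:.2f} for the parallel run")
```

Plugging in your actual token counts and current pricing is the only way to get a real number; the point is that cost scales with the number of agents while wall time does not.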
Extending the Pattern
Once you have the parallel execution framework, you can extend it to:
- Code review agents: 3 reviewers check different aspects (security, performance, style) simultaneously
- Migration agents: Multiple agents migrate different modules in parallel
- Incident response: Diagnostic agent, fix agent, and communication agent work simultaneously
The complete multi-agent framework — including the orchestrator, spawner, conflict resolution engine, and 12 pre-built agent role templates — is available in my Claude Code Mastery Guide.
Have you experimented with multi-agent architectures? What patterns worked for you? Drop your approach in the comments — I am especially interested in how others handle conflict resolution between agents.