WEDGE Method Dev

Building Multi-Agent AI Systems: Running 5 Parallel Agents for 10x Productivity

What if instead of one AI agent working sequentially, you had five agents working in parallel — one building features, one writing tests, one generating docs, one researching solutions, and one deploying infrastructure? That is not hypothetical. Here is the architecture I use daily to run multi-agent parallel execution.

The Problem with Sequential AI

Most developers use AI assistants sequentially:

Task 1 (feature) -> Task 2 (tests) -> Task 3 (docs) -> Task 4 (deploy)
Total time: 4 hours

But most of these tasks are independent. Tests can be written from a spec while the feature is being built. Docs can be generated from the design doc. Deployment config can be prepared in parallel. The dependency graph looks like this:

                    +-- Agent 1: Build feature ------+
                    |                                 |
Design Spec --------+-- Agent 2: Write tests --------+-- Integration
                    |                                 |
                    +-- Agent 3: Generate docs ------+
                    |
                    +-- Agent 4: Research edge cases
                    |
                    +-- Agent 5: Prepare deployment

Total time: ~1 hour (longest single agent path)

Architecture Overview

The multi-agent system has four components:

  1. Orchestrator — Decomposes tasks and manages the agent lifecycle
  2. Agent Spawner — Creates isolated agent instances with scoped contexts
  3. Result Collector — Gathers outputs and resolves conflicts
  4. Integration Engine — Merges parallel work into a coherent result
The shared data model:

from dataclasses import dataclass, field
from enum import Enum
import asyncio
import time
import uuid

class AgentRole(Enum):
    BUILDER = "builder"       # Writes implementation code
    TESTER = "tester"         # Writes test suites
    DOCUMENTER = "documenter" # Generates documentation
    RESEARCHER = "researcher" # Investigates solutions and edge cases
    DEPLOYER = "deployer"     # Prepares infrastructure and CI/CD

@dataclass
class AgentTask:
    task_id: str
    role: AgentRole
    prompt: str
    context: dict
    dependencies: list[str] = field(default_factory=list)
    timeout_seconds: int = 300

@dataclass
class AgentResult:
    task_id: str
    role: AgentRole
    output: str
    artifacts: list[dict]  # files, configs, etc.
    duration_seconds: float
    success: bool
    error: str | None = None

The Orchestrator

The orchestrator takes a high-level goal and decomposes it into parallel agent tasks:

import anthropic
import json

class Orchestrator:
    def __init__(self, api_key: str):
        self.client = anthropic.Anthropic(api_key=api_key)

    def decompose(self, goal: str, codebase_context: str) -> list[AgentTask]:
        """Break a goal into parallel agent tasks."""
        response = self.client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=4096,
            messages=[{
                "role": "user",
                "content": f"""You are a task decomposition engine. Given a development goal,
break it into parallel tasks for these agent roles:
- BUILDER: Implements the feature code
- TESTER: Writes comprehensive tests
- DOCUMENTER: Creates documentation
- RESEARCHER: Investigates edge cases and best practices
- DEPLOYER: Prepares deployment configuration

For each task, specify:
- role: The agent role
- prompt: Detailed instructions for the agent
- dependencies: List of task IDs this depends on (empty = can run immediately)

Goal: {goal}

Codebase context:
{codebase_context[:4000]}

Return a JSON array of task objects."""
            }]
        )

        tasks_data = json.loads(response.content[0].text)
        return [
            AgentTask(
                task_id=f"{t['role']}-{uuid.uuid4().hex[:8]}",
                role=AgentRole(t["role"]),
                prompt=t["prompt"],
                context={"goal": goal, "codebase": codebase_context},
                dependencies=t.get("dependencies", [])
            )
            for t in tasks_data
        ]
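One defensive tweak worth making: models sometimes wrap the JSON array in a markdown fence, which would make the json.loads call above throw. Here is a small parser (a hypothetical helper, not part of the class above) that tolerates both shapes; the fence string is built with "`" * 3 to keep the snippet readable:

```python
import json
import re

FENCE = "`" * 3  # three backticks

def parse_task_json(raw: str) -> list[dict]:
    """Parse a JSON array from model output, tolerating a markdown fence."""
    match = re.search(FENCE + r"(?:json)?\s*(\[.*\])\s*" + FENCE, raw, re.DOTALL)
    payload = match.group(1) if match else raw
    return json.loads(payload)

# Both fenced and bare outputs parse to the same task list:
fenced = FENCE + 'json\n[{"role": "builder", "prompt": "Implement login"}]\n' + FENCE
bare = '[{"role": "builder", "prompt": "Implement login"}]'
assert parse_task_json(fenced) == parse_task_json(bare)
```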

The Agent Spawner

Each agent runs in isolation with its own context window:

class AgentSpawner:
    def __init__(self, api_key: str):
        self.client = anthropic.Anthropic(api_key=api_key)
        self.system_prompts = {
            AgentRole.BUILDER: (
                "You are a senior software engineer. "
                "Write clean, production-ready code. "
                "Include error handling and type hints. "
                "Follow existing codebase patterns."
            ),
            AgentRole.TESTER: (
                "You are a QA engineer. "
                "Write comprehensive tests covering happy path, edge cases, "
                "and error conditions. Use pytest conventions. "
                "Aim for >90% coverage of the feature."
            ),
            AgentRole.DOCUMENTER: (
                "You are a technical writer. "
                "Generate clear API documentation, usage examples, "
                "and architecture decision records. "
                "Write for developers, not end users."
            ),
            AgentRole.RESEARCHER: (
                "You are a senior architect. "
                "Research best practices, identify potential issues, "
                "find edge cases, and recommend solutions. "
                "Be thorough and cite specific concerns."
            ),
            AgentRole.DEPLOYER: (
                "You are a DevOps engineer. "
                "Prepare deployment configs, CI/CD pipelines, "
                "environment variables, and monitoring setup. "
                "Prioritize reliability and rollback capability."
            ),
        }

    async def spawn(self, task: AgentTask) -> AgentResult:
        """Spawn a single agent and await its result."""
        start = time.time()
        try:
            # The Anthropic client call is synchronous; run it in a worker
            # thread so asyncio.gather actually executes agents in parallel,
            # and enforce the task's timeout while we are at it.
            response = await asyncio.wait_for(
                asyncio.to_thread(
                    self.client.messages.create,
                    model="claude-sonnet-4-20250514",
                    max_tokens=8192,
                    system=self.system_prompts[task.role],
                    messages=[{
                        "role": "user",
                        "content": (
                            f"{task.prompt}\n\nContext:\n"
                            f"{json.dumps(task.context, indent=2)[:6000]}"
                        )
                    }],
                ),
                timeout=task.timeout_seconds,
            )

            output = response.content[0].text
            artifacts = self._extract_artifacts(output)

            return AgentResult(
                task_id=task.task_id,
                role=task.role,
                output=output,
                artifacts=artifacts,
                duration_seconds=time.time() - start,
                success=True
            )
        except Exception as e:
            return AgentResult(
                task_id=task.task_id,
                role=task.role,
                output="",
                artifacts=[],
                duration_seconds=time.time() - start,
                success=False,
                error=str(e)
            )

    def _extract_artifacts(self, output: str) -> list[dict]:
        """Extract code blocks and file references from agent output."""
        import re
        artifacts = []
        # Match fenced code blocks: an optional language tag, then the body.
        code_blocks = re.findall(
            r"```(\w+)?\n(.+?)```", output, re.DOTALL
        )
        for lang, code in code_blocks:
            artifacts.append({
                "type": "code",
                "language": lang or "text",
                "content": code.strip()
            })
        return artifacts
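As a quick sanity check, the fence-matching pattern behaves like this on a sample agent reply (the sample string is illustrative; backticks are built with "`" * 3 to avoid nesting problems):

```python
import re

FENCE = "`" * 3  # three backticks

sample = (
    "Here is the endpoint implementation:\n"
    + FENCE + "python\n"
    + "def login():\n    return {'token': 'jwt'}\n"
    + FENCE + "\nLet me know if you need tests."
)

# Same pattern the spawner uses to pull artifacts out of agent output.
blocks = re.findall(FENCE + r"(\w+)?\n(.+?)" + FENCE, sample, re.DOTALL)
lang, code = blocks[0]
assert lang == "python"
assert "def login" in code
```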

Parallel Execution Engine

The core of the system — running agents in parallel with dependency resolution:

class ParallelExecutor:
    def __init__(self, spawner: AgentSpawner):
        self.spawner = spawner
        self.results: dict[str, AgentResult] = {}

    async def execute(self, tasks: list[AgentTask]) -> list[AgentResult]:
        """Execute tasks in parallel, respecting dependencies."""
        pending = {t.task_id: t for t in tasks}
        completed: set[str] = set()

        while pending:
            ready = [
                t for t in pending.values()
                if all(dep in completed for dep in t.dependencies)
            ]

            if not ready:
                raise RuntimeError(
                    f"Circular dependency detected. "
                    f"Pending: {list(pending.keys())}, "
                    f"Completed: {completed}"
                )

            print(f"Launching {len(ready)} agents in parallel: "
                  f"{[t.role.value for t in ready]}")

            agent_coroutines = [
                self.spawner.spawn(task) for task in ready
            ]
            batch_results = await asyncio.gather(*agent_coroutines)

            for result in batch_results:
                self.results[result.task_id] = result
                completed.add(result.task_id)
                del pending[result.task_id]

                status = "completed" if result.success else "FAILED"
                print(
                    f"  Agent {result.role.value} {status} "
                    f"in {result.duration_seconds:.1f}s"
                )

        return list(self.results.values())
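The ready-set loop is worth understanding in isolation. This standalone sketch (plain dicts mapping task ID to dependency list, instead of full AgentTask objects) shows how tasks resolve in dependency waves:

```python
def schedule_waves(tasks: dict[str, list[str]]) -> list[list[str]]:
    """Group task IDs into waves: every task in a wave has all of its
    dependencies satisfied by earlier waves. Mirrors the ready-set
    loop in ParallelExecutor."""
    pending = dict(tasks)
    completed: set[str] = set()
    waves = []
    while pending:
        ready = [tid for tid, deps in pending.items()
                 if all(d in completed for d in deps)]
        if not ready:
            raise RuntimeError(f"Circular dependency among {list(pending)}")
        waves.append(sorted(ready))
        for tid in ready:
            completed.add(tid)
            del pending[tid]
    return waves

# Builder, researcher, deployer can start immediately; tester waits on
# builder; documenter waits on both builder and tester.
waves = schedule_waves({
    "builder": [], "researcher": [], "deployer": [],
    "tester": ["builder"], "documenter": ["builder", "tester"],
})
# → [["builder", "deployer", "researcher"], ["tester"], ["documenter"]]
```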

The Integration Engine

After all agents finish, we merge their outputs:

class IntegrationEngine:
    def __init__(self, api_key: str):
        self.client = anthropic.Anthropic(api_key=api_key)

    def integrate(self, results: list[AgentResult]) -> dict:
        """Merge parallel agent outputs into a coherent deliverable."""
        outputs_summary = "\n\n".join(
            f"=== {r.role.value.upper()} AGENT ===\n{r.output[:3000]}"
            for r in results if r.success
        )

        response = self.client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=8192,
            messages=[{
                "role": "user",
                "content": f"""You are an integration engineer. Multiple agents worked
in parallel on the same feature. Merge their outputs into a single
coherent deliverable.

Resolve any conflicts by preferring:
1. The builder agent for implementation details
2. The tester agent for test coverage requirements
3. The researcher agent for architectural decisions

Agent outputs:
{outputs_summary}

Produce a final merged output with:
- Implementation code (merged and conflict-free)
- Test suite
- Documentation
- Deployment notes
- Any unresolved issues or manual steps needed"""
            }]
        )

        return {
            "merged_output": response.content[0].text,
            "agent_results": [
                {
                    "role": r.role.value,
                    "success": r.success,
                    "duration": r.duration_seconds,
                    "artifact_count": len(r.artifacts)
                }
                for r in results
            ]
        }
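The same priority order can also be enforced mechanically before the merge prompt ever runs, whenever two agents emit an artifact targeting the same file. A minimal sketch (the role and path keys here are assumptions layered on top of the artifact dicts above):

```python
# Priority mirrors the integration prompt: builder > tester > researcher.
ROLE_PRIORITY = ["builder", "tester", "researcher", "documenter", "deployer"]

def resolve_conflict(candidates: list[dict]) -> dict:
    """Pick one artifact among several targeting the same file,
    preferring the highest-priority role."""
    return min(candidates, key=lambda a: ROLE_PRIORITY.index(a["role"]))

winner = resolve_conflict([
    {"role": "tester", "path": "auth.py", "content": "# test fixture stub"},
    {"role": "builder", "path": "auth.py", "content": "# real implementation"},
])
assert winner["role"] == "builder"
```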

Running the Full Pipeline

async def run_multi_agent(
    goal: str,
    codebase_context: str,
    api_key: str
) -> dict:
    """Execute the full multi-agent pipeline."""
    start = time.time()

    # 1. Decompose
    orchestrator = Orchestrator(api_key)
    tasks = orchestrator.decompose(goal, codebase_context)
    print(f"Decomposed into {len(tasks)} agent tasks")

    # 2. Execute in parallel
    spawner = AgentSpawner(api_key)
    executor = ParallelExecutor(spawner)
    results = await executor.execute(tasks)

    # 3. Integrate
    engine = IntegrationEngine(api_key)
    final = engine.integrate(results)

    total_time = time.time() - start
    sequential_time = sum(r.duration_seconds for r in results)

    print(f"\nCompleted in {total_time:.1f}s "
          f"(sequential would have been {sequential_time:.1f}s)")
    print(f"Speedup: {sequential_time / total_time:.1f}x")

    return final

# Usage
if __name__ == "__main__":
    result = asyncio.run(run_multi_agent(
        goal="Add user authentication with JWT tokens, "
             "including login, register, and token refresh endpoints",
        codebase_context="FastAPI app, SQLAlchemy ORM, PostgreSQL...",
        api_key="your-api-key"
    ))

Real Performance Numbers

Here is what parallel execution looks like on a real feature (adding a notification system to a SaaS app):

Decomposed into 5 agent tasks
Launching 5 agents in parallel: [builder, tester, documenter, researcher, deployer]
  Agent researcher completed in 18.3s
  Agent documenter completed in 22.1s
  Agent deployer completed in 24.7s
  Agent tester completed in 31.2s
  Agent builder completed in 38.5s

Completed in 41.2s (sequential would have been 134.8s)
Speedup: 3.3x

On a typical feature, I see a 3-5x speedup, depending on how the work decomposes. The researcher and documenter finish fastest because their outputs are mostly short prose; the builder takes longest because it generates the most code.
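Those numbers follow directly from the run log: sequential time is the sum of the agent durations, while wall time is the slowest agent plus orchestration overhead:

```python
durations = [18.3, 22.1, 24.7, 31.2, 38.5]  # per-agent seconds from the log
sequential = sum(durations)                  # 134.8s if run one after another
wall = 41.2                                  # measured parallel wall time
overhead = wall - max(durations)             # ~2.7s of orchestration cost
print(f"speedup: {sequential / wall:.1f}x")  # prints "speedup: 3.3x"
```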

When NOT to Parallelize

Parallel agents are not always the right approach:

  • Tightly coupled changes: If the tests fundamentally depend on implementation details, run builder first, then tester
  • Small tasks: Overhead of decomposition and integration is not worth it for simple changes
  • Exploratory work: When you do not know the approach yet, sequential reasoning is better
  • Refactoring: One agent needs to see the full picture

The rule of thumb: if you can describe the deliverable for each agent independently, parallelize.

Cost Considerations

Running 5 agents in parallel means 5x the API calls. For the example above:

  • Sequential (1 long conversation): ~$0.15
  • Parallel (5 agents + integration): ~$0.45

The 3x cost premium is almost always worth the 3-5x time savings. Your time is more expensive than API calls.
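Put differently, the break-even point is low. Using the rough figures above (the dollar amounts are ballpark estimates, not published pricing):

```python
sequential_cost = 0.15            # one long conversation (estimate above)
parallel_cost = 0.45              # 5 agents + integration pass (estimate)
extra_dollars = parallel_cost - sequential_cost   # ~$0.30 premium
seconds_saved = 134.8 - 41.2                      # from the measured run
# Break-even hourly rate: extra spend divided by hours saved.
hourly_breakeven = extra_dollars / (seconds_saved / 3600)
print(f"parallel pays off above ${hourly_breakeven:.0f}/hour")
```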

Extending the Pattern

Once you have the parallel execution framework, you can extend it to:

  • Code review agents: 3 reviewers check different aspects (security, performance, style) simultaneously
  • Migration agents: Multiple agents migrate different modules in parallel
  • Incident response: Diagnostic agent, fix agent, and communication agent work simultaneously

The complete multi-agent framework — including the orchestrator, spawner, conflict resolution engine, and 12 pre-built agent role templates — is available in my Claude Code Mastery Guide.


Have you experimented with multi-agent architectures? What patterns worked for you? Drop your approach in the comments — I am especially interested in how others handle conflict resolution between agents.
