One agent is powerful but limited.
Ask it to research a topic, write an article, review that article, check the code examples, and format everything for publishing, and it has to do everything sequentially. When it makes a mistake in step 2, it might not catch it until the final step. It has one perspective. One "voice." One set of strengths and weaknesses.
Now imagine three specialized agents working on the same task. A research agent that searches exhaustively and compiles sources. A writing agent that takes those sources and drafts the article with a clear structure. A review agent that reads the draft critically and flags errors, gaps, and unsupported claims. Each one knows its job deeply. They check each other's work. They have different system prompts that give them different strengths.
This is how complex knowledge work actually gets done. Not one person doing everything. A team of specialists coordinated toward a shared goal.
Multi-agent systems bring this pattern to AI.
The Core Patterns
import os
import json
import time
from typing import List, Dict, Callable, Optional, Any
from dataclasses import dataclass, field
from enum import Enum

import anthropic

# Shared client; reads the API key from the environment.
client = anthropic.Anthropic(api_key=os.environ.get("ANTHROPIC_API_KEY"))


class Pattern(Enum):
    ORCHESTRATOR_WORKER = "orchestrator_worker"
    SEQUENTIAL_PIPELINE = "sequential_pipeline"
    PARALLEL_EXECUTION = "parallel_execution"
    DEBATE = "debate"
    CRITIC_REVIEW = "critic_review"


print("Multi-Agent Patterns:")
print()
patterns = {
    "Orchestrator-Worker": {
        "description": "One LLM breaks down tasks, delegates to specialized workers, aggregates results",
        "best_for": "Complex tasks that can be decomposed into subtasks",
        "example": "Research assistant: orchestrator delegates to researcher, writer, editor"
    },
    "Sequential Pipeline": {
        "description": "Output of one agent becomes input to the next in a fixed chain",
        "best_for": "Multi-stage transformation: draft → edit → format → publish",
        "example": "Content pipeline: researcher → writer → fact-checker → publisher"
    },
    "Parallel Execution": {
        "description": "Multiple agents work simultaneously on independent subtasks",
        "best_for": "Tasks with independent components that can run concurrently",
        "example": "Market research: agent A covers Asia, agent B covers Europe simultaneously"
    },
    "Debate/Adversarial": {
        "description": "Two agents argue opposing positions, a judge evaluates and decides",
        "best_for": "Decision-making, fact-checking, reducing overconfidence",
        "example": "Agent A argues for approach X, Agent B argues against, judge decides"
    },
    "Critic-Review": {
        "description": "Creator agent produces output, critic agent evaluates and gives feedback",
        "best_for": "Quality assurance, catching blind spots, improving output quality",
        "example": "Writer produces article, critic identifies weaknesses, writer revises"
    },
}
for name, info in patterns.items():
    print(f" {name}:")
    print(f" {info['description']}")
    print(f" Best for: {info['best_for']}")
    print(f" Example: {info['example']}")
    print()
Building a Base Agent Class
@dataclass
class AgentMessage:
    from_agent: str
    to_agent: str
    content: str
    message_type: str = "task"
    metadata: Dict = field(default_factory=dict)


class BaseAgent:
    """Foundation agent that all specialized agents inherit from."""

    def __init__(self, name: str, role: str, system_prompt: str,
                 model: str = "claude-3-5-haiku-20241022",
                 tools: Optional[List[Dict]] = None):
        self.name = name
        self.role = role
        self.system_prompt = system_prompt
        self.model = model
        self.tools = tools or []
        self.history: List[AgentMessage] = []

    def think(self, message: str,
              context: Optional[List[Dict]] = None,
              max_tokens: int = 1000) -> str:
        # Copy the context so the caller's list is never mutated.
        messages = list(context or [])
        messages.append({"role": "user", "content": message})
        kwargs = {
            "model": self.model,
            "max_tokens": max_tokens,
            "system": self.system_prompt,
            "messages": messages,
        }
        if self.tools:
            kwargs["tools"] = self.tools
        response = client.messages.create(**kwargs)
        if response.stop_reason == "tool_use":
            return self._handle_tool_use(response, messages, max_tokens)
        return self._first_text(response)

    @staticmethod
    def _first_text(response) -> str:
        # The first content block is not guaranteed to be text, so search for one.
        for block in response.content:
            if block.type == "text":
                return block.text
        return ""

    def _handle_tool_use(self, response, messages, max_tokens):
        # Echo the assistant turn, run each requested tool, then send the results back.
        messages.append({"role": "assistant", "content": response.content})
        tool_results = []
        for block in response.content:
            if block.type == "tool_use":
                result = self._execute_tool(block.name, block.input)
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": json.dumps(result)
                })
        messages.append({"role": "user", "content": tool_results})
        # Only one round of tool use is handled; a further tool request in
        # `final` is not executed.
        final = client.messages.create(
            model=self.model, max_tokens=max_tokens,
            system=self.system_prompt, messages=messages,
            tools=self.tools
        )
        return self._first_text(final)

    def _execute_tool(self, tool_name: str, tool_input: Dict) -> Any:
        # Subclasses override this to provide real tools.
        return {"error": f"Tool {tool_name} not implemented in {self.name}"}

    def __repr__(self):
        return f"Agent({self.name}, role={self.role})"


print("BaseAgent class built.")
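The patterns that follow only rely on an agent exposing `name`, `role`, and a `think(message, context, max_tokens)` method. One way to exercise the coordination logic without spending tokens is a stand-in with the same interface. The sketch below is an assumption rather than part of the original code: `StubAgent`, its `replies` list, and its `calls` log are hypothetical names introduced for illustration.

```python
from typing import Dict, List, Optional


class StubAgent:
    """Hypothetical offline stand-in that mimics BaseAgent.think() without API calls."""

    def __init__(self, name: str, role: str, replies: Optional[List[str]] = None):
        self.name = name
        self.role = role
        self.replies = list(replies or [])  # canned replies, returned in order
        self.calls: List[str] = []          # every message received, for assertions

    def think(self, message: str,
              context: Optional[List[Dict]] = None,
              max_tokens: int = 1000) -> str:
        self.calls.append(message)
        if self.replies:
            return self.replies.pop(0)
        # No canned reply left: return a short deterministic echo of the input.
        return f"[{self.name}] handled: {message[:40]}"


if __name__ == "__main__":
    drafter = StubAgent("Drafter", "draft", replies=["rough draft"])
    editor = StubAgent("Editor", "edit")
    text = editor.think(drafter.think("Explain backpropagation"))
    print(text)  # prints "[Editor] handled: rough draft"
```

Because the pattern classes are duck-typed, a stub like this can probably be dropped into a pipeline or review loop to check control flow before switching to real agents.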
Pattern 1: Orchestrator-Worker
import re


class OrchestratorAgent(BaseAgent):
    """Breaks down complex goals and delegates to specialized workers."""

    def __init__(self, workers: List[BaseAgent]):
        super().__init__(
            name="Orchestrator",
            role="coordinator",
            system_prompt=f"""You are an orchestrator that delegates tasks to specialized agents.
Available workers:
{self._format_workers(workers)}
To delegate a task, respond with JSON:
{{
"delegations": [
{{"agent": "agent_name", "task": "specific task description", "priority": 1}},
...
],
"execution": "sequential" or "parallel"
}}
After receiving worker results, synthesize them into a final coherent answer."""
        )
        self.workers = {w.name: w for w in workers}

    def _format_workers(self, workers):
        return "\n".join(f"- {w.name} ({w.role}): handles {w.role} tasks"
                         for w in workers)

    def run(self, goal: str, verbose: bool = True) -> str:
        if verbose:
            print(f"\n{'='*60}")
            print(f"Orchestrator Goal: {goal}")
            print(f"{'='*60}")
        plan_prompt = f"""Goal: {goal}
Create a delegation plan. Which agents should handle which parts?
Respond with the JSON delegation format."""
        plan_json = self.think(plan_prompt)
        # Fallback plan: hand the whole goal to the first worker.
        fallback = {"delegations": [{"agent": list(self.workers.keys())[0],
                                     "task": goal, "priority": 1}],
                    "execution": "sequential"}
        try:
            plan = json.loads(plan_json)
        except json.JSONDecodeError:
            # The model may wrap the JSON in prose; try to pull out the object.
            match = re.search(r'\{.*\}', plan_json, re.DOTALL)
            try:
                plan = json.loads(match.group()) if match else fallback
            except json.JSONDecodeError:
                plan = fallback
        if verbose:
            print(f"\nPlan: {plan.get('execution', 'sequential')} execution")
            for d in plan.get("delegations", []):
                print(f" → {d.get('agent')}: {str(d.get('task', ''))[:60]}")
        # Workers run one after another here; the plan's "execution" field is
        # reported above but not acted on.
        worker_results = {}
        for delegation in plan.get("delegations", []):
            agent_name = delegation.get("agent")
            task = delegation.get("task", goal)
            if agent_name in self.workers:
                if verbose:
                    print(f"\n[{agent_name}] working on: {task[:50]}...")
                result = self.workers[agent_name].think(task)
                worker_results[agent_name] = result
                if verbose:
                    print(f"[{agent_name}] done: {result[:100]}...")
        synthesis_prompt = f"""Original goal: {goal}
Worker results:
{json.dumps(worker_results, indent=2)}
Synthesize these results into a single, coherent, well-structured answer."""
        final_answer = self.think(synthesis_prompt)
        return final_answer
research_agent = BaseAgent(
    name="Researcher",
    role="research",
    system_prompt="""You are a research specialist. Your job is to find and synthesize information.
Always cite sources, be thorough, and organize findings clearly.
Present information as bullet points with key facts highlighted."""
)
writer_agent = BaseAgent(
    name="Writer",
    role="writing",
    system_prompt="""You are a technical writer. Your job is to turn research into clear, engaging prose.
Write in an accessible but precise style.
Structure content with clear headings and logical flow.
Target audience: developers and data scientists."""
)
critic_agent = BaseAgent(
    name="Critic",
    role="review",
    system_prompt="""You are a critical reviewer. Your job is to find flaws and gaps.
Be constructive but rigorous. Identify:
- Factual errors or unsupported claims
- Missing important information
- Unclear or confusing passages
- Structural improvements needed
Score quality 1-10 and explain your rating."""
)
orchestrator = OrchestratorAgent(
    workers=[research_agent, writer_agent, critic_agent]
)
print("\nOrchestrator-Worker system ready.")
print(f"Workers: {list(orchestrator.workers.keys())}")
result = orchestrator.run(
    "Explain the key differences between BERT and GPT, including their architectures, "
    "training objectives, and best use cases.",
    verbose=True
)
print(f"\nFinal Answer:\n{result[:500]}...")
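The orchestrator depends on the model returning a well-formed delegation plan, and model output can name agents that do not exist or omit fields. A minimal sketch of validating the plan before dispatching work is shown below. `parse_delegation_plan` is a hypothetical helper, not part of the original class; it assumes the JSON format described in the orchestrator's system prompt.

```python
import json
import re
from typing import Dict, List


def parse_delegation_plan(text: str, known_agents: List[str]) -> Dict:
    """Extract a delegation plan from model output and drop unusable entries."""
    try:
        plan = json.loads(text)
    except json.JSONDecodeError:
        # The JSON may be surrounded by prose; grab the outermost {...} span.
        match = re.search(r"\{.*\}", text, re.DOTALL)
        try:
            plan = json.loads(match.group()) if match else {}
        except json.JSONDecodeError:
            plan = {}
    if not isinstance(plan, dict):
        plan = {}

    raw = plan.get("delegations")
    if not isinstance(raw, list):
        raw = []

    delegations = []
    for d in raw:
        if not isinstance(d, dict):
            continue
        task = d.get("task")
        if d.get("agent") not in known_agents:
            continue  # unknown worker name
        if not isinstance(task, str) or not task.strip():
            continue  # missing or empty task
        priority = d.get("priority", 1)
        if not isinstance(priority, (int, float)):
            priority = 1
        delegations.append({"agent": d["agent"], "task": task, "priority": priority})
    delegations.sort(key=lambda d: d["priority"])  # lowest number runs first

    execution = plan.get("execution")
    if execution not in ("sequential", "parallel"):
        execution = "sequential"
    return {"delegations": delegations, "execution": execution}


if __name__ == "__main__":
    raw_output = (
        'Here is the plan:\n'
        '{"delegations": ['
        '{"agent": "Writer", "task": "Draft it", "priority": 2}, '
        '{"agent": "Ghost", "task": "??"}, '
        '{"agent": "Researcher", "task": "Find sources", "priority": 1}], '
        '"execution": "parallel"}'
    )
    plan = parse_delegation_plan(raw_output, ["Researcher", "Writer", "Critic"])
    print([d["agent"] for d in plan["delegations"]])  # prints ['Researcher', 'Writer']
```

Treating priority 1 as "run first" is a design assumption here; the original prompt does not say how priorities should be ordered.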
Pattern 2: Sequential Pipeline
class Pipeline:
    """Agents run in sequence, output flows to next agent as input."""

    def __init__(self, agents: List[BaseAgent], verbose: bool = True):
        self.agents = agents
        self.verbose = verbose
        self.outputs = {}

    def run(self, initial_input: str) -> str:
        current = initial_input
        for i, agent in enumerate(self.agents):
            if self.verbose:
                print(f"\n[Stage {i+1}/{len(self.agents)}] {agent.name}")
                print(f" Input: {current[:80]}...")
            # The first stage gets the raw input; later stages get the previous
            # output plus their own role as the task description.
            prompt = (
                f"Previous stage output:\n{current}\n\nYour task: {agent.role}"
                if i > 0 else current
            )
            current = agent.think(prompt)
            self.outputs[agent.name] = current
            if self.verbose:
                print(f" Output: {current[:80]}...")
        return current


draft_agent = BaseAgent(
    name="Drafter",
    role="Write a first draft. Do not worry about perfection, focus on getting ideas down.",
    system_prompt="You are a first-draft writer. Write quickly and completely. Cover all the key points."
)
editor_agent = BaseAgent(
    name="Editor",
    role="Edit the draft for clarity, concision, and flow. Fix any awkward sentences.",
    system_prompt="You are a skilled editor. Improve clarity and remove redundancy while preserving meaning."
)
formatter_agent = BaseAgent(
    name="Formatter",
    role="Format the edited content with proper markdown, headers, and structure.",
    system_prompt="You are a content formatter. Add appropriate markdown formatting, headers, and bullet points."
)
pipeline = Pipeline(
    agents=[draft_agent, editor_agent, formatter_agent],
    verbose=True
)
print("\nSequential Pipeline: Draft → Edit → Format")
final = pipeline.run(
    "Write a brief explanation of how neural networks learn through backpropagation.")
print(f"\nFinal formatted output:\n{final[:400]}...")
Pattern 3: Parallel Execution
import concurrent.futures
import threading


class ParallelAgentRunner:
    """Run multiple agents simultaneously on independent subtasks."""

    def __init__(self, agents_and_tasks: List[tuple],
                 max_workers: int = 4, verbose: bool = True):
        self.agents_and_tasks = agents_and_tasks
        self.max_workers = max_workers
        self.verbose = verbose
        self._lock = threading.Lock()  # keeps progress lines from interleaving

    def run(self) -> Dict[str, str]:
        results = {}
        start = time.time()

        def run_agent(agent_task_pair):
            agent, task = agent_task_pair
            if self.verbose:
                with self._lock:
                    print(f" → [{agent.name}] started: {task[:50]}...")
            result = agent.think(task)
            if self.verbose:
                with self._lock:
                    print(f" ✓ [{agent.name}] done ({time.time()-start:.1f}s)")
            return agent.name, result

        with concurrent.futures.ThreadPoolExecutor(
            max_workers=self.max_workers
        ) as executor:
            futures = {executor.submit(run_agent, pair): pair
                       for pair in self.agents_and_tasks}
            for future in concurrent.futures.as_completed(futures):
                name, result = future.result()
                results[name] = result
        elapsed = time.time() - start
        if self.verbose:
            print(f"\nAll agents completed in {elapsed:.1f}s total")
        return results


asia_agent = BaseAgent("Asia_Researcher", "researcher",
                       "You research the Asian tech market. Focus on China, Japan, South Korea, India.")
europe_agent = BaseAgent("Europe_Researcher", "researcher",
                         "You research the European tech market. Focus on UK, Germany, France, Nordics.")
us_agent = BaseAgent("US_Researcher", "researcher",
                     "You research the US tech market. Focus on Silicon Valley, NYC, emerging hubs.")
topic = "the adoption and trends in AI/ML technology in 2024"
parallel_runner = ParallelAgentRunner(
    agents_and_tasks=[
        (asia_agent, f"Research {topic} in Asia"),
        (europe_agent, f"Research {topic} in Europe"),
        (us_agent, f"Research {topic} in the United States"),
    ],
    verbose=True
)
print("\nParallel Execution: 3 regional researchers running simultaneously")
parallel_results = parallel_runner.run()
synthesizer = BaseAgent(
    name="Synthesizer",
    role="synthesis",
    system_prompt="You synthesize multiple research reports into one coherent global overview."
)
global_report = synthesizer.think(
    "Synthesize these regional research reports into a global overview:\n\n" +
    "\n\n".join(f"=== {name} ===\n{result}"
                for name, result in parallel_results.items())
)
print(f"\nGlobal synthesis:\n{global_report[:400]}...")
Pattern 4: Debate Agent
class DebateSystem:
    """Two agents argue opposing sides, a judge evaluates."""

    def __init__(self, model: str = "claude-3-5-haiku-20241022"):
        self.proposer = BaseAgent(
            name="Proposer",
            role="advocate",
            system_prompt="""You are an advocate for the proposition.
Make the strongest possible case FOR the position you are assigned.
Use evidence, logic, and compelling arguments. Be persuasive.""",
            model=model
        )
        self.opponent = BaseAgent(
            name="Opponent",
            role="critic",
            system_prompt="""You are a critic of the proposition.
Make the strongest possible case AGAINST the position presented.
Find flaws, gaps, counterexamples, and alternative views. Be rigorous.""",
            model=model
        )
        self.judge = BaseAgent(
            name="Judge",
            role="arbitrator",
            system_prompt="""You are an impartial judge evaluating a debate.
Assess both sides fairly. Identify the strongest arguments from each side.
Make a reasoned final verdict with clear justification.
Format: [FOR arguments] [AGAINST arguments] [Verdict] [Reasoning]""",
            model=model
        )

    def debate(self, proposition: str, rounds: int = 2,
               verbose: bool = True) -> Dict:
        if verbose:
            print(f"\nDebate: '{proposition}'")
            print("=" * 60)
        # Each debater keeps its own history. Every turn is stored as a
        # user/assistant pair so the history starts with a user message and
        # the roles alternate when it is replayed in later rounds.
        context_p = []
        context_o = []
        for_args = []
        against_args = []
        for round_num in range(1, rounds + 1):
            if verbose:
                print(f"\n--- Round {round_num} ---")
            prop_prompt = f"Round {round_num}: Argue FOR: '{proposition}'"
            if against_args:
                prop_prompt += f"\n\nOpponent says: {against_args[-1]}"
            prop_arg = self.proposer.think(prop_prompt, context=context_p)
            context_p.append({"role": "user", "content": prop_prompt})
            context_p.append({"role": "assistant", "content": prop_arg})
            for_args.append(prop_arg)
            if verbose:
                print(f"FOR: {prop_arg[:150]}...")
            opp_prompt = (f"Round {round_num}: Argue AGAINST '{proposition}' "
                          f"by countering this argument:\n{prop_arg}")
            opp_arg = self.opponent.think(opp_prompt, context=context_o)
            context_o.append({"role": "user", "content": opp_prompt})
            context_o.append({"role": "assistant", "content": opp_arg})
            against_args.append(opp_arg)
            if verbose:
                print(f"AGAINST: {opp_arg[:150]}...")
        all_args = "\n\n".join(
            [f"FOR:\n{arg}" for arg in for_args] +
            [f"AGAINST:\n{arg}" for arg in against_args]
        )
        verdict = self.judge.think(
            f"Proposition: '{proposition}'\n\nDebate arguments:\n{all_args}\n\nDeliver your verdict.")
        if verbose:
            print(f"\nJudge's Verdict:\n{verdict[:300]}...")
        return {
            "proposition": proposition,
            "for_arguments": for_args,
            "against_arguments": against_args,
            "verdict": verdict
        }


debate = DebateSystem()
result = debate.debate(
    proposition="Large Language Models will replace most software engineering jobs within 10 years",
    rounds=1,
    verbose=True
)
Pattern 5: Critic-Review Loop
import re


class CriticReviewLoop:
    """Creator produces, critic evaluates, loop until quality threshold met."""

    def __init__(self, creator: BaseAgent, critic: BaseAgent,
                 max_iterations: int = 3, quality_threshold: float = 8.0):
        self.creator = creator
        self.critic = critic
        self.max_iterations = max_iterations
        self.quality_threshold = quality_threshold

    def run(self, task: str, verbose: bool = True) -> Dict:
        history = []
        feedback = ""
        content = ""
        iteration = 0
        for iteration in range(1, self.max_iterations + 1):
            if verbose:
                print(f"\n--- Iteration {iteration} ---")
            creation_prompt = (
                f"{task}\n\nFeedback from previous attempt:\n{feedback}\nImprove accordingly."
                if feedback else task
            )
            content = self.creator.think(creation_prompt)
            history.append({"iteration": iteration, "content": content})
            if verbose:
                print(f"[{self.creator.name}]: {content[:120]}...")
            critique = self.critic.think(
                "Evaluate this content. Begin with a line 'Score: N/10', "
                f"then give feedback:\n\n{content}"
            )
            if verbose:
                print(f"[{self.critic.name}]: {critique[:120]}...")
            # Heuristic score parsing: prefer an explicit "N/10", then a number
            # right after the word "score". Taking the first bare number in the
            # critique is unreliable (it may be a list index or the "1" in "1-10").
            score_match = (
                re.search(r'\b(\d{1,2}(?:\.\d+)?)\s*/\s*10\b', critique)
                or re.search(r'(?i)\bscore\b\D{0,15}?(\d{1,2}(?:\.\d+)?)(?!\s*-\s*10)', critique)
            )
            # If no score is found, 7.0 falls below the default threshold, so the loop continues.
            score = float(score_match.group(1)) if score_match else 7.0
            if score >= self.quality_threshold:
                if verbose:
                    print(f"\n✓ Quality threshold reached (score={score})")
                break
            feedback = critique
        return {
            "final_content": content,
            "iterations": iteration,
            "history": history
        }


code_writer = BaseAgent(
    name="CodeWriter", role="code_creator",
    system_prompt="You write clean, well-documented Python code. Include docstrings and type hints.")
code_reviewer = BaseAgent(
    name="CodeReviewer", role="code_critic",
    system_prompt="""You review Python code rigorously. Check for:
- Correctness and edge cases
- Code clarity and documentation
- PEP 8 compliance
- Error handling
Score 1-10 and give specific actionable feedback.""")
review_loop = CriticReviewLoop(
    creator=code_writer,
    critic=code_reviewer,
    max_iterations=3,
    quality_threshold=8.0
)
print("\nCritic-Review Loop: write and improve code iteratively")
result = review_loop.run(
    "Write a Python function that finds the longest palindrome substring in a string.")
print(f"\nFinal code after {result['iterations']} iteration(s):")
print(result["final_content"][:400])
When Multi-Agent Adds Real Value
print("\nWhen to Use Multi-Agent Systems:")
print()
use_cases = {
    "Use multi-agent when": [
        "Tasks naturally decompose into specialized subtasks",
        "Quality requires multiple independent perspectives",
        "Parallel execution would save significant time",
        "Different parts of the task need different 'personalities' or constraints",
        "One agent's output quality is not good enough and critique helps",
        "Tasks exceed a single context window",
    ],
    "Stick with single agent when": [
        "Task is straightforward and fits one context window",
        "Coordination overhead would outweigh the benefits",
        "You need predictable, debuggable behavior",
        "Latency is critical (multi-agent adds round trips)",
        "Budget is tight (each agent call costs tokens)",
        "You are still prototyping (complexity kills iteration speed)",
    ],
}
for category, points in use_cases.items():
    print(f" {category}:")
    for point in points:
        print(f" {'✓' if 'Use' in category else '✗'} {point}")
    print()
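The latency and budget points can be made concrete by counting model calls per pattern. The sketch below derives the counts from how the classes in this post are written (for example, the orchestrator makes one planning call, one call per delegation, and one synthesis call). The function names are hypothetical, and the counts ignore the extra call that tool use adds, so treat them as rough lower bounds rather than measurements.

```python
def orchestrator_calls(num_delegations: int) -> int:
    """One planning call, one call per delegation, one synthesis call."""
    return 1 + num_delegations + 1


def pipeline_calls(num_stages: int) -> int:
    """One call per stage."""
    return num_stages


def parallel_calls(num_agents: int, with_synthesis: bool = True) -> int:
    """One call per agent, plus an optional synthesizer call."""
    return num_agents + (1 if with_synthesis else 0)


def debate_calls(rounds: int) -> int:
    """Proposer and opponent each speak once per round, then the judge rules."""
    return 2 * rounds + 1


def critic_loop_calls(max_iterations: int) -> int:
    """Upper bound: one creator call and one critic call per iteration."""
    return 2 * max_iterations


if __name__ == "__main__":
    estimates = {
        "Orchestrator (3 delegations)": orchestrator_calls(3),
        "Pipeline (3 stages)": pipeline_calls(3),
        "Parallel (3 agents + synthesis)": parallel_calls(3),
        "Debate (2 rounds)": debate_calls(2),
        "Critic loop (max 3 iterations)": critic_loop_calls(3),
    }
    for name, calls in estimates.items():
        print(f"{name:<34} {calls} calls")
```

A single-agent baseline is one call, so these numbers give a quick sense of the multiplier before any coordination benefit is counted. Only the parallel pattern overlaps its calls in time; the others wait on each call in turn.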
Reference Links
print("Essential Multi-Agent Reference Links:")
print()
refs = {
    "Papers": [
        ("Society of Mind (Minsky, 1986)", "en.wikipedia.org/wiki/Society_of_Mind"),
        ("LLM-based Multi-Agent Survey", "arxiv.org/abs/2402.01680"),
        ("AutoGen: Multi-agent conversations", "arxiv.org/abs/2308.08155"),
        ("MetaGPT: Meta programming agents", "arxiv.org/abs/2308.00352"),
        ("ChatDev: Software development agents", "arxiv.org/abs/2307.07924"),
    ],
    "Frameworks": [
        ("AutoGen (Microsoft)", "github.com/microsoft/autogen"),
        ("CrewAI", "crewai.com"),
        ("LangGraph (stateful graphs)", "langchain-ai.github.io/langgraph"),
        ("Semantic Kernel (Microsoft)", "learn.microsoft.com/semantic-kernel"),
        ("Agency Swarm", "github.com/VRSEN/agency-swarm"),
        ("Camel-AI", "github.com/camel-ai/camel"),
    ],
    "Tutorials": [
        ("Anthropic multi-agent cookbook", "github.com/anthropics/anthropic-cookbook/tree/main/patterns/agents"),
        ("DeepLearning.AI Multi-agent course", "learn.deeplearning.ai/multi-ai-agent-systems"),
        ("LangGraph multi-agent tutorial", "langchain-ai.github.io/langgraph/tutorials"),
        ("AutoGen docs and examples", "microsoft.github.io/autogen"),
    ],
    "Blog Posts": [
        ("Lilian Weng: LLM Powered Autonomous Agents", "lilianweng.github.io/posts/2023-06-23-agent"),
        ("Andrej Karpathy: Software 2.0", "karpathy.medium.com/software-2-0-a64152b37c35"),
        ("Anthropic: Building effective agents", "anthropic.com/research/building-effective-agents"),
    ],
}
for category, links in refs.items():
    print(f" {category}:")
    for name, url in links:
        print(f" • {name:<48} {url}")
    print()
Try This
Create multi_agent_practice.py.
Part 1: implement the orchestrator-worker pattern from scratch. Create three specialized agents: a researcher (mock web search), a summarizer, and a formatter. Give the orchestrator a goal like "Research and summarize the key concepts of reinforcement learning." Verify it delegates appropriately.
Part 2: build a sequential pipeline with four stages. Stage 1: brainstorm 10 ideas for a blog post on a technical topic. Stage 2: select the best three and outline each. Stage 3: write one paragraph for each. Stage 4: format into a complete post with headings.
Part 3: implement the critic-review loop. Write a code generation task (sort algorithm, data structure, utility function). Run 3 iterations of write-critique-improve. Does the code quality measurably improve across iterations?
Part 4: debate two real technical positions. Example: "Python is better than JavaScript for backend development." Run two rounds. Print both sides' arguments and the judge's verdict. Does the debate surface arguments you had not considered?
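For the mock web search in Part 1, one possible starting point is sketched below. The tool definition follows the shape the Anthropic Messages API uses for custom tools (a name, a description, and a JSON Schema under `input_schema`). Everything else is an assumption for illustration: `mock_web_search`, `execute_tool`, and the placeholder results are hypothetical and return no real data.

```python
from typing import Any, Dict, List

# Custom tool definition: name, description, and a JSON Schema for the input.
MOCK_SEARCH_TOOL: Dict[str, Any] = {
    "name": "mock_web_search",
    "description": "Search the web and return a list of result snippets.",
    "input_schema": {
        "type": "object",
        "properties": {
            "query": {"type": "string", "description": "Search query"},
        },
        "required": ["query"],
    },
}


def mock_web_search(query: str, max_results: int = 3) -> List[Dict[str, str]]:
    """Return obviously fake, deterministic results so no network access is needed."""
    return [
        {
            "title": f"Mock result {i} for '{query}'",
            "url": f"https://example.com/mock/{i}",
            "snippet": f"Placeholder snippet {i}; replace with real search output.",
        }
        for i in range(1, max_results + 1)
    ]


def execute_tool(tool_name: str, tool_input: Dict[str, Any]) -> Any:
    """Dispatch by tool name, using the same arguments as BaseAgent._execute_tool."""
    if tool_name == "mock_web_search":
        return mock_web_search(tool_input["query"])
    return {"error": f"Tool {tool_name} not implemented"}


if __name__ == "__main__":
    hits = execute_tool("mock_web_search", {"query": "reinforcement learning"})
    print(hits[0]["title"])  # prints "Mock result 1 for 'reinforcement learning'"
```

A researcher agent could then be built by subclassing `BaseAgent`, passing `tools=[MOCK_SEARCH_TOOL]`, and overriding `_execute_tool` to call `execute_tool`.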
What's Next
Agents need memory to be truly useful across sessions. The next post covers agent memory systems: how to store past actions, how to recall relevant past experience, and how to build agents that improve over time rather than starting fresh every conversation.