
Deepti Reddy

Originally published at Medium

Three Reviewers, Two Rounds, One Verdict: Build a Code Review Debate with Round Robin

By Deepti Reddy | April 2026

This is Part 5 of an 8-part series covering every multi-agent strategy in AgentSpan. Today, we will cover the round robin strategy where agents take turns in a fixed rotation, each building on what the others said.

---

What is AgentSpan

AgentSpan is an orchestration layer for building, bringing, and observing AI agents as durable workflows.

  • Build: define agents with the AgentSpan SDK using Agent, @tool, and 8 multi-agent strategies. Compiles to server-side workflows that survive crashes.

  • Bring: already using an agent framework such as LangGraph, OpenAI Agents SDK, or Google ADK? Pass your agents directly to run(). AgentSpan adds durability and orchestration on top.

  • Observe: every execution is inspectable in the dashboard. See agent flows, inputs/outputs, tool calls, and token usage. Debug failures, replay runs.

Setup

Two commands:

pip install agentspan
agentspan server start

This gives you a local AgentSpan server with a visual dashboard at localhost:6767.

---

In Part 1, we built a sequential pipeline where each agent’s output feeds the next. In Part 2, we built a parallel code review with three reviewers running simultaneously. In Part 3, the LLM decided which agent runs. In Part 4, agents transferred between each other peer-to-peer.

Every strategy so far either runs agents once or lets someone (a parent, a classifier, or the agents themselves) decide who goes next. But what about tasks where you want every agent to weigh in, then weigh in again — a back-and-forth discussion where each round builds on the last?

That is the round robin strategy. No routing. No decisions about who goes next. Just a fixed rotation: A, B, C, A, B, C. Each agent sees the full conversation history — what every other agent said before them — and builds on it.
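Stripped of everything else, the mechanic is small enough to sketch in plain Python. This is illustrative only, not AgentSpan internals; the stub agents below just report how much history they can see:

```python
# Illustrative sketch of a round robin: a fixed rotation over one
# shared, growing conversation history (not AgentSpan's real code).
def round_robin(agents, task, max_turns):
    history = [("user", task)]
    for turn in range(max_turns):
        agent = agents[turn % len(agents)]   # A, B, C, A, B, C, ...
        reply = agent(history)               # each agent sees everything so far
        history.append((agent.__name__, reply))
    return history

# Stub "agents" that just count what they have seen:
def architect(history): return f"architect saw {len(history)} messages"
def security(history): return f"security saw {len(history)} messages"
def pragmatist(history): return f"pragmatist saw {len(history)} messages"

transcript = round_robin([architect, security, pragmatist], "review this PR", 6)
# 1 task message + 6 turns; each agent speaks exactly twice,
# and each later turn sees a strictly longer history.
```

The whole strategy is the `turn % len(agents)` line plus the shared `history` list; everything else in this post is about what you put in the rotation.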

What we are building

A code review debate with three reviewers who take turns:

  1. Architect: focuses on design, structure, scalability

  2. Security Reviewer: focuses on vulnerabilities, data exposure, insecure defaults

  3. Pragmatist: focuses on what actually matters for shipping — minimum fix, what can wait

They review the same code. But unlike parallel (Part 2), where each reviewer works independently, round robin reviewers read each other’s comments and respond to them. The architect raises a design concern. The security reviewer agrees and adds a vulnerability. The pragmatist pushes back — “that is a follow-up PR, not a blocker.” Two rounds of this, then a summarizer produces the verdict.

Round 1: [Architect] -> [Security] -> [Pragmatist]
Round 2: [Architect] -> [Security] -> [Pragmatist]
                            |
                      [Summarizer] -> APPROVE / REQUEST CHANGES

How is this different from parallel?

In parallel (Part 2), all three reviewers run at the same time on the same input. They never see each other’s work. You get three independent opinions.

In round robin, reviewers take turns. Each one sees the full conversation — including what previous reviewers said. The second-round architect can respond to what the pragmatist said in round one. It is a discussion, not three monologues.

Defining the reviewers

Three agents, each with a different perspective:

from agentspan import Agent  # assuming the SDK exposes Agent at the top level

architect = Agent(
    name="architect",
    model="openai/gpt-4o",
    instructions=(
        "You are a software architect reviewing code. Focus on:\n"
        "- Design patterns and structure\n"
        "- Separation of concerns\n"
        "- Scalability and maintainability\n\n"
        "Read what other reviewers said before you. Build on their points, "
        "don't repeat them. Keep your response to 2-3 paragraphs."
    ),
)

security_reviewer = Agent(
    name="security_reviewer",
    model="openai/gpt-4o",
    instructions=(
        "You are a security engineer reviewing code. Focus on:\n"
        "- Injection vulnerabilities (SQL, command, path traversal)\n"
        "- Authentication and authorization gaps\n"
        "- Data exposure and insecure defaults\n\n"
        "Read what other reviewers said before you. Build on their points, "
        "don't repeat them. Keep your response to 2-3 paragraphs."
    ),
)

pragmatist = Agent(
    name="pragmatist",
    model="openai/gpt-4o",
    instructions=(
        "You are a senior engineer who values shipping. Focus on:\n"
        "- Is this good enough to merge today?\n"
        "- What is the minimum fix needed?\n"
        "- What can wait for a follow-up PR?\n\n"
        "Push back on over-engineering. Read what other reviewers said "
        "and decide what actually matters for this PR. "
        "Keep your response to 2-3 paragraphs."
    ),
)

The key instruction in each: “Read what other reviewers said before you. Build on their points, don’t repeat them.” This is what makes round robin a discussion, not a repetition. Each agent knows it is part of a conversation.

The summarizer

After the debate, a summarizer reads the entire transcript and produces a verdict:

summarizer = Agent(
    name="summarizer",
    model="openai/gpt-4o",
    instructions=(
        "You observed a code review discussion between an architect, "
        "a security reviewer, and a pragmatist. Produce a final verdict:\n\n"
        "1. APPROVE, REQUEST CHANGES, or NEEDS DISCUSSION\n"
        "2. Must-fix items (block merge)\n"
        "3. Nice-to-have items (follow-up PR)\n"
        "4. One-sentence summary\n\n"
        "Be decisive. Don't hedge."
    ),
)

The round robin strategy

from agentspan import Strategy  # assuming Strategy lives alongside Agent

review = Agent(
    name="code_review_round_robin",
    model="openai/gpt-4o",
    agents=[architect, security_reviewer, pragmatist],
    strategy=Strategy.ROUND_ROBIN,
    max_turns=6,
)

pipeline = review >> summarizer

max_turns=6 means 6 total turns: architect, security, pragmatist, architect, security, pragmatist. Two full rounds. Then >> pipes the discussion transcript to the summarizer for a verdict.
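The turn-to-agent mapping is pure arithmetic, which you can check in a couple of lines (illustrative, independent of the SDK):

```python
# Who speaks on each of the 6 turns, and how many full rounds that makes.
agents = ["architect", "security_reviewer", "pragmatist"]
max_turns = 6

order = [agents[turn % len(agents)] for turn in range(max_turns)]
rounds = max_turns // len(agents)

print(order)   # the three reviewers, twice over, in fixed order
print(rounds)  # 2 full rounds
```

If `max_turns` is not a multiple of the agent count, the last round is simply cut short; the rotation never skips or reorders anyone.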

Compare all six strategies:

# Sequential — fixed order, each runs once
pipeline = a >> b >> c

# Parallel — all run at once
team = Agent(agents=[a, b, c], strategy=Strategy.PARALLEL)

# Handoff — parent LLM picks one
triage = Agent(agents=[a, b, c], strategy=Strategy.HANDOFF)

# Router — classifier picks one
triage = Agent(agents=[a, b, c], strategy=Strategy.ROUTER, router=classifier)

# Swarm — agents transfer between each other
team = Agent(agents=[a, b, c], strategy=Strategy.SWARM)

# Round Robin — fixed rotation, agents respond to each other
debate = Agent(agents=[a, b, c], strategy=Strategy.ROUND_ROBIN, max_turns=6)

Same Agent class. Different strategy. Different behavior.

The code to review

A short Python snippet with a mix of design, security, and pragmatic issues:

import sqlite3
import os

def get_user(db_path, user_id):
    conn = sqlite3.connect(db_path)
    query = f"SELECT * FROM users WHERE id = {user_id}"
    result = conn.execute(query).fetchone()
    conn.close()
    return result

def save_upload(filename, data):
    path = f"/uploads/{filename}"
    with open(path, "wb") as f:
        f.write(data)
    os.chmod(path, 0o777)
    return path

def process_payment(amount, card_number):
    print(f"Processing ${amount} on card {card_number}")
    return {"status": "ok", "amount": amount}

Three functions. SQL injection. Path traversal. Card number in logs. File permissions wide open. No error handling. But also — it is short, it works, and someone wants to merge it. The three reviewers will have very different opinions about what to do.

Running it

code = "..."  # the snippet above, as a string

with AgentRuntime() as runtime:
    result = runtime.run(pipeline, code)
    result.print_result()

What happens

Round 1:

The architect goes first. Sees the raw code. Flags the lack of abstraction — no database layer, no upload service, direct file system access. Suggests separating concerns.

The security reviewer goes second. Sees the architect’s comments AND the code. Agrees on the design issues, then adds the critical security findings: SQL injection in get_user, path traversal in save_upload, card number logged in plaintext by process_payment, 0o777 permissions.

The pragmatist goes third. Sees both previous reviews. Agrees the SQL injection and card logging are blockers — those must be fixed before merge. But pushes back on the architecture refactoring: “That is a follow-up PR. The code works. Fix the security issues, merge, and refactor next sprint.”

Round 2:

The architect responds to the pragmatist: concedes the full refactor can wait, but insists on at minimum extracting the database connection into a context manager.

The security reviewer doubles down: the path traversal and 0o777 are just as critical as the SQL injection. All three security issues must be fixed, not just the SQL one.

The pragmatist agrees on all three security fixes, accepts the context manager suggestion, and calls it: “Fix those four things and this is ready to merge.”

Summarizer reads the whole transcript and produces the verdict.

This is the value of round robin over parallel. In parallel (Part 2), you get three independent opinions and have to synthesize them yourself. In round robin, the agents synthesize for you — they debate, converge, and the summarizer captures the consensus.

Round Robin vs Parallel: when to use which

Use parallel when speed matters and independence is a feature. Use round robin when convergence matters and you want agents to challenge each other.

How durability works

Same as every previous part. The round robin compiles into a durable DoWhile loop on the server. In the A, B, C rotation, turn 4 is the architect opening round two. If your process crashes after turn 4 (the round-two architect just finished):

  1. Turns 1–4 are persisted on the server.

  2. You restart your script.

  3. Turn 5 (security reviewer, round 2) resumes, without re-running the first 4 turns.

Each turn in the rotation is a durable checkpoint.
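The resume behavior can be mimicked with a toy checkpoint list. This is purely illustrative; AgentSpan's real persistence is server-side, not a Python list:

```python
# Toy illustration of checkpointed turns (not AgentSpan's real storage):
# completed turns are persisted; a restart resumes at the first missing turn.
def run_debate(agents, max_turns, checkpoints):
    start = len(checkpoints)               # these turns are already persisted
    for turn in range(start, max_turns):
        agent = agents[turn % len(agents)]
        checkpoints.append(f"turn {turn + 1}: {agent}")
    return checkpoints

store = []                                  # stands in for server-side state
agents = ["architect", "security", "pragmatist"]

run_debate(agents, 4, store)   # process "crashes" after turn 4
run_debate(agents, 6, store)   # restart: picks up at turn 5, re-runs nothing
```

After the second call, `store` holds exactly six entries and turns 1-4 were appended only once, which is the whole point of the checkpoint.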

Composability

Round robin composes naturally:

# Debate, then summarize
pipeline = review >> summarizer

# Parallel fan-out for data, then round robin debate on findings
research = Agent(agents=[market, technical, financial], strategy=Strategy.PARALLEL)
debate = Agent(agents=[optimist, skeptic], strategy=Strategy.ROUND_ROBIN, max_turns=4)

pipeline = research >> debate >> summarizer

Parallel collects independent research. Round robin debates the findings. Summarizer produces the final recommendation. Three strategies, one pipeline.

Try it

pip install agentspan
agentspan server start
python 06_code_review_debate.py

What’s next

Part 6: Random — a random agent is selected each turn. Not a rotation, not a decision, just pure randomness. Useful for creative brainstorming, load balancing across models, and generating diverse output ensembles. Same Agent class, different strategy.

