Ever wondered what happens when you let AI argue with itself?
I built AI Debate Arena, a terminal app where four AI agents (a moderator, a pro debater, a con debater, and a judge) run a full structured debate on any topic you give them, powered by LangGraph and Groq.
Here's how it works and what I learned building it.
## The Concept
The idea is simple: instead of one AI giving you a balanced answer on a topic, what if multiple agents each had a role and a perspective, and had to argue, rebut, and decide?
Four agents, one state machine:
| Agent | Job |
|---|---|
| Moderator | Introduces the topic, sets the rules, picks who goes first |
| Pro | Argues for the topic every round |
| Con | Rebuts and argues against the topic every round |
| Judge | Reviews the full debate history and declares a winner |
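To make "four agents, one state machine" concrete, here is a minimal sketch of the kind of shared state they could all read and write. The field names are my illustration, not necessarily the actual `State` defined in `agents.py`:

```python
from typing import TypedDict

class State(TypedDict):
    """Shared debate state passed between all four agents (illustrative)."""
    topic: str       # the debate topic entered by the user
    max_rounds: int  # how many pro/con rounds to run
    round: int       # current round counter
    history: list    # list of {"role": ..., "content": ...} turns

# Each agent reads the history so far and appends its own turn.
state: State = {
    "topic": "AI will replace software engineers",
    "max_rounds": 3,
    "round": 0,
    "history": [],
}
state["history"].append({"role": "moderator",
                         "content": "Welcome to the debate."})
```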
## The Stack
- LangGraph for the state machine / agent orchestration
- Groq + llama-3.1-8b-instant for fast LLM inference
- Rich for the live typewriter-style terminal UI
## Project Structure
I split the project across 4 files for clean separation of concerns:
```
debate-arena/
├── main.py         # Entry point, user input, terminal display
├── agents.py       # State definition, LLM, agent functions
├── connections.py  # Graph nodes, edges, routing logic
└── prompts.py      # All prompt templates
```
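To give a feel for what lives in `prompts.py`, here is a hedged sketch of a template for the con debater. The template name, wording, and helper function are my own illustration; the repo's actual prompts will differ:

```python
# Hypothetical prompt template in the style of prompts.py.
CON_PROMPT = (
    "You are the CON debater in round {round} on the topic: {topic}.\n"
    "Rebut the previous argument directly and avoid repetition:\n"
    "{last_argument}"
)

def render_con_prompt(topic: str, round_num: int, last_argument: str) -> str:
    """Fill the template with the current debate state."""
    return CON_PROMPT.format(topic=topic, round=round_num,
                             last_argument=last_argument)
```

Keeping every template in one file like this makes it easy to tune agent behaviour without touching the graph logic.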
## Terminal UI with Rich
After the graph finishes, the full history list is played back with a live typewriter effect using Rich:
```python
from time import sleep

from rich.live import Live
from rich.panel import Panel
from rich.text import Text

def typewriter_panel(role, content):
    colors = {
        "moderator": "cyan",
        "pro": "green",
        "con": "red",
        "judge": "magenta",
    }
    text = Text()
    with Live(Panel(text, title=role.upper(), border_style=colors.get(role, "white")),
              refresh_per_second=30) as live:
        for char in content:
            text.append(char)
            sleep(0.005)
            live.update(Panel(text, title=role.upper(), border_style=colors.get(role, "white")))
```
Each role gets its own colour: cyan for the moderator, green for pro, red for con, magenta for the judge.
## Running It
```bash
pip install -r requirements.txt
python main.py
```

```
Enter the topic: AI will replace software engineers
Enter maximum rounds: 3
```
Then watch the debate unfold in your terminal.
## What I Learned
- **LangGraph's conditional edges are powerful.** Once I understood that routing is just a function that returns a string key, wiring up complex agent flows became intuitive.
- **Shared state is everything.** All four agents read from and write to the same `State` dict. Keeping it well-defined upfront saved a lot of debugging later.
- **Prompt discipline matters.** Telling each agent to "avoid repetition" and "rebut the previous argument" in the prompt made a real difference in output quality.
- **Groq is fast.** Running 3 rounds with 4 agents means 6+ LLM calls per debate; Groq handled this without any noticeable delay.
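The "routing is just a function that returns a string key" idea can be shown without LangGraph at all. Below is a toy stand-in for the graph (node names, state fields, and the loop itself are my illustration, not the repo's `connections.py`) that drives the same moderator, pro/con, judge flow:

```python
def route_after_con(state: dict) -> str:
    """Conditional edge: after the con agent, either loop back or go to the judge."""
    return "judge" if state["round"] >= state["max_rounds"] else "pro"

def run_debate(max_rounds: int) -> list:
    """Tiny hand-rolled state machine mirroring the LangGraph flow."""
    state = {"round": 0, "max_rounds": max_rounds, "history": []}
    node = "moderator"
    while node != "END":
        state["history"].append(node)      # stand-in for an actual LLM call
        if node == "moderator":
            node = "pro"
        elif node == "pro":
            node = "con"
        elif node == "con":
            state["round"] += 1
            node = route_after_con(state)  # the returned string key picks the edge
        else:  # judge
            node = "END"
    return state["history"]
```

In LangGraph proper, `route_after_con` would be passed to `add_conditional_edges`, and the string it returns selects the next node.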
## What's Next
- Save debate transcripts to a file
- Swap in different models per agent
- Build a web UI with Flask or Streamlit
- Add a third "neutral" debater
The full code is on GitHub: github.com/Sripadh-Sujith/debate-arena
If you build something on top of this or have ideas for improvements, drop them in the comments. Happy to discuss!
Thank you!