The Problem with AI Demos
Every "multi-agent" demo I have seen online follows the same pattern: two chatbots exchanging perfectly crafted responses. The conversation flows smoothly because someone wrote it beforehand.
That is not multi-agent AI. That is a screenplay with extra steps.
I wanted to build something different. A system where multiple AI agents with distinct personalities debate real technical decisions — and genuinely disagree with each other.
Architecture
The system is called TechieMates. Four agents, each with a distinct role:
| Agent | Role | Personality |
|---|---|---|
| Tarun | CEO | Speed-to-market, aggressive growth |
| Vibha | Operations | Process-oriented, risk-aware |
| Bunny | Research | Data-driven, thorough analysis |
| Chota | Coder | Technical feasibility, implementation detail |
The pipeline works like this:
- A topic is selected (startup tech stack, go-to-market strategy, pricing model, etc.)
- Each agent receives the topic and the previous agents' responses
- Agents can agree, disagree, build on, or completely reject previous points
- No agent sees a "correct answer" — they form their own positions
Infrastructure
-
Model:
qwen2.5:3bvia Ollama (CPU-only, no GPU) - Bridge: Python HTTP server routing requests to the inference engine
- Viewer: Web UI + REST API on port 7800
-
Storage:
conversations.json— persistent, last 100 kept - Cost: Runs on a single VPS. Under $5/month.
What Actually Happens
After 160 conversations, here are the patterns that emerged without being programmed:
1. Coalition Formation
The researcher and coder frequently team up against the CEO when technical risk is high. The ops lead acts as the swing vote on cost-sensitive decisions.
Nobody programmed coalition behavior. It emerged from giving each agent distinct priorities and letting them argue.
2. Mind-Changing
Agents occasionally change their position mid-conversation. The CEO might push for a fast rollout, then soften after the coder breaks down the actual implementation timeline.
This is not role-playing. The model genuinely updates its position based on new information from other agents.
3. Consensus is Rare
Maybe 20% of conversations end with full agreement. The rest produce minority reports — dissenting opinions that surface risks the majority missed.
This is the feature, not a bug. Most AI systems optimize for consensus. This one surfaces conflict.
Performance
| Metric | Value |
|---|---|
| Agents per conversation | 4-5 |
| Time per agent | ~25 seconds |
| Total conversation time | ~124 seconds |
| Model size | 3B parameters |
| GPU required | No |
| Conversations generated | 160 |
| Cost per conversation | ~$0.003 |
The Key Insight
The value of multi-agent AI is not in agreement. It is in disagreement.
When a single AI gives you an answer, you get one perspective. When multiple AI agents debate, you get the full landscape: the optimistic take, the pessimistic take, the technical reality check, and the operational constraints.
Most teams build consensus engines. I built a system that surfaces conflict.
Try It Yourself
The system runs on open-source tools:
- Ollama for local inference
- Python for the bridge and viewer
- A $5/month VPS
No API keys needed. No cloud dependency. No rate limits.
What is Next
I am working on:
- Adding a "devil's advocate" agent that intentionally takes contrarian positions
- Recording agent confidence scores to track how certainty changes through debate
- Letting external users submit topics via the web UI
Built by Ramagiri Tharun. Part of the tarunai project.
Top comments (0)