Multi-Agent AI: The Architecture Nobody Talks About
Everyone is talking about AI agents. Almost nobody is talking about how to actually architect a system where multiple agents collaborate without stepping on each other.
This is what we figured out building a six-agent production system from scratch.
The Wrong Mental Model
Most people think of multi-agent AI like an org chart:
- Manager agent at the top
- Worker agents below
- Manager delegates, workers execute, results bubble up
This breaks in practice. Here is why:
- The "manager" becomes a bottleneck
- Agents block on each other waiting for routing decisions
- Every task goes through a coordination tax
- Context bloat accumulates in the orchestrator
The org chart model works for sequential tasks. Real work is parallel.
The Wave Architecture
What actually works: waves.
A wave is a set of independent tasks dispatched simultaneously to specialized agents. No agent waits on another. Each agent receives a complete context packet and returns a deliverable.
Wave N:
├── Agent A: [task A, context A] → deliverable A
├── Agent B: [task B, context B] → deliverable B
├── Agent C: [task C, context C] → deliverable C
└── Agent D: [task D, context D] → deliverable D
[All complete]
Wave N+1:
└── Orchestrator synthesizes → dispatches next wave
Key insight: agents are parallel workers, not sequential chat partners.
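A wave maps naturally onto concurrent dispatch. The following is a minimal sketch, not our production code: `Task`, `run_agent`, and `dispatch` are hypothetical names, and `run_agent` is a stand-in for whatever actually invokes an agent.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Task:
    agent: str    # which specialist handles the task
    context: str  # complete context packet; the agent needs nothing else


async def run_agent(task: Task) -> str:
    # Hypothetical stand-in for a real agent call; sleeps to mimic I/O.
    await asyncio.sleep(0.01)
    return f"{task.agent}: deliverable for {task.context}"


async def run_wave(tasks: list[Task]) -> list[str]:
    # Dispatch every task at once. gather returns results in input order,
    # and only returns once all tasks in the wave have completed.
    return await asyncio.gather(*(run_agent(t) for t in tasks))


def dispatch(tasks: list[Task]) -> list[str]:
    # Synchronous entry point the orchestrator could call once per wave.
    return asyncio.run(run_wave(tasks))
```

Because no task reads another task's output, the wall-clock time of a wave is roughly that of its slowest agent rather than the sum of all of them.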
The Three-Layer Stack
Here is the architecture we run in production:
Layer 1: Orchestrator (Atlas)
- Maintains system state and heartbeat
- Plans wave composition
- Receives completion reports
- Decides next wave
- Does NOT execute tasks
Layer 2: God Agents (persistent specialists)
Long-running processes, each with a domain:
- Apollo: content and publishing
- Athena: launch blockers and QA
- Hermes: distribution and delivery
- Hephaestus: infrastructure and deploys
- Ares: research and competitive intel
God agents persist across waves. They have their own memory and session logs.
Layer 3: Hero Agents (ephemeral executors)
Spun up for specific subtasks within a wave. Execute one thing, report back, terminate.
Heroes are cheap and disposable. Gods are persistent and specialized.
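One way to picture the difference, as a sketch under assumptions: a hero behaves like a pure function, while a god agent behaves like an object that keeps its log between calls. The names `hero` and `GodAgent` are illustrative, and having the god agent call the hero directly is an assumption made for brevity.

```python
from dataclasses import dataclass, field


def hero(context: dict) -> dict:
    # Ephemeral: everything needed arrives in the context packet,
    # and nothing survives after the deliverable is returned.
    return {"task": context["task"], "result": f"done: {context['task']}"}


@dataclass
class GodAgent:
    name: str
    domain: str
    session_log: list = field(default_factory=list)  # persists across waves

    def handle(self, task: str) -> dict:
        # Assumption: the god agent hands the subtask to a hero,
        # then records the deliverable in its own session log.
        deliverable = hero({"task": task, "domain": self.domain})
        self.session_log.append(deliverable)
        return deliverable
```

Calling `handle` twice on the same `GodAgent` leaves two entries in its `session_log`, while each `hero` call starts from nothing but its context packet.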
The Communication Protocol Problem
With six agents running simultaneously, the communication overhead is real.
Naive approach: agents write full English summaries to each other.
Result: bloated context, slow reads, ambiguous status.
What we built instead: PAX Protocol — a structured format for inter-agent messages.
FROM: [agent]
TO: [agent]
STATUS: COMPLETE | BLOCKED | IN_PROGRESS
DELIVERABLES: [list]
BLOCKERS: [list or none]
NEXT: [action or none]
Every inter-agent message follows this format. No prose. No hedging. No pleasantries.
Token savings: ~70%. Comprehension: instant.
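A message in this format is easy to validate and parse mechanically. Here is a minimal sketch of an encoder and decoder. The field names come from the format above; everything else is an assumption for illustration, including the `PaxMessage` class, the comma-separated list encoding, and writing `none` for an empty list.

```python
from dataclasses import dataclass, field

STATUSES = {"COMPLETE", "BLOCKED", "IN_PROGRESS"}


@dataclass
class PaxMessage:
    sender: str
    recipient: str
    status: str
    deliverables: list[str] = field(default_factory=list)
    blockers: list[str] = field(default_factory=list)
    next_action: str = "none"

    def __post_init__(self):
        # Reject anything outside the three allowed statuses.
        if self.status not in STATUSES:
            raise ValueError(f"invalid status: {self.status}")

    def encode(self) -> str:
        def join(items: list[str]) -> str:
            # Assumption: lists are comma-separated, empty lists are "none".
            return ", ".join(items) if items else "none"

        return "\n".join([
            f"FROM: {self.sender}",
            f"TO: {self.recipient}",
            f"STATUS: {self.status}",
            f"DELIVERABLES: {join(self.deliverables)}",
            f"BLOCKERS: {join(self.blockers)}",
            f"NEXT: {self.next_action}",
        ])


def decode(text: str) -> PaxMessage:
    fields = {}
    for line in text.strip().splitlines():
        # Split on the first colon only, so values may contain colons.
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()

    def split(value: str) -> list[str]:
        return [] if value == "none" else [v.strip() for v in value.split(",")]

    return PaxMessage(
        sender=fields["FROM"],
        recipient=fields["TO"],
        status=fields["STATUS"],
        deliverables=split(fields["DELIVERABLES"]),
        blockers=split(fields["BLOCKERS"]),
        next_action=fields["NEXT"],
    )
```

The comma-separated encoding breaks if a deliverable name itself contains a comma, so a real implementation would need an escaping rule or a different list syntax.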
State Management
Multi-agent systems have a state problem: who knows what?
Our solution:
- Orchestrator owns global state (heartbeat file, wave log)
- God agents own domain state (session logs, deliverable manifests)
- Heroes are stateless (context packet in, deliverable out)
No shared mutable state between agents. Each agent reads from its own state file and writes its own outputs. The orchestrator synthesizes.
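The single-writer rule can be sketched as one state file per agent, written only by its owner and read by the orchestrator. This is an illustration under assumptions: the JSON format, the one-file-per-agent layout, and the names `write_state` and `read_all` are hypothetical, not a description of our actual files.

```python
import json
import os
import tempfile
from pathlib import Path


def write_state(state_dir: Path, agent: str, state: dict) -> None:
    # Only the owning agent calls this, and only for its own file.
    state_dir.mkdir(parents=True, exist_ok=True)
    target = state_dir / f"{agent}.json"
    # Write to a temp file in the same directory, then rename over the target.
    fd, tmp = tempfile.mkstemp(dir=state_dir, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp, target)  # atomic: readers never see a half-written file


def read_all(state_dir: Path) -> dict[str, dict]:
    # The orchestrator only reads; it never writes into an agent's file.
    return {
        p.stem: json.loads(p.read_text())
        for p in sorted(state_dir.glob("*.json"))
    }
```

With exactly one writer per file there is nothing to lock, and the atomic rename means a crash mid-write leaves the previous state intact.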
Crash Tolerance
Agents crash. Accept this as a design constraint.
Every persistent god agent runs under a watchdog (launchd on macOS). If it dies, it restarts automatically. The orchestrator detects the gap in heartbeat and re-dispatches the affected wave.
Because of this, we lost zero work in 30 days of continuous operation.
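The detection step can be as simple as comparing timestamps. A minimal sketch, assuming each god agent records a last-seen timestamp and the orchestrator knows which task each agent holds in the current wave; `stale_agents`, `tasks_to_redispatch`, and the timeout value are all hypothetical.

```python
import time
from typing import Optional


def stale_agents(heartbeats: dict[str, float], timeout: float,
                 now: Optional[float] = None) -> list[str]:
    # An agent is stale when its last heartbeat is older than the timeout.
    now = time.time() if now is None else now
    return sorted(a for a, ts in heartbeats.items() if now - ts > timeout)


def tasks_to_redispatch(wave: dict[str, str], heartbeats: dict[str, float],
                        timeout: float, now: Optional[float] = None) -> dict[str, str]:
    # wave maps agent name -> task. Only tasks held by stale agents go back out.
    dead = set(stale_agents(heartbeats, timeout, now))
    return {agent: task for agent, task in wave.items() if agent in dead}
```

The watchdog handles the restart and the orchestrator handles the retry, so neither needs to know how the other works. Re-dispatching is only safe if tasks can be run twice without harm.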
What This Enables
With this architecture, a six-agent system can process a full day of work — content, deploys, research, QA, distribution — in under 90 minutes of wall-clock time.
The bottleneck is no longer agent execution. It is the orchestrator planning the next wave.
That is a good problem to have.
Start Here
If you want to build this yourself:
- Define your domains first (what specialists do you need?)
- Build the orchestrator last (not first — you need to know what it is coordinating)
- Implement PAX or equivalent before you have more than 2 agents
- Add crash tolerance before you go to production
- Treat heroes as functions, gods as services
The multi-agent architecture nobody talks about is simple: parallel waves, specialized persistence, structured comms, no shared mutable state.
That is it.
We open-sourced our starter kit at whoffagents.com. Questions in the comments.