Most multi-agent demos skip the boring part.
They show a planner, a coder, maybe a reviewer, and a nice loop between them.
What they usually do not show is this: how does one agent know when it must ask another one for help?
That turned out to be the hard part in my system.
I run a solo company with AI agent departments. There is a CEO, CFO, COO, Marketing, Accountant, Lawyer, CTO, and an Improver that upgrades the rest. They handle strategy, pricing, tax checks, content, technical reviews, and daily operations across five products.
Giving them roles was easy.
Making the handoffs reliable was not.
Without a protocol, you get one of two bad outcomes.
- Agents stay in their lane too hard and miss obvious cross-domain risks
- Agents ask everyone about everything and the system turns into a committee
Neither scales.
So I ended up writing a simple inter-agent communication protocol.
Not a vague "collaborate when useful" instruction.
An actual protocol with triggers, message format, loop prevention, and ownership rules.
The problem is not talking. It is knowing when to talk.
The first version of my system had specialist agents, but consultation was optional: agents were told to collaborate when useful, with no rule for when.
Marketing could write a post about a product.
The CFO could model pricing.
The Lawyer could review GDPR risk.
The CTO could look at technical architecture.
The issue was not capability.
The issue was consistency.
Sometimes an agent would ask for help when it should not.
Sometimes it would skip a review it clearly needed.
Sometimes two agents would bounce the same question back and forth.
That is when I realized the real problem was not prompt quality.
It was routing.
If a pricing decision has tax implications, the CFO must consult the Accountant.
If a public post describes how the system works, Marketing must get a technical accuracy check.
If a content draft makes legal claims, Lawyer review is mandatory.
Once you define those triggers explicitly, the system gets much calmer.
The trigger table is the whole game
The core rule is simple: when work crosses into another domain, peer review becomes mandatory.
That sounds obvious, but it only helps if the triggers are concrete.
Here is the shape of the table I use:
- Spending money, pricing, or margin assumptions -> consult CFO
- Tax, VAT (IVA), invoicing, or deductible expenses -> consult Accountant
- GDPR, contracts, terms, or liability -> consult Lawyer
- Architecture, infrastructure, or product internals -> consult CTO
- Revenue strategy or company direction -> consult CEO
- Launches, public messaging, or positioning -> consult Marketing
- Multi-step execution across teams -> consult COO
That one table removed a lot of drift.
Agents no longer need to guess whether a topic is "kind of legal" or "sort of technical."
The trigger decides.
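A minimal sketch of how a table like this could be encoded, assuming each piece of work arrives tagged with topic keywords. The `TRIGGERS` name, the keyword sets, and the `required_reviewers` helper are illustrative assumptions, not my actual implementation.

```python
# Hypothetical trigger table: reviewer -> topic tags that make review mandatory.
# The tags mirror the table above; exact-match tagging is an assumption.
TRIGGERS = {
    "CFO": {"spending", "pricing", "margin"},
    "Accountant": {"tax", "iva", "invoicing", "deductible"},
    "Lawyer": {"gdpr", "contracts", "terms", "liability"},
    "CTO": {"architecture", "infrastructure", "internals"},
    "CEO": {"revenue strategy", "company direction"},
    "Marketing": {"launch", "public messaging", "positioning"},
    "COO": {"multi-step execution"},
}


def required_reviewers(topics, current_agent):
    """Return the agents that must review work tagged with these topics.

    The agent doing the work is skipped: it does not consult itself.
    """
    tagged = {topic.lower() for topic in topics}
    return [
        agent
        for agent, keywords in TRIGGERS.items()
        if agent != current_agent and tagged & keywords
    ]


# A pricing decision with tax implications, owned by the CFO:
print(required_reviewers(["pricing", "tax"], current_agent="CFO"))
```

The point of the lookup is that the agent never judges whether something is "kind of legal". It tags the work, and the table returns who must be consulted.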
Every review request uses the same format
I did not want agents sending free-form messages to each other.
Free-form sounds flexible until you realize every handoff starts losing context in a slightly different way.
So every review request follows the same structure:
## Peer Review Request
From: [agent name]
Call chain: [Agent1 -> Agent2 -> Current]
Task: [what the founder asked for]
What I did: [current work so far]
What I need from you: [specific question]
Context: [only the facts needed for review]
Respond with:
1. APPROVED
2. CONCERNS
3. BLOCKING
That format does three useful things.
First, it forces the requesting agent to explain what problem it is actually solving.
Second, it gives the reviewer a narrow question instead of dumping the full task on them.
Third, it makes the answer easy to incorporate back into the final result.
The reviewer is not taking over ownership.
It is a review, not a handoff.
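As a rough sketch, the template could be modeled as a small data structure so every request renders the same way. The class and field names here (`PeerReviewRequest`, `work_so_far`, `parse_verdict`) are hypothetical, and the parser assumes the reviewer's reply starts with the verdict word.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    APPROVED = "APPROVED"
    CONCERNS = "CONCERNS"
    BLOCKING = "BLOCKING"


@dataclass
class PeerReviewRequest:
    sender: str
    call_chain: list  # agent names, ending with the current agent
    task: str
    work_so_far: str
    question: str
    context: str

    def render(self) -> str:
        """Render the request in the fixed format shown above."""
        lines = [
            "## Peer Review Request",
            f"From: {self.sender}",
            f"Call chain: {' -> '.join(self.call_chain)}",
            f"Task: {self.task}",
            f"What I did: {self.work_so_far}",
            f"What I need from you: {self.question}",
            f"Context: {self.context}",
            "Respond with:",
        ]
        lines += [f"{i}. {v.value}" for i, v in enumerate(Verdict, start=1)]
        return "\n".join(lines)


def parse_verdict(reply: str) -> Verdict:
    """Read the verdict from the first word of a reply (assumed convention)."""
    words = reply.split()
    if not words:
        raise ValueError("empty reply")
    # Raises ValueError if the first word is not one of the three verdicts.
    return Verdict(words[0].upper().rstrip(":."))
```

Because the verdict is one of three fixed values, the requesting agent can branch on it directly instead of interpreting a free-form answer.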
The COO is the orchestrator, not just another peer
This part matters.
From the outside, it can look like the agents just call each other directly.
That is not how I think about it.
The COO is the central coordinator of the system.
That means the COO owns the execution flow, keeps track of what the founder actually asked for, and decides when work should branch into specialist review.
Specialists still review each other's work.
But the architecture is orchestrated, not social.
That distinction matters because it keeps ownership clear.
If Marketing asks Lawyer to review a product claim, Marketing still owns the post.
If CFO asks Accountant to validate a tax assumption, CFO still owns the pricing output.
If the founder asks for a daily standup, the COO still owns the final standup.
Without that, every task becomes shared ownership.
Shared ownership is where systems get fuzzy.
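One way to picture the ownership rule, as a sketch under assumptions: reviews get attached to a work item, but the owner field never changes. `WorkItem`, `Review`, and the idea that a BLOCKING verdict stops the item from shipping are illustrative choices, not a description of my real code.

```python
from dataclasses import dataclass, field


@dataclass
class Review:
    reviewer: str
    verdict: str  # "APPROVED", "CONCERNS", or "BLOCKING"
    notes: str = ""


@dataclass
class WorkItem:
    owner: str  # set once; reviews never reassign it
    draft: str
    reviews: list = field(default_factory=list)

    def add_review(self, review: Review) -> None:
        # The reviewer contributes a verdict, not ownership.
        self.reviews.append(review)

    def can_ship(self) -> bool:
        # Assumption: any BLOCKING verdict holds the item back.
        return all(r.verdict != "BLOCKING" for r in self.reviews)


post = WorkItem(owner="Marketing", draft="Launch post")
post.add_review(Review("Lawyer", "CONCERNS", "Soften the compliance claim"))
print(post.owner, post.can_ship())
```

However many specialists review the post, `owner` stays `"Marketing"`, which is the whole point of treating a review as a review and not a handoff.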
Two small rules prevent most loops
The protocol has two guardrails that matter more than they appear to.
1. No-callback rule
An agent cannot call someone already in the current chain.
If the chain is COO -> Marketing -> Lawyer, the Lawyer cannot bounce the question back to COO or Marketing.
That kills the most annoying class of loop immediately.
2. Max depth 3
If the chain already has three agents, the current agent must answer directly.
No more consultation.
The limit of 3 is not derived from theory. It is an operational cutoff.
You need a point where the system stops expanding and returns an answer.
In practice, depth 3 has been enough.
It gives room for a real cross-check without turning every task into a recursive meeting.
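Both guardrails fit in a few lines. This is a minimal sketch, assuming the call chain is a list of agent names that includes the current agent, as in the request format. `can_consult` and `MAX_DEPTH` are hypothetical names.

```python
MAX_DEPTH = 3  # once the chain holds three agents, no further consultation


def can_consult(chain, target):
    """Check both guardrails and return (allowed, reason).

    `chain` lists the agents involved so far, ending with the current agent.
    """
    if len(chain) >= MAX_DEPTH:
        return False, "max depth reached: answer directly"
    if target in chain:
        return False, f"no-callback: {target} is already in the chain"
    return True, "ok"


print(can_consult(["COO", "Marketing"], "Lawyer"))
print(can_consult(["COO", "Marketing"], "COO"))
print(can_consult(["COO", "Marketing", "Lawyer"], "CTO"))
```

The check runs before a request is sent, so a disallowed consultation never reaches the other agent. The current agent simply has to answer with what it has.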
What this catches in practice
The best part of the protocol is not elegance. It is the mistakes it catches.
A few common examples:
Marketing describing technical systems
This used to be risky.
It is very easy for a content agent to write something that sounds plausible about orchestration, memory, or infrastructure while getting one important detail wrong.
Now the rule is explicit: any public content that describes how the system works gets a technical review before it goes out.
That keeps the writing sharp without turning it into fiction.
CFO making a pricing argument that leaks into tax treatment
Pricing is not just pricing.
It leaks into invoicing, VAT treatment, margin assumptions, and in some cases legal structure.
The CFO can still own the business recommendation. But when tax treatment is part of the answer, the Accountant must review it.
Lawyer checking claims before publication
This one is simple and high leverage.
If a post or landing page makes a trust, compliance, or security claim, it gets reviewed before publishing.
That rule alone prevents a lot of avoidable embarrassment.
The communication matrix is less important than the boundaries
People like diagrams for this kind of thing.
I do too.
But the diagram is not the real system.
The real system is a set of enforced boundaries:
- who can review what
- when review becomes mandatory
- who owns the final answer
- when the chain stops
Everything else is presentation.
If you get those four things right, the system feels much more disciplined.
If you leave them vague, even good agents start to look unreliable.
My take
The hard part of multi-agent systems is not specialization.
It is coordination under constraints.
Anyone can make a few agents call each other.
The interesting part is deciding:
- when consultation is required
- how context is passed cleanly
- how loops are prevented
- who still owns the result when multiple specialists touch it
That is why I ended up writing a protocol instead of just adding more prompt text.
The protocol made the system less magical.
It also made it more trustworthy.
And in production, I will take trustworthy over magical every time.