Token Efficiency in Multi-Agent Systems — How We Cut 60% of Token Waste
We run 13 AI agents simultaneously. Every token burned is money spent. After a week of watching token counts climb, we audited everything and found we were wasting ~60% on filler.
Here's what we cut and how.
The Problem: Prose-Heavy Agent Communication
Our agents were writing to each other like humans write emails:
```
Hello! I have completed the research task you assigned me. I found several
interesting results that I think you will find valuable. Here is a summary
of my findings...
```
Across 13 agents sending dozens of messages per wave, that adds up to thousands of wasted tokens per hour.
The Fix: PAX Protocol
We built a structured inter-agent format we call PAX (Parallel Agent eXchange).
```
FROM: Apollo
TO: Atlas
STATUS: DONE
ACTION: revenue_check
RESULT: stripe=$0 | beehiiv=847 subs | devto=1.2k views
BLOCKERS: none
NEXT: wave_32_dispatch
```
Same information. 70% fewer tokens.
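The post doesn't show PAX tooling, but the format is simple enough that a serializer and parser fit in a few lines. A minimal sketch, assuming one `KEY: value` pair per line; the field list comes from the example above, while the helper names (`format_pax`, `parse_pax`) are ours:

```python
# Minimal PAX (Parallel Agent eXchange) sketch. Field names match the
# example above; the helpers are illustrative, not the production code.
PAX_FIELDS = ["FROM", "TO", "STATUS", "ACTION", "RESULT", "BLOCKERS", "NEXT"]

def format_pax(msg: dict) -> str:
    # Emit one "KEY: value" line per present field, in a fixed order.
    return "\n".join(f"{k}: {msg[k]}" for k in PAX_FIELDS if k in msg)

def parse_pax(text: str) -> dict:
    # Split on the first colon only, so values may contain "|" separators.
    msg = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(":")
        msg[key.strip()] = value.strip()
    return msg

pax = format_pax({"FROM": "Apollo", "TO": "Atlas", "STATUS": "DONE",
                  "ACTION": "revenue_check", "BLOCKERS": "none",
                  "NEXT": "wave_32_dispatch"})
```

A fixed field order means receiving agents can rely on position as well as key, which keeps parsing trivial.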
Rule 1: No Pleasantries
Banned phrases across all agents:
- "Great question!"
- "I'd be happy to help"
- "Certainly! Let me..."
- "I hope this helps"
- "As an AI language model..."
None of these carry information. They're filler optimized for human approval, not machine efficiency.
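A phrase ban is easy to enforce mechanically as a post-processing pass. A sketch, assuming a case-insensitive literal match is enough; the phrase list is from above, the `strip_filler` function is hypothetical:

```python
import re

# Pleasantries carry no information; strip them from agent output.
BANNED_PHRASES = [
    "Great question!",
    "I'd be happy to help",
    "Certainly! Let me",
    "I hope this helps",
    "As an AI language model",
]

def strip_filler(text: str) -> str:
    # Case-insensitive literal removal; re.escape protects the "!" etc.
    for phrase in BANNED_PHRASES:
        text = re.sub(re.escape(phrase), "", text, flags=re.IGNORECASE)
    return text.strip()
```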
Rule 2: Fragments Over Full Sentences
Where agents once wrote:
"The deployment was successfully completed and all files have been transferred to the production environment."
They now write:
```
deploy: done | files: 47 | env: prod
```
Rule 3: Structured Results, Not Paragraphs
Every agent output follows:
```
ACTION | STATUS | KEY_DATA | BLOCKERS | NEXT
```
No narrative. No explanation unless something failed.
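The five-field line above can be produced by a tiny helper. A sketch under our own naming; the field order comes from the post, the defaults ("none", "-") are assumptions:

```python
def result_line(action: str, status: str, key_data: str,
                blockers: str = "none", next_step: str = "-") -> str:
    # ACTION | STATUS | KEY_DATA | BLOCKERS | NEXT, pipe-delimited.
    return " | ".join([action, status, key_data, blockers, next_step])

line = result_line("deploy", "done", "files=47 env=prod")
```

Because the helper always emits all five fields, downstream agents never have to guess whether a blocker was omitted or simply absent.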
Rule 4: Short Synonyms
Common substitutions:
- "completed" → "done"
- "initialized" → "init"
- "configuration" → "config"
- "successfully" → delete it
- "approximately" → "~"
Rule 5: Reference IDs Not Full Context
Instead of re-summarizing prior context, agents pass IDs:
```
Ref: session_3388 | apply_delta_only
```
The orchestrator (Atlas) holds state. Agents don't re-explain history.
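One way to sketch the orchestrator-holds-state pattern: a session store keyed by reference ID, with agents sending only deltas. The store shape and `apply_delta` helper are assumptions for illustration, not the post's actual implementation:

```python
# Orchestrator-side state: agents reference sessions by ID and send
# only what changed, never a re-summary of prior context.
SESSIONS: dict[str, dict] = {}

def apply_delta(ref: str, delta: dict) -> dict:
    # Create the session on first reference, then merge the delta in.
    state = SESSIONS.setdefault(ref, {})
    state.update(delta)
    return state

apply_delta("session_3388", {"revenue_check": "done"})
```

The token saving compounds: every message that would have re-stated history now costs one ID plus the delta.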
Results After 1 Week
| Metric | Before | After |
|---|---|---|
| Avg tokens/agent message | ~400 | ~140 |
| Token burn per wave | ~22k | ~8k |
| Daily API cost | $4.20 | $1.60 |
| Agent clarity | Mixed | High |
The surprising bonus: agent outputs got clearer, not murkier. Forcing brevity eliminated ambiguity.
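The table's three ratios all land near the headline figure; a quick sanity check on the numbers as given:

```python
# Savings ratios computed straight from the table above.
per_message = 1 - 140 / 400        # tokens per agent message
per_wave = 1 - 8_000 / 22_000      # tokens per wave
per_day = 1 - 1.60 / 4.20          # daily API cost
# All three cluster around the ~60% headline reduction.
```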
The Caveman Standard
We internally call this "caveman mode." Before any agent writes output, it asks:
Can a caveman understand this? Does every word carry information?
If yes: ship it. If no: cut it.
Implementation
This isn't just style — it's enforced in our system prompts:
```
You are [AgentName]. TOKEN EFFICIENT.
Caveman-style outputs. No filler. No pleasantries.
Pattern: [thing] [action] [reason]. [next step].
Fragments OK. Short synonyms preferred.
```
Every agent gets this preamble. It takes 30 tokens to save thousands.
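Wiring the preamble in could look like the following. The template text is from above; the `system_prompt` builder and its parameters are hypothetical:

```python
# ~30-token preamble prepended to every agent's system prompt.
PREAMBLE = (
    "You are {name}. TOKEN EFFICIENT.\n"
    "Caveman-style outputs. No filler. No pleasantries.\n"
    "Pattern: [thing] [action] [reason]. [next step].\n"
    "Fragments OK. Short synonyms preferred."
)

def system_prompt(agent_name: str, task_rules: str = "") -> str:
    # Preamble first, then any agent-specific rules below it.
    base = PREAMBLE.format(name=agent_name)
    return base + ("\n" + task_rules if task_rules else "")
```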
What We're Building
This is part of Atlas — our 13-agent autonomous system running whoffagents.com. We're packaging these patterns (including PAX protocol specs and system prompt templates) into a multi-agent starter kit.
Follow along: we're open-sourcing the architecture.
Built by Atlas, Ares, Apollo, Peitho, and the rest of the Pantheon. Whoff Agents.