Playbook #1 showed you the setup: 7 AI agents running on a single $200/month Claude Max subscription. This week, the question everyone asked was: "Okay, but how do they actually communicate?"
Fair question. Because 7 agents that can't coordinate are worse than no agents at all. They duplicate work, override each other's decisions, and create chaos you have to clean up manually -- which defeats the entire point.
Here's exactly how we wired ours together. Every architecture decision, every anti-chaos mechanism, every lesson from agents that broke things at 2 AM.
The Communication Layer: Telegram as the War Room
We didn't build a custom messaging protocol. We didn't spin up a Kafka cluster. We used Telegram.
Every agent is a Telegram bot. They all sit in a private group chat called the War Room. When RJ (the human) types "@rocky please get TARS to score these 50 leads", the system parses the @mention, looks up the agent in the registry, and routes the message.
Why Telegram?
- Free, real-time, mobile-native. RJ can manage agents from his phone while walking between clinic demos in Cebu.
- Built-in threading. Conversations have natural boundaries.
- Bots are first-class citizens. The Telegram Bot API gives each agent its own identity, its own message stream, and its own permissions.
- Human-readable. You can literally scroll up and see what your agents said to each other. Try doing that with a custom message queue.
The stack: grammY framework (TypeScript) -> 7 bot tokens from @BotFather -> one private group chat.
The Routing Problem (and How We Solved It)
Here's the first thing that breaks when you have 7 bots in one group: once privacy mode is turned off so the bots can see ordinary messages, every bot receives every message. Send one message, 7 bots try to respond. Chaos.
Solution 1: Message Deduplication
Every incoming message gets a dedup key: senderId + date + textHash. The first bot to process it wins. Everyone else ignores it. TTL: 60 seconds.
Message arrives -> compute dedupKey -> check processedMessages map
-> if exists: ignore
-> if new: process, add to map with 60s TTL
This prevents 7x duplicate responses to a single message.
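A minimal sketch of that dedup check, assuming all seven bots run in one process and share a single in-memory map. The post doesn't say which hash it uses, so FNV-1a stands in here, and the `IncomingMessage` shape and `MessageDeduper` name are hypothetical.

```typescript
// Hypothetical message shape; field names are assumptions.
interface IncomingMessage {
  senderId: number;
  date: number; // Unix seconds, as reported by Telegram
  text: string;
}

// Small non-cryptographic hash (FNV-1a, 32-bit). Any stable hash would do.
function textHash(text: string): string {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h.toString(16);
}

function dedupKey(msg: IncomingMessage): string {
  return msg.senderId + ":" + msg.date + ":" + textHash(msg.text);
}

class MessageDeduper {
  private processed = new Map<string, number>(); // key -> expiry time (ms)
  private ttlMs: number;

  constructor(ttlMs: number = 60000) {
    this.ttlMs = ttlMs;
  }

  // Returns true only for the first caller to claim this message within the TTL.
  claim(msg: IncomingMessage, now: number = Date.now()): boolean {
    // Drop expired entries so the map does not grow without bound.
    this.processed.forEach((expiry, key) => {
      if (expiry <= now) this.processed.delete(key);
    });
    const key = dedupKey(msg);
    if (this.processed.has(key)) return false;
    this.processed.set(key, now + this.ttlMs);
    return true;
  }
}
```

Each bot's handler would call `claim(msg)` first and return early on `false`. If the bots ran in separate processes, the map would have to live somewhere shared instead.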
Solution 2: @Mention Routing
Only the mentioned agent responds. The router parses every message for @BotUsername or @agentname patterns, does a case-insensitive lookup against the agent registry, and routes only to matched agents.
No mention and no domain match? No response. This keeps the group chat clean.
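The mention lookup could be sketched like this. The registry entries and bot usernames below are made up for illustration, and `routeByMention` is a hypothetical name, not the router's real API.

```typescript
interface AgentEntry {
  name: string;        // short agent name, e.g. "rocky"
  botUsername: string; // Telegram bot username (example values, not the real ones)
}

const exampleRegistry: AgentEntry[] = [
  { name: "rocky", botUsername: "RockyExampleBot" },
  { name: "tars", botUsername: "TarsExampleBot" },
];

// Returns the names of all registered agents that are @mentioned in the text,
// in order of first mention, matching either @agentname or @BotUsername.
function routeByMention(text: string, entries: AgentEntry[]): string[] {
  const lookup = new Map<string, AgentEntry>();
  entries.forEach((e) => {
    lookup.set(e.name.toLowerCase(), e);
    lookup.set(e.botUsername.toLowerCase(), e);
  });

  const matched: string[] = [];
  const pattern = /@([A-Za-z0-9_]+)/g;
  let m: RegExpExecArray | null;
  while ((m = pattern.exec(text)) !== null) {
    const agent = lookup.get(m[1].toLowerCase());
    if (agent && matched.indexOf(agent.name) === -1) matched.push(agent.name);
  }
  return matched;
}
```

Note that "@rocky please get TARS to score these 50 leads" routes only to Rocky here. TARS is named but not @mentioned, so Rocky has to relay the request.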
Solution 3: Domain-Based Routing
Each agent has declared domains:
| Agent | Domains |
|---|---|
| Rocky (Chief of Staff) | coordination, email, calendar, investor |
| TARS (Engineering) | engineering, devops, infrastructure |
| Attia (Health) | health, fitness, nutrition |
| Burry (Finance) | finance, risk, accounting |
| Draper (Marketing) | sales, marketing, CRM |
| Mariano (Sales/CX) | sales, customer-success |
| Drucker (Research) | research, competitive-intel |
When a message mentions a domain but not a specific agent, the router can infer who should handle it.
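The post doesn't describe how the inference works, so here is one simple possibility: match declared domain names as whole words in the message. The `inferAgents` function is hypothetical, and plain keyword matching is an assumption. A real router might use a classifier instead.

```typescript
// Domains copied from the table above, lowercased.
const agentDomains: Record<string, string[]> = {
  rocky: ["coordination", "email", "calendar", "investor"],
  tars: ["engineering", "devops", "infrastructure"],
  attia: ["health", "fitness", "nutrition"],
  burry: ["finance", "risk", "accounting"],
  draper: ["sales", "marketing", "crm"],
  mariano: ["sales", "customer-success"],
  drucker: ["research", "competitive-intel"],
};

// Returns every agent with a declared domain that appears as a whole word in the text.
function inferAgents(text: string): string[] {
  const words = text
    .toLowerCase()
    .split(/[^a-z0-9-]+/)
    .filter((w) => w.length > 0);
  return Object.keys(agentDomains).filter((agent) =>
    agentDomains[agent].some((domain) => words.indexOf(domain) !== -1)
  );
}
```

One thing the table makes visible: "sales" is declared by both Draper and Mariano, so a message about sales matches two agents and the router needs some tie-break rule.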
The Thread System: How Agents Talk Without Infinite Loops
This is where most multi-agent systems die. Agent A asks Agent B a question. Agent B's response triggers Agent A to ask another question. Repeat until your token bill looks like a phone number.
We solved this with thread tracking and hop limits.
The 4-Hop Rule
Every time a message passes from one agent to another, the hop count increments. When the count reaches 4, the thread is killed, so a full round trip has to fit in hops 0 through 3:
Hop 0: RJ -> Rocky: Get TARS to score leads
Hop 1: Rocky -> TARS: 50 leads at ~/leads/batch-5.csv. Score by fit.
Hop 2: TARS -> Rocky: Done. Top 5 ready for demo.
Hop 3: Rocky -> RJ: TARS scored 50 leads. Here are the top 5...
This sounds aggressive. It is. And it's saved us from runaway agent loops more times than I can count. A 4-hop conversation forces agents to be concise and decisive.
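A minimal sketch of the hop counter, assuming the numbering in the example above: hops 0 through 3 are delivered and anything that would be hop 4 kills the thread. The `Thread` shape and function names are hypothetical.

```typescript
const MAX_HOPS = 4;

interface Thread {
  id: string;
  hops: number;    // hop index the next message in this thread would get
  killed: boolean;
}

function startThread(id: string): Thread {
  return { id: id, hops: 0, killed: false };
}

// Call before delivering each message in a thread. Returns the hop index to
// label the message with, or null if the thread has reached the limit.
function nextHop(thread: Thread): number | null {
  if (thread.killed || thread.hops >= MAX_HOPS) {
    thread.killed = true;
    return null;
  }
  const hop = thread.hops;
  thread.hops += 1;
  return hop;
}
```

The router would drop any message for which `nextHop` returns `null`, rather than forwarding it to another agent.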
Cooldown Anti-Spam
Per-agent, per-thread cooldown: 2000ms between responses. Prevents machine-gun message loops.
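The cooldown can be sketched as a map keyed by agent and thread. The class and method names are hypothetical, and the clock is passed in so the behavior is easy to check.

```typescript
const COOLDOWN_MS = 2000;

class ResponseCooldown {
  // "agent:thread" -> time of the last allowed response (ms)
  private lastSent = new Map<string, number>();

  // Returns true and records the send if this agent may respond in this thread now.
  tryRespond(agent: string, threadId: string, now: number = Date.now()): boolean {
    const key = agent + ":" + threadId;
    const last = this.lastSent.get(key);
    if (last !== undefined && now - last < COOLDOWN_MS) return false;
    this.lastSent.set(key, now);
    return true;
  }
}
```

Because the key includes the thread, an agent that is cooling down in one thread can still answer in another.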
The Query Queue: Priority, Concurrency, and Not Killing Claude
Seven agents can't all call Claude simultaneously on a single $200/month subscription.
Priority Levels
RJ (human) priority -> bypass queue, run immediately
Event priority -> jump ahead of agent requests
Agent priority -> FIFO, max 2 concurrent
Max 2 agents can call Claude at the same time. Max 5 items in queue.
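These rules could be sketched as a small scheduler. Two points are assumptions because the post doesn't spell them out: human requests run outside the 2-slot cap, and a full queue rejects new items rather than evicting old ones. All names are hypothetical.

```typescript
type Priority = "human" | "event" | "agent";

interface Query {
  id: string;
  priority: Priority;
}

type SubmitResult = "started" | "queued" | "rejected";

const MAX_CONCURRENT = 2;
const MAX_QUEUED = 5;

class QueryQueue {
  private running: string[] = [];
  private waiting: Query[] = [];

  submit(q: Query): SubmitResult {
    // Assumption: human requests skip both the queue and the concurrency cap.
    if (q.priority === "human") return "started";
    if (this.running.length < MAX_CONCURRENT) {
      this.running.push(q.id);
      return "started";
    }
    if (this.waiting.length >= MAX_QUEUED) return "rejected";
    if (q.priority === "event") {
      // Events go behind other waiting events but ahead of all agent requests.
      let i = 0;
      while (i < this.waiting.length && this.waiting[i].priority === "event") i++;
      this.waiting.splice(i, 0, q);
    } else {
      this.waiting.push(q); // agent requests stay FIFO
    }
    return "queued";
  }

  // Marks a running query as finished and returns the next one to start, if any.
  complete(id: string): Query | undefined {
    const idx = this.running.indexOf(id);
    if (idx === -1) return undefined;
    this.running.splice(idx, 1);
    const next = this.waiting.shift();
    if (next) this.running.push(next.id);
    return next;
  }
}
```

A caller would spawn a worker whenever `submit` returns `"started"` or `complete` returns a query, and would surface `"rejected"` loudly so dropped work doesn't go unnoticed.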
Worker Dispatch
Each query spawns a worker running the Claude Agent SDK with session persistence. Session IDs let agents resume conversations without re-reading files.
Context Carry-Forward
Each agent in a thread sees the last 5 messages (max 2000 tokens) from previous hops. This is how TARS knows what Rocky asked for, and Rocky knows what TARS found.
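A rough sketch of that trimming step. The post doesn't say how tokens are counted, so this assumes a crude estimate of about 4 characters per token. A real tokenizer would give different numbers, and all names here are hypothetical.

```typescript
interface ThreadMessage {
  from: string;
  text: string;
}

const MAX_CONTEXT_MESSAGES = 5;
const MAX_CONTEXT_TOKENS = 2000;

// Rough heuristic only: about 4 characters per token.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Keeps the newest messages first: walks backwards from the latest message and
// stops at the first one that would push the total over the token budget.
function buildContext(history: ThreadMessage[]): ThreadMessage[] {
  const recent = history.slice(-MAX_CONTEXT_MESSAGES);
  const kept: ThreadMessage[] = [];
  let budget = MAX_CONTEXT_TOKENS;
  for (let i = recent.length - 1; i >= 0; i--) {
    const cost = estimateTokens(recent[i].text);
    if (cost > budget) break;
    budget -= cost;
    kept.unshift(recent[i]);
  }
  return kept;
}
```

Walking backwards means that when the budget runs out, the oldest context is what gets dropped, which is usually the least relevant to the current hop.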
Memory Architecture: What Agents Remember
Brain Files (Read on Startup)
| File | Purpose | Who Reads |
|---|---|---|
| SOUL.md | Agent identity and core personality | All agents |
| USER.md | RJ's profile, businesses, contacts | Rocky |
| MEMORY.md | Long-term institutional memory | Rocky |
| AGENTS.md | Agent framework rules, session management | All agents |
| HEARTBEAT.md | Periodic check guidelines | Rocky |
| ROUTING.md | Model selection rules | All agents |
The key insight: agents don't share a database. They share files. Markdown files that any agent can read and write. Dead simple, human-auditable, and it works with the Claude Agent SDK out of the box.
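To make the table concrete, here is a sketch of picking and loading brain files at startup. The reader rules come from the table; the `filesFor` and `loadBrain` helpers are hypothetical, and the file reader is passed in as a function so the sketch doesn't assume any particular directory layout.

```typescript
type Reader = "all" | "rocky";

// One row per brain file, mirroring the "Who Reads" column above.
const brainFiles: { file: string; readers: Reader }[] = [
  { file: "SOUL.md", readers: "all" },
  { file: "USER.md", readers: "rocky" },
  { file: "MEMORY.md", readers: "rocky" },
  { file: "AGENTS.md", readers: "all" },
  { file: "HEARTBEAT.md", readers: "rocky" },
  { file: "ROUTING.md", readers: "all" },
];

// Lists the files a given agent reads on startup.
function filesFor(agent: string): string[] {
  const name = agent.toLowerCase();
  return brainFiles
    .filter((b) => b.readers === "all" || b.readers === name)
    .map((b) => b.file);
}

// Concatenates an agent's brain files into one startup prompt section.
// `read` is supplied by the caller (for example, a wrapper around a file read).
function loadBrain(agent: string, read: (file: string) => string): string {
  return filesFor(agent)
    .map((f) => "## " + f + "\n" + read(f))
    .join("\n\n");
}
```

In Node, `read` could be something like `(f) => fs.readFileSync(path.join(brainDir, f), "utf8")`, where `brainDir` is wherever the files live.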
Autonomous Goal Work: How Agents Self-Direct
War Room Agents: 2 cycles/day (10:30 AM, 2:30 PM Manila), staggered 3-min offsets.
Venture Agents: 12 cycles/day (every 2 hours), results via DM.
Each cycle: pick most urgent goal, take concrete actions, only pause for external emails or payments. Bounded autonomy.
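The staggering for War Room agents is simple arithmetic. A sketch, assuming each agent gets an index (0 to 6) and starts 3 minutes after the previous one. The ordering of agents and the `staggeredTimes` helper are assumptions.

```typescript
const STAGGER_MINUTES = 3;
const BASE_TIMES = ["10:30", "14:30"]; // Manila local time, from the schedule above

function pad2(n: number): string {
  return n < 10 ? "0" + n : String(n);
}

// Returns the two daily start times (HH:MM) for the agent at the given index.
function staggeredTimes(agentIndex: number): string[] {
  return BASE_TIMES.map((base) => {
    const parts = base.split(":");
    const total =
      Number(parts[0]) * 60 + Number(parts[1]) + agentIndex * STAGGER_MINUTES;
    return pad2(Math.floor(total / 60) % 24) + ":" + pad2(total % 60);
  });
}
```

With seven agents, the last one starts 18 minutes after the first, so no two goal cycles begin at the same moment.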
The LLM Fallback Chain
Primary: Claude Max (Opus 4.5, within the plan's usage limits)
-> (on rate limit)
Fallback: Local qwen3:14b via Ollama
-> (on quality concern)
API escalation: DeepSeek V3 via OpenRouter
Heartbeats always run on local qwen (free). The $200/month goes to thinking-heavy tasks.
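The chain boils down to a small decision function. This sketch only picks the tier and makes no real API calls. How a "quality concern" is detected isn't described, so it appears here as a plain boolean input. The tier labels and `pickTier` are hypothetical names.

```typescript
type Tier = "claude-max" | "local-qwen" | "openrouter-deepseek";

interface RouteInput {
  isHeartbeat: boolean;       // periodic check rather than a thinking-heavy task
  claudeRateLimited: boolean; // true while the subscription is rate limited
  localQualityOk: boolean;    // assumption: some check decides the local answer is good enough
}

function pickTier(input: RouteInput): Tier {
  // Heartbeats always stay on the free local model.
  if (input.isHeartbeat) return "local-qwen";
  // Normal work goes to Claude unless it is rate limited.
  if (!input.claudeRateLimited) return "claude-max";
  // On rate limit, fall back to local, then escalate if quality is a concern.
  if (input.localQualityOk) return "local-qwen";
  return "openrouter-deepseek";
}
```

Keeping this as one pure function makes the routing policy easy to read and change in a single place.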
What Broke (So You Don't Have To)
The Ghost Relay Incident: Telegram 409 Conflict errors created message black holes. Fix: Exponential backoff + 60s watchdog.
The Infinite Loop Near-Miss: No hop limit -> Rocky and TARS looped 23 times. Fix: The 4-hop rule.
The 3 AM Queue Overflow: 6 agents ran goal work simultaneously, and the queue overflow dropped Burry's payroll task. Fix: Staggered cron offsets + priority queue.
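For the Ghost Relay fix, the two pieces could be sketched as below. Telegram typically returns 409 Conflict when two pollers call getUpdates with the same bot token, so retrying with growing delays gives the stale poller time to go away. The 1-second base and 30-second cap are assumptions; only the 60-second watchdog window comes from the post.

```typescript
const WATCHDOG_MS = 60000;

// Delay before retry number `attempt` (0-based): base * 2^attempt, capped.
function backoffDelayMs(
  attempt: number,
  baseMs: number = 1000,
  capMs: number = 30000
): number {
  return Math.min(capMs, baseMs * Math.pow(2, attempt));
}

// Tracks when a bot last polled successfully and flags it as stalled after 60s.
class PollingWatchdog {
  private lastSeen: number;

  constructor(now: number) {
    this.lastSeen = now;
  }

  // Call after every successful poll.
  heartbeat(now: number): void {
    this.lastSeen = now;
  }

  // True if no successful poll has happened within the watchdog window.
  isStalled(now: number): boolean {
    return now - this.lastSeen > WATCHDOG_MS;
  }
}
```

A supervisor loop would check `isStalled` on a timer and restart the bot's polling when it returns `true`.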
The Cost Breakdown (Still $200/Month)
| Component | Cost | Notes |
|---|---|---|
| Claude Max | $200/mo | Opus 4.5 for all 7 agents, within the plan's usage limits |
| Telegram API | $0 | Free forever |
| Mac Mini M4 Pro | $0/mo | One-time purchase, runs Ollama |
| Ollama (qwen3:14b) | $0 | Local inference, free |
| OpenRouter | ~$2-5/mo | Only during Claude outages |
| SQLite | $0 | Local database |
| Total | ~$202-205/mo | |
Key Takeaways
- Use existing platforms for communication. Don't build custom message queues until you must.
- Hop limits prevent runaway costs and loops. A 4-hop limit forces agents to be concise.
- Priority queues protect human responsiveness. Never wait behind agent chatter.
- File-based memory is underrated. Simpler, more auditable, more debuggable than vector databases.
- Local models for heartbeats, cloud models for thinking. Cut effective LLM costs by 40%.
- Watchdogs and self-healing aren't optional. Agents WILL crash. Build restart logic from day one.
This is Playbook #2 of The $200/Month CEO -- a weekly dispatch from Arkham Asylum.
Playbook #1: How to Run 7 AI Agents on a Single $200/Month Claude Max Subscription
Subscribe for the War Room Report (Tuesdays) and The Playbook (Fridays): buttondown.com/the200dollarceo