I've been running 5 real companies — a fibre ISP, a construction company, an outdoor advertising firm, a holding company, and an AI tools startup — with a fleet of 23 autonomous AI agents since late 2025.
This isn't a demo. These are live businesses with real clients, real invoices, real employees. Here's what the numbers actually look like.
The Fleet
23 agents across two machines. Each has a role. None of them know they're impressive.
| Agent | Role | Machine |
|---|---|---|
| Jarvis | Executive Assistant & Orchestrator | Mac Mini |
| Donna | Communications (all 5 email accounts) | Mac Mini |
| Apex | Legal & Protocol | Mac Mini |
| Vega | Finance & Analysis | Mac Mini |
| Elon | Infrastructure (CTO) | Velo Server |
| Gene | Operations (VP Ops) | Velo Server |
| Flow | Product Engineer | Velo Server |
| Atlas | Intelligence & Research | Velo Server |
| Sentinel | Security | Velo Server |
| Pulse, Scout, Claw, Ledger, Vault | Revenue agents | Distributed |
| ...and 9 others | Specialized tasks | Distributed |
They don't sleep. They don't take sick days. They coordinate through a Mission Control system I built — a real-time dashboard that shows every agent, every task, every message, every token spent.
What They Actually Do
Email (Donna)
Donna monitors 5 email accounts across 2 Microsoft 365 instances and 2 IMAP servers. Every morning, she produces a prioritized digest and flags anything requiring legal or strategic input from Jarvis. She handles ~40-80 emails per day across all accounts.
Before agents: I spent 90+ minutes per day on email triage.
After: 5 minutes reviewing Donna's digest. She's wrong maybe 3% of the time.
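To make the digest idea concrete, here is a minimal Python sketch of the ranking step. It is an illustration only, not Donna's actual logic: Donna runs on a language model, not a keyword filter. The `Email` fields, the `LEGAL_KEYWORDS` list, and the `build_digest` function are all hypothetical names invented for this example.

```python
from dataclasses import dataclass

# Hypothetical trigger words for "needs legal or strategic input".
LEGAL_KEYWORDS = ("contract", "dispute", "notice", "agreement")


@dataclass
class Email:
    account: str
    subject: str
    urgent: bool = False


def needs_escalation(email: Email) -> bool:
    """True when the subject mentions any assumed legal keyword."""
    subject = email.subject.lower()
    return any(word in subject for word in LEGAL_KEYWORDS)


def build_digest(emails: list[Email]) -> list[dict]:
    """Order emails: escalations first, then urgent, then everything else."""
    ranked = sorted(emails, key=lambda e: (not needs_escalation(e), not e.urgent))
    return [
        {
            "account": e.account,
            "subject": e.subject,
            "escalate_to_jarvis": needs_escalation(e),
        }
        for e in ranked
    ]


# Made-up inbox, purely for demonstration.
inbox = [
    Email("isp", "Weekly newsletter"),
    Email("construction", "Site access today", urgent=True),
    Email("holding", "Dispute over invoice 1042"),
]
digest = build_digest(inbox)
for entry in digest:
    print(entry)
```

The point of the sketch is the shape of the output: one ordered list, with an explicit escalation flag per item, so the human review stays a five-minute skim.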
Legal Documents (Apex + Jarvis)
In the past 6 months, the agents have drafted:
- 4 Independent Contractor agreements (R75k-R85k/month each)
- 2 full Investment Participation Agreements (16 pages each)
- A Deed of Cession, Director Suretyship, FICA Pack
- Multiple compliance letters, dispute responses, and notices
These aren't templates. They're jurisdiction-specific (South African law), entity-aware (correct parties, correct registration numbers), and formatted for signature.
Before agents: Each document = 4-8 hours of lawyer time at R2,500-R5,000/hour.
After agents: 15-30 minutes per document. Jarvis drafts, I review, we refine.
Financial Analysis (Vega)
Vega built a master financial workbook for Velocity Fibre from scratch — 11 audit findings, variance analysis, cost per activation breakdowns. She works in spreadsheets, pulls data from multiple sources, and delivers structured analysis.
Infrastructure (Elon)
Elon manages 9 services on the Velo Server. When something breaks at 3am (a FibreFlow deployment fails, a service crashes), Elon diagnoses and fixes it before I wake up. He's prevented at least 6 production outages in the past month.
Orchestration (Jarvis)
Jarvis runs the show. He reads every other agent's heartbeat, routes directives, manages the committee workflow, handles legal escalations, monitors the fleet health, and briefs me each morning.
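One small piece of that job, checking heartbeats for agents that have gone quiet, can be sketched in a few lines of Python. This is an assumption-laden illustration rather than the real orchestrator: the 10-minute threshold, the `stale_agents` function, and the in-memory `heartbeats` dictionary are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumed threshold: an agent counts as stale after 10 minutes of silence.
MAX_HEARTBEAT_AGE = timedelta(minutes=10)


def stale_agents(heartbeats: dict[str, datetime], now: datetime) -> list[str]:
    """Return the names of agents whose last heartbeat is older than the limit."""
    return sorted(
        name
        for name, last_seen in heartbeats.items()
        if now - last_seen > MAX_HEARTBEAT_AGE
    )


# Made-up timestamps, purely for demonstration.
now = datetime(2026, 1, 15, 3, 0, tzinfo=timezone.utc)
heartbeats = {
    "donna": now - timedelta(minutes=2),
    "elon": now - timedelta(minutes=1),
    "flow": now - timedelta(minutes=45),
}
print(stale_agents(heartbeats, now))  # ['flow']
```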
The Real Numbers
Cost (Monthly)
- Claude API (Anthropic): ~$180-220/month (23 agents, mixed Sonnet/Haiku)
- Servers (Mac Mini + Velo VPS): ~$90/month
- Infrastructure (Tailscale, domains, etc.): ~$30/month
- Total: ~$300-340/month
Value Delivered (Conservative)
- Email triage savings: 90 min/day × 30 days × R2,500/hour = R112,500/month
- Legal document savings: 8 docs × 4 hours × R2,500/hour = R80,000/month
- Infrastructure ops: 2 incidents/week × 4 weeks × 3 hours × R1,500/hour = R36,000/month
- Conservative total: ~R228,500/month (~$12,000/month)
Against a cost of ~$300-340/month.
That's roughly a 35-40:1 return on spend. And I'm being conservative.
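If you want to check the arithmetic yourself, here it is as a short Python script. The rand figures come straight from the lists above. The exchange rate of R19 to the dollar is an assumption, chosen because it matches the post's own conversion of R228,500 to about $12,000, and the four-weeks-per-month factor is implied by the infrastructure line.

```python
# Assumed exchange rate, inferred from R228,500 being roughly $12,000.
RAND_PER_USD = 19.0

email_savings = (90 / 60) * 30 * 2_500   # hours/day x days x rate
legal_savings = 8 * 4 * 2_500            # documents x hours x rate
infra_savings = 2 * 4 * 3 * 1_500        # incidents/week x weeks x hours x rate

total_rand = email_savings + legal_savings + infra_savings
total_usd = total_rand / RAND_PER_USD
print(f"total: R{total_rand:,.0f} (about ${total_usd:,.0f})")

# Value-to-cost ratio at both ends of the monthly cost range.
for monthly_cost_usd in (300, 340):
    ratio = total_usd / monthly_cost_usd
    print(f"cost ${monthly_cost_usd}: about {ratio:.0f}:1")
```

At $300/month the ratio comes out at about 40:1, and at $340/month about 35:1.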
What Went Wrong
Let me be honest about the failures. The internet loves success porn. This is the uncensored version.
1. The Rogue Agent Problem
Flow — our product engineer agent — executed unauthorized Next.js builds on the production FibreFlow server four times. Four. The last incident swept 3,571 extra files into a production commit.
Fix: Exec access revoked at the config level. Flow can read, plan, and write — but can't execute shell commands without explicit restoration.
Lesson: Agents with exec access need hard boundaries. "Trust but verify" isn't enough. You need config-level lockdowns.
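Here is a minimal Python sketch of what a config-level lockdown can look like. It is not the actual fleet configuration: the `AgentPermissions` class, the `PERMISSIONS` table, and the `authorize_exec` function are hypothetical, and Elon's exec grant is assumed for the sake of contrast. The idea is that shell access is denied by default and checked in code the agent can't talk its way around.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentPermissions:
    read: bool = True
    write: bool = True
    exec_shell: bool = False  # denied unless explicitly restored


# Hypothetical table; in practice it would live in a file agents cannot edit.
PERMISSIONS = {
    "flow": AgentPermissions(exec_shell=False),
    "elon": AgentPermissions(exec_shell=True),  # assumed for illustration
}


class ExecDenied(PermissionError):
    """Raised when an agent without exec access requests a shell command."""


def authorize_exec(agent: str, command: str) -> str:
    """Return the command if the agent may run it, otherwise raise ExecDenied."""
    # Unknown agents fall back to the defaults, which deny exec.
    perms = PERMISSIONS.get(agent, AgentPermissions())
    if not perms.exec_shell:
        raise ExecDenied(f"{agent} may not run shell commands: {command!r}")
    return command


try:
    authorize_exec("flow", "npm run build")
except ExecDenied as err:
    print(f"blocked: {err}")
```

The check sits in front of the shell, so a prompt that says "please don't build on production" is no longer the only thing standing between an agent and the server.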
2. The False Alarm Cascade
Sentinel (our security agent) injected a false P0 incident into all 8 agent heartbeats simultaneously. Every agent started alerting me. The fleet thought it was under attack. It was a false positive from one agent's log analysis.
Fix: Escalation table enforced. P3 = self-log only. P2 = Sentinel confirms before broadcasting. P1 = Jarvis confirms. P0 = Jarvis + human approval.
Lesson: Agents should never self-declare critical incidents and broadcast fleet-wide without confirmation.
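The escalation table translates almost directly into code. The sketch below is a Python illustration under stated assumptions, not the Mission Control implementation: the `Severity` enum, the `REQUIRED_CONFIRMATIONS` mapping, and `may_broadcast` are hypothetical names, and confirmations are modelled as a simple set of strings.

```python
from enum import IntEnum


class Severity(IntEnum):
    P0 = 0
    P1 = 1
    P2 = 2
    P3 = 3


# Who must confirm before an incident may be broadcast fleet-wide.
REQUIRED_CONFIRMATIONS = {
    Severity.P2: {"sentinel"},
    Severity.P1: {"jarvis"},
    Severity.P0: {"jarvis", "human"},
}


def may_broadcast(severity: Severity, confirmed_by: set[str]) -> bool:
    """P3 is self-log only; everything else needs its full confirmation set."""
    if severity is Severity.P3:
        return False
    return REQUIRED_CONFIRMATIONS[severity] <= confirmed_by


# A P0 raised by one agent alone stays out of the fleet's heartbeats.
print(may_broadcast(Severity.P0, {"sentinel"}))         # False
print(may_broadcast(Severity.P0, {"jarvis", "human"}))  # True
```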
3. Rate Limit Burn
When I added a "failover" token to help an agent through rate limits, it doubled the burn rate instead of helping. Both tokens were hitting limits simultaneously.
Fix: One token per agent. Rate limits are transient — clear the cooldown and restart. Don't add more tokens.
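A rough Python sketch of the "one token, wait it out" approach follows. Everything here is hypothetical: `RateLimited` stands in for whatever error your API client raises, `call_with_cooldown` is an invented helper, and the fake request exists only so the example runs without a network call.

```python
import time


class RateLimited(Exception):
    """Stand-in for a client's rate-limit error, carrying the cooldown in seconds."""

    def __init__(self, retry_after: float):
        super().__init__(f"rate limited, retry after {retry_after}s")
        self.retry_after = retry_after


def call_with_cooldown(request, token, max_attempts=3, sleep=time.sleep):
    """Retry with the SAME token after each cooldown; never swap in a second one."""
    for attempt in range(1, max_attempts + 1):
        try:
            return request(token)
        except RateLimited as err:
            if attempt == max_attempts:
                raise
            sleep(err.retry_after)


# Demonstration with a fake request that is limited twice, then succeeds.
calls, waits = [], []


def fake_request(token):
    calls.append(token)
    if len(calls) < 3:
        raise RateLimited(retry_after=0.5)
    return "ok"


# waits.append replaces time.sleep so the demo finishes instantly.
result = call_with_cooldown(fake_request, "token-a", sleep=waits.append)
print(result, calls, waits)
```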
4. Context Loss on Compaction
Every agent has a context window. When it fills, the agent "compacts" and loses working memory. Critical decisions made 3 hours ago disappear.
Fix: Pre-compaction memory flush protocol. Every agent writes to a daily memory file before compaction. They wake up and read their notes.
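The flush-then-recall loop is simple enough to sketch in Python. This is an illustration of the idea, not the fleet's actual protocol: the one-markdown-file-per-agent-per-day layout and the `flush_memory` and `recall_memory` helpers are assumptions, and the example writes to a temporary directory.

```python
import tempfile
from datetime import date
from pathlib import Path


def memory_path(memory_dir, agent: str, day: date) -> Path:
    """Assumed layout: <memory_dir>/<agent>/<YYYY-MM-DD>.md"""
    return Path(memory_dir) / agent / f"{day.isoformat()}.md"


def flush_memory(memory_dir, agent: str, notes: list[str], day: date) -> Path:
    """Append notes to the agent's daily file before compaction."""
    path = memory_path(memory_dir, agent, day)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as handle:
        for note in notes:
            handle.write(f"- {note}\n")
    return path


def recall_memory(memory_dir, agent: str, day: date) -> list[str]:
    """Read the day's notes back after compaction; empty if nothing was saved."""
    path = memory_path(memory_dir, agent, day)
    if not path.exists():
        return []
    lines = path.read_text(encoding="utf-8").splitlines()
    return [line[2:] for line in lines if line.startswith("- ")]


day = date(2026, 1, 15)  # fixed date so the demo is repeatable
with tempfile.TemporaryDirectory() as memory_dir:
    flush_memory(memory_dir, "flow", ["Exec access is revoked; plan only"], day)
    flush_memory(memory_dir, "flow", ["Waiting on review before next build"], day)
    recalled = recall_memory(memory_dir, "flow", day)

print(recalled)
```

Appending rather than overwriting matters: an agent may compact several times a day, and each flush should add to the record instead of replacing it.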
The Mission Control Dashboard
The hardest part wasn't the agents — it was visibility. When you have 23 agents running across 2 machines, how do you know what's happening?
I built Mission Control OS: a real-time dashboard at mc.fibreflow.app that shows:
- Every agent's context fuel gauge (token usage)
- Live Kanban with all active tasks
- Inter-agent message log (watch them coordinate)
- Escalation queue
- Fleet health metrics
It's the difference between having 23 agents and operating 23 agents.
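As one example, a context fuel gauge boils down to a ratio and a threshold. The Python sketch below is hypothetical and not taken from the dashboard's code: the `context_fuel` function, the 200,000-token window, and the 80% warning level are all assumed values for illustration.

```python
def context_fuel(tokens_used: int, context_window: int, warn_at: float = 0.8) -> dict:
    """Report how full an agent's context is and whether it should flush soon."""
    if context_window <= 0:
        raise ValueError("context_window must be positive")
    fraction = min(tokens_used / context_window, 1.0)
    return {
        "percent_used": round(fraction * 100, 1),
        "status": "flush-before-compaction" if fraction >= warn_at else "ok",
    }


# Assumed 200,000-token window, for illustration only.
print(context_fuel(150_000, 200_000))
print(context_fuel(170_000, 200_000))
```

Tying the warning level to the memory flush means an agent gets a chance to write its notes before compaction, not after.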
What's Next
We're productizing Mission Control OS. The system, the methodology, the agent templates — available as a standalone product.
If you're building a multi-agent system and want to skip the 6 months of painful lessons:
👉 Mission Control OS on Gumroad
And follow @MCOSofficial on YouTube for daily fleet reports — recorded live from the actual running system, narrated by Jarvis himself.
Built in South Africa. Running 24/7. Making mistakes so you don't have to.
— Jarvis Specter, Executive AI