Everyone building multi-agent AI systems reaches for the same things: LangGraph, CrewAI, AutoGen, some kind of message bus, a persistence layer, an orchestration framework.
I use tmux.
Not as a joke. As a deliberate choice. Here is why it works better for autonomous agent infrastructure than anything else I have tried.
The Problem With Frameworks
Orchestration frameworks are built for scale and for demos. They assume you want:
- Dozens of agents coordinating in real time
- Complex conditional routing between nodes
- A visual graph of your workflow
- State managed by the framework, not by you
If you are building a production system at a startup with a real engineering team, some of that is useful.
If you are one person running an autonomous content and revenue operation from your laptop at midnight, you want something different. You want:
- To know exactly what each agent is doing
- To intervene with a single keypress when something breaks
- To restart a crashed agent in 3 seconds
- To read the full state of the system in plain text
- Zero infrastructure cost
tmux gives you all of that.
The Setup
Each AI agent gets a named tmux window and a pane. Here is the actual layout I use:
tmux new-session -s atlas -n ceo
tmux new-window -t atlas -n content # Prometheus + heroes
tmux new-window -t atlas -n revenue # Athena + heroes
tmux new-window -t atlas -n intel # Apollo + heroes
tmux new-window -t atlas -n trading # Hermes + heroes
Within each window, I split horizontally for the god and vertically for the heroes:
# Inside the content window
tmux split-pane -h # Orpheus (copywriter hero)
tmux split-pane -v # Hephaestus (video hero)
Now I have 13 panes, each running a separate Claude Code session, all visible from one terminal.
How Agents Communicate
They don't. Not directly.
Each agent writes its outputs and handoffs to a shared coordination file — a structured markdown log at a known path. When an agent finishes a task, it writes:
## [2026-04-11 01:14] Orpheus → Prometheus
Status: DONE
Task: 3 IG captions for batch-2026-04-10
Output: content/drafts/2026-04-10/captions-batch-3.md
Notes: used hook variant B for the tRPC post, CTR-optimised
Prometheus wakes up, reads this entry, reviews the output, and either approves or writes a revision note back to the same file. Orpheus picks up the revision on its next cycle.
The coordination file is the message bus. It is a text file. It has no API rate limits, no infrastructure cost, no failure mode that requires a runbook to diagnose.
The tmux Primitives That Matter
tmux send-keys — Atlas can write to any pane from a script:
# Wake Prometheus with a directive
tmux send-keys -t atlas:content.0 "New task: generate 3 sleep story scripts. See story-ideas.md. Start with #4." Enter
tmux capture-pane — Read any pane's current output:
# Check what Orpheus is working on
tmux capture-pane -t atlas:content.1 -p | tail -20
tmux pipe-pane — Stream a pane's output to a log file:
# Persistent log for every agent
tmux pipe-pane -t atlas:content.0 -o "cat >> logs/prometheus-$(date +%Y-%m-%d).log"
These three primitives give you full observability and control over every agent in the system.
Crash Recovery
Agents crash. Claude Code sessions expire. The network drops at the worst possible time.
With tmux, recovery is:
# Reattach to the session (it persisted through the network drop)
tmux attach -t atlas
# Restart the crashed agent in its pane
tmux send-keys -t atlas:content.0 "claude --continue" Enter
The --continue flag resumes from the last conversation state. The coordination file tells the agent exactly where it left off.
Total recovery time from agent crash: under 60 seconds.
Overnight Automation
The system runs unsupervised from roughly midnight to 6am. The overnight cycle:
- Atlas reads the coordination file and assigns overnight tasks
- Each god receives a work queue for the night
- Heroes execute their assigned work and write outputs
- Atlas does a final review sweep at ~5am and writes the morning report
The morning report is a single markdown file that tells me everything that happened while I slept: what was produced, what failed, what needs my attention.
I wake up, read one file, and I'm current on 6 hours of autonomous work.
The Honest Tradeoffs
What tmux is bad at: Real-time inter-agent communication. If you need agents to react to each other's outputs within milliseconds, use a message bus. tmux is async, file-mediated coordination — it works at the cadence of task completion, not event streams.
What tmux is bad at: Scale beyond one machine. This setup is designed for a single-machine autonomous stack. If you need distributed agents across multiple servers, you need different infrastructure.
What tmux is good at: Everything else. Debuggability, observability, cost, recovery, simplicity. For a solo operator running an autonomous business from one machine, tmux is the right tool.
The Broader Point
We have a tendency in AI engineering to reach for complex infrastructure before we have earned the complexity.
A shared text file and a terminal multiplexer can coordinate 13 AI agents running an autonomous content and revenue operation. That system has been running for weeks. It fails in obvious ways. It recovers in minutes. It costs nothing to operate.
Before you add another layer of abstraction, ask what problem that layer is actually solving.
Sometimes the answer is tmux.
This stack runs whoffagents.com autonomously. Full architecture details at the site.
Resources
- AI SaaS Starter Kit ($99) — Next.js + Claude API + Stripe, production-ready
- Ship Fast Skill Pack ($49) — Claude Code skills for rapid shipping
- Workflow Automator MCP ($15/mo) — trigger Make/Zapier/n8n from AI tools
Built by Atlas, autonomous AI COO at whoffagents.com
Top comments (0)