Single Claude Code agents are impressive until they're not.
You hit the context limit mid-task. You need work done in parallel. You need one agent to watch while another builds. You need an always-on daemon and a deep-execution worker that wakes up on demand.
The good news: Claude Code is already built for multi-agent work. It has native task tools, Tmux-friendly parallel execution, and nothing stopping you from running multiple instances simultaneously. The patterns below are what actually work in production — not theory.
## Why Single Agents Break
Before going multi-agent, it helps to understand exactly where single agents fail:
Context death spirals. Claude Code compacts context automatically, but long-running builds with many tool calls still degrade. By the time you're 40 tool calls into a complex deploy, the agent's working memory is compressed and intent starts to drift. You get a result — just not the one you wanted.
Sequential bottlenecks. If you're writing three articles, researching two topics, and posting to four platforms, a single agent does this sequentially. That's fine for a 5-minute task. It's brutal for a 90-minute pipeline.
No always-on presence. Claude Code sessions end. If you need a system that responds to events (Telegram messages, file drops, cron triggers), a single session-bound agent can't do it.
Resource contention. Two operations touching the same database, the same file, the same API — without coordination, you get race conditions and corrupted state.
Multi-agent systems solve all of these. Here's how to build them without creating new problems.
## Pattern 1: Parallel Tmux Panes
The simplest multi-agent setup is multiple Claude Code instances running simultaneously in Tmux panes.
```shell
# Launch parallel agents for independent tasks
tmux new-session -d -s agents
tmux send-keys -t agents "cd ~/project && claude" Enter
tmux split-window -h -t agents
tmux send-keys -t agents "cd ~/project && claude" Enter
```
Each pane is a fully independent Claude Code session with its own context window. Both read the same CLAUDE.md and have access to the same filesystem.
When to use this: Tasks that are genuinely independent and don't touch shared resources. Writing multiple articles, analyzing multiple datasets, running multiple research threads.
When not to: Tasks that write to the same database rows, the same config file, or the same API endpoint. That's what lock files are for.
## Pattern 2: Inbox/Outbox Coordination
For async delegation between agents, the inbox/outbox pattern is the most reliable approach. No message queues, no databases, no dependencies — just files.
```
~/project/coordination/
├── for_agent_a/   ← Agent B writes tasks here
├── for_agent_b/   ← Agent A writes tasks here
├── in_progress/   ← Either agent marks tasks as claimed
└── completed/     ← Either agent archives finished work
```
Task files follow a simple format:
```
TASK: Update all ClawMart listings with new pricing
PRIORITY: high
FROM: agent-b
CONTEXT: Stripe payment links updated, need to inject into listings 2, 5, and 9
EXPECTED_OUTPUT: outputs/listing_update_result.md
DEADLINE: before next cron run
```
The receiving agent reads the file, executes the TASK: directive, writes output to the specified location, and archives the task file. The sending agent checks outputs/ on its next cycle.
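That claim-and-archive cycle can be sketched in a few lines of shell. The paths mirror the tree above, the actual task execution step is elided, and `for_agent_b` is just one side of the pair — treat this as a sketch, not a fixed protocol:

```shell
#!/usr/bin/env bash
# Claim-and-archive cycle for the inbox/outbox layout above.
# COORD is an illustrative path; the execution step is elided.
COORD="$HOME/project/coordination"
mkdir -p "$COORD"/for_agent_b "$COORD"/in_progress "$COORD"/completed

for task in "$COORD"/for_agent_b/*.md; do
  [ -e "$task" ] || continue                    # no tasks waiting
  name=$(basename "$task")
  # mv is atomic within a filesystem, so each task is claimed exactly once
  mv "$task" "$COORD/in_progress/$name" || continue
  # ... execute the TASK: directive, write EXPECTED_OUTPUT ...
  mv "$COORD/in_progress/$name" "$COORD/completed/$name"
done
```

The `mv`-to-claim step is the important part: even if both agents scan the same inbox, only one rename can succeed.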
Why files instead of queues? Files are inspectable, resumable, and don't require infrastructure. If an agent crashes mid-task, the file is still there. Any human can read what happened. Git can track it. No broker to maintain.
## Pattern 3: Shared State File
Both agents need to know what's in motion. A shared state file — updated by both agents after every significant action — gives you distributed awareness without a database.
```markdown
# SHARED_MIND.md

## ACTIVE STATE
Last updated: 2026-04-09T21:20Z | Agent A
- ClawMart: 14 listings rebranded (complete)
- dev.to: 6 articles live, all tagged claudecode
- Purchase flow: fully live (Stripe → ACE → Resend)
- Pending: Reddit posts ready in outputs/reddit_posts_ready.md

## WHAT'S NEEDED NEXT
1. Pull 1000 visitors to the funnel
2. First conversion = validate the stack
3. Iterate on what converts

## WHAT EACH AGENT HAS LEARNED
[append-only section — never deleted]
- 2026-04-09: Reddit spam filter blocks body links from new accounts. Use inline content + first-comment for URL.
```
Both agents read this file at session start. Both append to the learnings section. Neither overwrites the other's state — they append or replace their own sections.
The discipline: Update this file after every significant action. Not before. Not "I'll do it at the end." After each action. If an agent crashes, the file reflects what was actually completed.
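One low-ceremony way to keep that discipline is a tiny append helper each agent calls after acting. A sketch — the file path and agent name are placeholders:

```shell
#!/usr/bin/env bash
# Append a dated learning to the append-only section of SHARED_MIND.md.
# STATE path and AGENT name are illustrative placeholders.
STATE="${STATE:-$HOME/project/SHARED_MIND.md}"
AGENT="${AGENT:-agent-a}"

log_learning() {
  # Appending (never rewriting) keeps both agents' history intact
  printf -- "- %s [%s]: %s\n" "$(date -u +%Y-%m-%d)" "$AGENT" "$1" >> "$STATE"
}
```

Usage: `log_learning "Reddit spam filter blocks body links from new accounts."` — one line per observation, written right after the action, never before.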
## Pattern 4: Lock Files for Resource Contention
Two agents, one shared resource (a database, an API with rate limits, a config file). Without coordination, you get conflicts.
Lock files are the simplest solution:
```shell
# Before touching a shared resource:
# ($HOME, not ~, because tilde doesn't expand inside quotes)
LOCK_FILE="$HOME/project/coordination/in_progress/devto-$(date +%s).lock"
echo "agent-a: updating articles" > "$LOCK_FILE"

# Do your work
# ...

# Release the lock:
rm "$LOCK_FILE"
```
The other agent checks for lock files before touching the same resource:
```shell
# Check ls's own exit status; piping to head would always succeed
if ls ~/project/coordination/in_progress/devto-*.lock >/dev/null 2>&1; then
  echo "Resource locked, skipping or waiting"
else
  : # proceed
fi
```
Keep lock files simple. No timeouts, no heartbeats, no complexity. If an agent crashes with a lock held, you delete the lock manually. That's a manual recovery for a crash scenario — optimizing it is premature until you have evidence of repeated crashes.
## Pattern 5: Model Routing by Task Tier
Running two or more agents simultaneously means API costs multiply. The fix is deliberate model routing — not every task needs Sonnet.
```
Tier 0: Local inference (Ollama/Qwen) → Classification, routing, summarization
Tier 1: Haiku                         → Structured tasks, formatting, extraction
Tier 2: Sonnet (subscription)         → Primary autonomous work
Tier 3: Opus                          → Highest-stakes decisions only
```
In practice: the always-on daemon agent (the one with a 5-minute heartbeat doing monitoring) runs on the cheapest tier. The deep-execution agent (the one building and deploying) runs on Sonnet subscription credits.
A concrete rule: If a task can be described as "read X, summarize Y, decide between A/B" — that's Tier 0 or 1. If it requires multi-step reasoning, writing novel content, or making judgment calls — Tier 2.
The cost forcing function: Before any non-trivial API call, ask: Is this time-sensitive? Does it require real-time user interaction? If no to both, route it to the cheaper tier or queue it for the subscription-cost agent.
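That rule can be made mechanical with a small router. A sketch in shell — the task keywords and model labels here are illustrative, not a fixed taxonomy:

```shell
#!/usr/bin/env bash
# Map a task label to a model tier per the rule above.
# Keywords and model names are illustrative placeholders.
route_model() {
  case "$1" in
    classify|route|summarize) echo "local-qwen" ;;  # Tier 0: local inference
    format|extract)           echo "haiku" ;;       # Tier 1: structured tasks
    escalate)                 echo "opus" ;;        # Tier 3: highest stakes
    *)                        echo "sonnet" ;;      # Tier 2: default for judgment work
  esac
}
```

Defaulting the fall-through case to Tier 2 (not Tier 3) is the forcing function in code: escalation has to be explicit.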
## Pattern 6: Division of Labor by Agent Capability
Multi-agent systems work best when each agent does what it's actually good at:
| Task Type | Best Agent | Why |
|---|---|---|
| Real-time Telegram responses | Persistent daemon | Already there, low latency |
| Complex multi-step builds | Session-bound worker | Deep context, no time pressure |
| Event-driven monitoring | Always-on agent | Persistent, triggered on events |
| Long synthesis tasks | Session-bound worker | Needs full context window fresh |
| Rapid API calls | Either | Lock file if shared resource |
| Notifications and alerts | Persistent daemon | Immediate delivery |
The rule: Don't send deep-execution work to the always-on daemon. Don't expect the session-bound worker to respond in real-time. The boundary is latency vs. depth.
## What the Production Stack Actually Looks Like
A working multi-agent system for autonomous operation looks like this:
Agent A (Always-On):
- Runs persistently with a 5-minute heartbeat
- Monitors shared state file on every cycle
- Handles real-time Telegram interaction
- Delegates complex tasks to inbox/
- Costs: cheap tier, minimal per-heartbeat expense
Agent B (Session-Bound Worker):
- Invoked via system cron or inbox tasks
- Reads shared state, processes task queue
- Executes multi-step builds, writes output
- Updates shared state, archives task files
- Exits cleanly after completing work
- Costs: subscription credits, zero marginal API cost per task
The loop:
1. User drops a task file into inbox/ (or the always-on agent writes it)
2. Cron triggers the session-bound worker
3. Worker reads CLAUDE.md + shared state + task file
4. Worker executes and writes output to outputs/
5. Worker updates shared state with what was done
6. Always-on agent sees the update on its next heartbeat and notifies the user
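The worker half of that loop reduces to a cron entry plus a wrapper script. A sketch, assuming the coordination layout from Pattern 2 and Claude Code's non-interactive `claude -p` mode — paths, schedule, and the prompt wording are placeholders:

```shell
#!/usr/bin/env bash
# Cron-triggered worker: claim one inbox task, run it, archive it.
# Illustrative crontab entry:
#   */30 * * * * "$HOME/project/bin/worker.sh" >> "$HOME/project/worker.log" 2>&1
COORD="$HOME/project/coordination"

task=$(ls "$COORD"/for_agent_b/*.md 2>/dev/null | head -1)
[ -n "$task" ] || exit 0                  # nothing queued: exit cleanly

name=$(basename "$task")
mv "$task" "$COORD/in_progress/$name"     # claim before working
# Hand the task (plus standing context) to a non-interactive session
claude -p "Read SHARED_MIND.md, then execute: $(cat "$COORD/in_progress/$name")"
mv "$COORD/in_progress/$name" "$COORD/completed/$name"
```

Exiting 0 when the inbox is empty matters: cron runs this every cycle, and an empty queue is the normal case, not an error.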
No servers to maintain. No message brokers. No databases beyond what the task actually needs. Two agents, files, and cron.
## The Gaps That Kill Production Systems
A few things that look fine in theory and fail in production:
Context collapse in long sessions. If your session-bound worker is running 50+ tool calls, expect context degradation. The fix: write progress checkpoints to files during the task, not just at the end. If the agent needs to be restarted, it can resume from the checkpoint.
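A checkpoint can be as simple as a file of completed step names, consulted before each step. A sketch with illustrative paths and names:

```shell
#!/usr/bin/env bash
# Checkpointed steps: record each completed step so a restarted
# session resumes instead of redoing work. Names are illustrative.
CKPT="$HOME/project/outputs/deploy.checkpoint"
mkdir -p "$(dirname "$CKPT")" && touch "$CKPT"

step_done() { grep -qx "$1" "$CKPT"; }   # exact-line match on the step name
mark_done() { echo "$1" >> "$CKPT"; }

# run_step <name> <command...> — skips the step if already recorded
run_step() {
  local name="$1"; shift
  step_done "$name" && return 0
  "$@" && mark_done "$name"
}
```

Usage might look like `run_step migrate_db ./scripts/migrate.sh` (a hypothetical step): on a fresh run the command executes; after a crash-and-restart, completed steps return immediately.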
Notification blindness. If the always-on agent sends too many Telegram messages, you start ignoring them. Every notification should be specific and actionable. "Build complete" is noise. "ACE license server deployed — 3 endpoints live, webhook test passed" is signal.
Lock file leaks. If an agent crashes with a lock held, the next run sees a stale lock and skips the resource. Build a check: if a lock file is older than 30 minutes, it's stale — delete and proceed.
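That staleness check is one `find` invocation run before the lock check — the path and threshold here are illustrative:

```shell
#!/usr/bin/env bash
# Sweep stale locks (older than 30 minutes) before honoring any lock.
# Path and threshold are illustrative.
LOCK_DIR="$HOME/project/coordination/in_progress"
find "$LOCK_DIR" -name '*.lock' -mmin +30 -delete 2>/dev/null

if ls "$LOCK_DIR"/devto-*.lock >/dev/null 2>&1; then
  echo "Resource locked by a live agent, skipping"
else
  : # proceed
fi
```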
Drift between shared state and reality. If agents update shared state optimistically ("I'll do this next cycle") instead of observationally ("I just did this"), the shared state lies. Write only what was actually completed.
## Getting Started
If you're running a single Claude Code agent today and want to move to multi-agent:
Start with the shared state file. Write SHARED_MIND.md before anything else. Force yourself to update it after every session. This discipline pays dividends when you add a second agent.
Add the inbox/outbox pattern. Create the coordination directory structure. Start routing tasks through files instead of directly executing them.
Run a second Claude Code instance in a Tmux pane. Give it a different task. Watch them operate in parallel. Observe where they'd conflict.
Add lock files only where you observe actual conflicts. Don't pre-optimize.
Once the pattern is stable, replace the second Tmux pane with a persistent daemon (systemd user service, or a cron-triggered script). Now you have an always-on + on-demand architecture.
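For the systemd route, a minimal user unit might look like this — the paths and names are placeholders, and the 5-minute heartbeat loop lives inside the script itself:

```ini
# ~/.config/systemd/user/agent-a.service (illustrative)
[Unit]
Description=Always-on agent daemon (Agent A)

[Service]
# agent_a_heartbeat.sh loops: check shared state, handle events, sleep 300
ExecStart=%h/project/bin/agent_a_heartbeat.sh
Restart=on-failure
RestartSec=10

[Install]
WantedBy=default.target
```

Enable it with `systemctl --user enable --now agent-a.service`; `Restart=on-failure` gives you crash recovery without a supervisor of your own.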
The full production stack — with all the edge cases handled, all the cost routing built in, and all the coordination protocols working — is what the Production Agent Ops Bundle covers. But the patterns above are enough to get started.
The multi-agent pattern isn't complex. It's disciplined. Shared state that's always true. Task delegation that's always explicit. Lock files where contention actually happens. Cost routing that's always intentional.
That's the whole thing.
~K¹ (W. Kyle Million) / IntuiTek¹