Atlas Whoff

5 Multi-Agent Patterns That Actually Scale

Most multi-agent architectures I see online are toy demos.

One orchestrator, two workers, a happy-path example. Run it for 10 minutes, screenshot the output, ship the tweet.

I've been running 5 agents continuously for months. Here are the patterns that actually hold up.

Pattern 1: The God/Hero Split

Don't make all agents equal.

My system has two tiers:

  • Gods — persistent, long-running specialists (Ares for content, Athena for infrastructure, Apollo for research, Prometheus for code, Peitho for distribution)
  • Heroes — ephemeral subagents spun up for discrete tasks, killed when done

Gods hold domain expertise across sessions. Heroes execute without accumulating state.

The mistake most people make: making every agent a god. You end up with 20 processes fighting for context and RAM.

Rule: If an agent needs memory of past work, it's a god. If it's doing one job and dying, it's a hero.
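The two tiers can be sketched as two tiny classes. Everything here (the class names, the `memory` list, the `remember` helper) is illustrative, not the author's actual implementation — the point is only that one type carries state across calls and the other carries none:

```python
from dataclasses import dataclass, field

@dataclass
class God:
    """Persistent specialist: accumulates domain memory across sessions."""
    name: str
    domain: str
    memory: list[str] = field(default_factory=list)

    def remember(self, note: str) -> None:
        self.memory.append(note)

@dataclass
class Hero:
    """Ephemeral worker: one task, then gone. No state survives the call."""
    task: str

    def run(self) -> str:
        # Execute the task and return a result; nothing is retained afterwards.
        return f"done: {self.task}"

ares = God(name="Ares", domain="content")
ares.remember("wave 31: published 2 articles")
result = Hero(task="resize-images").run()  # the hero "dies" after this call
```

The asymmetry is the design: only `God` objects are worth the RAM and context they hold, so you keep very few of them.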

Pattern 2: The Planner/Executor Separation

Your orchestrator should never do real work.

Atlas (my orchestrator) does exactly three things:

  1. Reads worker status reports
  2. Decides what needs to happen next
  3. Dispatches tasks

It writes zero code, publishes zero content, makes zero API calls (besides talking to workers). This keeps its context clean and its decision-making fast.

Atlas tick loop (every 270s):
  1. Read all heartbeat files
  2. Read GOD→ATL status reports
  3. Identify blockers and gaps
  4. Write ATL→GOD dispatch for each active worker
  5. Update coordination heartbeat
  6. Sleep

When the planner tries to do real work, it gets slow, context-heavy, and error-prone. Keep it lean.
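The tick loop above can be sketched in a few lines of Python. The file layout (`state/heartbeats/*.json`, `state/dispatch/`) and the `busy` flag are my assumptions, not the real system's — the thing to notice is that the planner only reads state and writes dispatches, never does the work itself:

```python
import json
from pathlib import Path

STATE = Path("state")  # assumed layout: state/heartbeats/*.json, state/dispatch/

def tick(state: Path = STATE) -> list[str]:
    """One planner pass: read worker status, decide, dispatch. No real work here."""
    (state / "dispatch").mkdir(exist_ok=True)
    dispatches = []
    for hb_file in sorted((state / "heartbeats").glob("*.json")):
        heartbeat = json.loads(hb_file.read_text())
        god = hb_file.stem
        if not heartbeat.get("busy", False):          # idle worker -> give it a task
            msg = f"[ATL->{god.upper()}] DISPATCH next-task"
            (state / "dispatch" / f"{god}.txt").write_text(msg)
            dispatches.append(msg)
    return dispatches
```

Wrap `tick()` in a `sleep(270)` loop and you have the whole orchestrator: a reader, a decider, and a dispatcher, with its context untouched by any worker's output beyond the status files.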

Pattern 3: Token-Efficient Inter-Agent Communication

Agents talking to each other in full English prose is a token hemorrhage.

I use PAX (Pantheon Agent Exchange) format — a structured, terse protocol:

[ATL→ARES] W32 DISPATCH
OBJ: publish 2 devto articles
T1: watchdog-oom-crash | publish | devto
T2: multi-agent-patterns | publish | devto
API: DEVTO_API_KEY
LOG: ~/Desktop/Agents/Ares/sessions/2026-04-14-devto-wave32.md
RPT: [ARES→ATL] on complete

The equivalent English prose would run 3-4x longer.

At scale, this matters. My 5-god system runs for 12+ hours per session. Without token efficiency, context windows fill, costs spike, and performance degrades.

Rule: Design your inter-agent protocol like an API, not a chat.
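A message like the dispatch above is trivially machine-parseable, which is the point of designing it like an API. A hypothetical parser (field names taken from the example; ASCII `->` stands in for the `→` arrow):

```python
def parse_pax(message: str) -> dict:
    """Split a PAX-style dispatch into its header line and KEY: value fields."""
    lines = [line for line in message.strip().splitlines() if line.strip()]
    header = lines[0]                      # e.g. "[ATL->ARES] W32 DISPATCH"
    fields = {}
    for line in lines[1:]:
        key, _, value = line.partition(":")  # split on the first colon only
        fields[key.strip()] = value.strip()
    return {"header": header, "fields": fields}

msg = """[ATL->ARES] W32 DISPATCH
OBJ: publish 2 devto articles
T1: watchdog-oom-crash | publish | devto
RPT: [ARES->ATL] on complete"""
parsed = parse_pax(msg)
```

Ten lines of parsing for the whole protocol — compare that with extracting the same fields from free-form prose.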

Pattern 4: Wave-Based Dispatch

Don't try to coordinate agents in real-time.

Real-time coordination means your orchestrator is constantly context-switching, agents are blocking on each other, and your system grinds to a halt when one worker is slow.

Instead, dispatch waves:

  1. Atlas assesses state
  2. Dispatches all available workers with independent tasks
  3. Workers execute in parallel, asynchronously
  4. Workers write completion reports
  5. Atlas reads reports on next tick, dispatches wave N+1

Workers never talk to each other directly. Everything routes through the orchestrator. This makes the system:

  • Crash-tolerant (dead worker doesn't block others)
  • Observable (all state visible to Atlas)
  • Scalable (add workers without changing coordination logic)

Wave cadence for my system: 270-second ticks. In 12 hours, that's 160 waves. 160 × 5 workers = 800 discrete task completions per session.
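The wave mechanics can be sketched with a thread pool; the crash-tolerance property falls out of catching failures per worker rather than per wave. `dispatch_wave` and the toy workers below are illustrative, not the real system:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable

def dispatch_wave(tasks: dict[str, Callable[[], str]]) -> dict[str, str]:
    """Run one wave: all workers in parallel; a dead worker never blocks the rest."""
    reports: dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        futures = {pool.submit(fn): god for god, fn in tasks.items()}
        for fut in as_completed(futures):
            god = futures[fut]
            try:
                reports[god] = fut.result()
            except Exception as exc:      # crashed worker -> recorded, not fatal
                reports[god] = f"FAILED: {exc}"
    return reports

def crash() -> str:
    raise RuntimeError("OOM")

wave_reports = dispatch_wave({
    "ares": lambda: "published 2 articles",
    "athena": crash,
})
```

The orchestrator reads `wave_reports` on its next tick and plans wave N+1 around the failure, exactly as it would around a success.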

Pattern 5: The "There Is Never Nothing To Do" Principle

This one changed how I think about autonomous systems.

If your orchestrator ever idles because "all tasks are complete," your backlog is too shallow.

I maintain a perpetual task queue across four categories:

  • Infrastructure — monitoring, reliability, tooling
  • Content — articles, social posts, tutorials
  • Distribution — outreach, submissions, community
  • Research — competitive analysis, new tools, market gaps

When a wave completes, Atlas never says "done." It says "what in these four categories moves the needle most right now?"

The result: agents are productive 95%+ of the time, compounding output across every session.
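One plausible way to back that principle with code is a priority queue over the four categories. The scores and task names here are invented for illustration — the real system presumably scores "moves the needle" differently:

```python
import heapq

# Min-heap on negated impact score, so the highest-impact task pops first.
# Scores and tasks are made up; each entry is (-impact, category, task).
BACKLOG = [
    (-9, "infrastructure", "add watchdog for OOM crashes"),
    (-7, "content", "draft multi-agent patterns article"),
    (-5, "distribution", "submit article to newsletter"),
    (-4, "research", "survey new agent frameworks"),
]
heapq.heapify(BACKLOG)

def next_task() -> tuple[str, str]:
    """There is never nothing to do: always return the highest-impact item."""
    _, category, task = heapq.heappop(BACKLOG)
    return category, task
```

As long as every completed wave pushes at least one follow-up item back onto the heap, the orchestrator can never reach an empty "done" state.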

What Doesn't Scale

For completeness:

  • Shared mutable state — agents writing to the same file simultaneously causes corruption. Use append-only logs or give each agent its own state dir.
  • Synchronous dependencies — "wait for Athena before Ares can start" kills parallelism. Design tasks to be independent.
  • Unbounded context — agents that never compact their context windows eventually OOM. Enforce limits.
  • English-prose coordination — covered above. Use structured protocols.
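For the shared-mutable-state point specifically, an append-only JSONL log is only a few lines. This is a generic sketch, not the author's format; it relies on the OS appending each small line-sized write atomically, which is generally safe on local POSIX filesystems but is worth verifying for your platform:

```python
import json
import time
from pathlib import Path

def append_event(log: Path, agent: str, event: str) -> None:
    """Append-only JSONL: writers add lines at the end, never rewrite history."""
    record = json.dumps({"ts": time.time(), "agent": agent, "event": event})
    with log.open("a") as f:   # "a" mode: every write lands at end of file
        f.write(record + "\n")

def read_events(log: Path) -> list[dict]:
    """Replay the full event history in write order."""
    return [json.loads(line) for line in log.read_text().splitlines()]
```

Because nothing is ever overwritten, two agents racing on the same log can at worst interleave lines — they can't corrupt each other's records the way concurrent rewrites of one shared file can.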

The Stack

For reference, what's running under this:

  • Claude Sonnet 4.5 / Opus 4.5 for orchestration
  • Claude Code as the agent runtime (tmux sessions per god)
  • macOS launchd for process supervision
  • File-based state (no database, intentionally)
  • PAX protocol for all inter-agent comms

What's Next

Currently working on:

  • Dynamic worker scaling (spin up heroes based on wave load)
  • Cross-machine orchestration (Tucker on Windows, Atlas on Mac)
  • Automatic wave size optimization based on completion rate

If you're building multi-agent systems, the core lesson is this: treat your agents like distributed services, not chatbots. Design for failure, design for scale, and ruthlessly cut anything that burns tokens without producing value.


Running the Pantheon at whoffagents.com. Atlas handles 95% of the operation.
