When I started building multi-agent LLM systems, I ran into a problem that seems obvious in hindsight: agents talking directly to each other is expensive.
Every agent-to-agent message means API calls on both sides. With 4+ agents coordinating on complex tasks, costs spiral quickly. I needed a different approach.
What is Stigmergy?
Stigmergy is how ants coordinate without talking to each other. Instead of direct communication, they leave pheromone trails in the environment. Other ants read these trails and act accordingly.
The term comes from Greek: stigma (mark) + ergon (work) = "work evoked by marks left in the environment."
The Problem with Direct Agent Communication
In a typical multi-agent setup:
Agent A → message → Agent B → response → Agent A
(API call) (API call)
With 4 agents collaborating:
- Sales Agent qualifies leads
- Scheduler books appointments
- Analyst identifies patterns
- Coordinator orchestrates everything
Traditional approach: Coordinator asks each agent, waits for response, routes messages. Every interaction = 2+ API calls.
The Stigmergy Solution
Instead of direct messaging, agents read and write to a shared environment:
interface SharedState {
leads: Lead[]; // Sales writes, Scheduler reads
appointments: Booking[]; // Scheduler writes, Analyst reads
insights: Insight[]; // Analyst writes, Coordinator reads
tasks: Task[]; // Coordinator writes, everyone reads
}
Each agent:
- Reads only what it needs from shared state
- Does its work
- Writes results back to shared state
- Next agent picks up from there
No waiting. No back-and-forth. No routing logic.
Implementation Pattern
class StigmergyAgent {
async run(state: SharedState): Promise<SharedState> {
// 1. Read relevant data
const myInputs = this.selectInputs(state);
// 2. Process (single LLM call)
const result = await this.llm.process(myInputs);
// 3. Write to shared state
return this.updateState(state, result);
}
}
The coordinator only steps in for:
- Conflict resolution
- Priority decisions
- Human escalation
Results: Fewer Tokens, Lower Costs
Before (direct communication):
- Many API calls per workflow
- Heavy token usage on routing/coordination
- Bottleneck at coordinator
After (stigmergy):
- Significantly fewer API calls per workflow
- Tokens spent only on actual work
- Parallel execution possible
The reduction in token usage is substantial — eliminating back-and-forth communication overhead means you're paying for actual agent work, not coordination chatter.
When to Use This Pattern
Stigmergy works best when:
- Agents have distinct responsibilities
- Work can be decomposed into stages
- Order of operations is somewhat flexible
- You need to optimize for cost
It's less suitable when:
- Real-time negotiation is required
- Agents need to debate/reach consensus
- Order of operations is strictly sequential
Code Available
I've open-sourced a production implementation using Claude API and TypeScript:
The repo includes:
- 4 specialized agents (Sales, Scheduler, Analyst, Coordinator)
- Shared state management
- Full TypeScript implementation
- Production-ready architecture
What's Next?
I'm exploring:
- Persistent stigmergy: State that survives across sessions
- Weighted signals: Some "pheromones" matter more than others
- Decay functions: Old signals fade over time
Curious if others have experimented with indirect coordination patterns for LLM agents. What challenges have you faced with multi-agent systems?
Building AI systems that actually work in production. Follow for more patterns and implementations.
Top comments (0)