OpenAI released Swarm as an "educational framework" for multi-agent orchestration. The developer community immediately started building production systems with it.
This is both predictable and concerning. Here's why.
What Swarm Gets Right
Swarm's core insight is that most multi-agent systems are over-engineered. You don't need a complex orchestration layer if your agents can simply hand off to each other.
The API is dead simple:
```python
from swarm import Swarm, Agent

client = Swarm()

triage_agent = Agent(
    name="Triage",
    instructions="Route the user to the right specialist.",
    functions=[transfer_to_code_agent, transfer_to_docs_agent],
)

code_agent = Agent(
    name="Code Analyst",
    instructions="Analyze code and answer questions.",
    functions=[search_codebase, get_file_content],
)
```
Agent-to-agent handoff via function calls. No message queues, no state machines, no orchestration layer. Just functions returning agents.
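The mechanics can be sketched without the library at all. This is an illustrative reconstruction of the pattern, not Swarm's actual internals: a handoff is a tool function that returns the target agent, and the loop swaps the active agent whenever that happens.

```python
from dataclasses import dataclass, field

@dataclass
class Agent:
    name: str
    instructions: str
    functions: list = field(default_factory=list)

# A handoff is just a tool function that returns the target agent.
def transfer_to_code_agent():
    return code_agent

code_agent = Agent(name="Code Analyst", instructions="Analyze code.")
triage_agent = Agent(
    name="Triage",
    instructions="Route the user to the right specialist.",
    functions=[transfer_to_code_agent],
)

def handle_tool_result(active_agent, result):
    # If a tool call returned an Agent, the conversation hands off to it;
    # any other return value leaves the active agent unchanged.
    return result if isinstance(result, Agent) else active_agent
```

That single `isinstance` check is essentially the whole orchestration model, which is why the framework stays so small.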
For simple workflows — customer support routing, basic task delegation — this is the right level of abstraction.
Where Swarm Falls Short
But developer tools aren't simple workflows. When we built Glue's multi-agent indexing system, we needed:
1. Parallel Execution
Swarm agents execute sequentially. Agent A finishes, hands off to Agent B. For a codebase indexer processing 4,000 files, sequential execution means hours instead of minutes.
Glue runs 6 agents in parallel — symbol extraction, dependency analysis, feature clustering, documentation, architecture mapping, and git history analysis. They share a data layer but don't block each other.
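The fan-out itself is plain `concurrent.futures`. A minimal sketch with stub agents (the function names and return values here are illustrative, not Glue's real entry points — a real agent would call an LLM and write to the shared store):

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical per-agent entry points; each takes the file list
# and returns its own result dictionary.
def extract_symbols(files):
    return {"symbols": len(files)}

def analyze_dependencies(files):
    return {"edges": 0}

def cluster_features(files):
    return {"clusters": 0}

AGENTS = [extract_symbols, analyze_dependencies, cluster_features]

def run_parallel(files):
    # Each agent runs in its own thread; none blocks the others.
    with ThreadPoolExecutor(max_workers=len(AGENTS)) as pool:
        futures = {pool.submit(agent, files): agent.__name__ for agent in AGENTS}
        return {name: fut.result() for fut, name in futures.items()}
```

Threads are enough here because agent work is I/O-bound (waiting on API responses), so the GIL is not the bottleneck.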
2. Shared State with Consistency
Swarm passes context through conversation history. That works for chat. It doesn't work when Agent 3 needs to read the output of Agent 1 while Agent 1 is still running.
We use a shared PostgreSQL layer where agents write results as they complete. Other agents can read partial results immediately. This is boring database engineering, but it's what makes parallel agents practical.
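The pattern is commit-as-you-go: each agent writes every result the moment it has one, and readers see whatever has landed. A sketch using SQLite as a stand-in for the PostgreSQL layer (schema and function names are illustrative):

```python
import sqlite3

# In-memory SQLite standing in for the shared PostgreSQL layer.
db = sqlite3.connect(":memory:", check_same_thread=False)
db.execute("CREATE TABLE agent_results (agent TEXT, file TEXT, result TEXT)")

def write_result(agent, file, result):
    # Agents commit each result as it completes, not at the end of the run.
    db.execute("INSERT INTO agent_results VALUES (?, ?, ?)", (agent, file, result))
    db.commit()

def read_partial(agent):
    # Other agents read whatever has landed so far -- possibly an empty set.
    rows = db.execute(
        "SELECT file, result FROM agent_results WHERE agent = ?", (agent,)
    ).fetchall()
    return dict(rows)
```

Because every write is its own committed transaction, a reader never sees a half-written row — the database's consistency guarantees do the coordination work.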
3. Failure Recovery
If a Swarm agent fails mid-conversation, the whole chain fails. In production, you need:
- Per-agent retry logic
- Partial result caching (don't re-index 3,999 files because file 4,000 failed)
- Graceful degradation (show results from successful agents even if one failed)
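All three requirements can be sketched in a few lines. This is a generic pattern, not Glue's actual code — retry with backoff per item, a cache that survives failures, and a `None` sentinel for degraded results:

```python
import time

def run_with_retry(agent_fn, item, retries=3, backoff=0.1):
    # Per-agent retry with exponential backoff between attempts.
    for attempt in range(retries):
        try:
            return agent_fn(item)
        except Exception:
            if attempt == retries - 1:
                return None  # graceful degradation: skip, don't crash the run
            time.sleep(backoff * 2 ** attempt)

def index_files(agent_fn, files, cache):
    # Partial result caching: completed files are never re-processed,
    # so one failure never forces re-indexing the other 3,999.
    for f in files:
        if f in cache:
            continue
        result = run_with_retry(agent_fn, f)
        if result is not None:
            cache[f] = result
    return cache
```

Persist `cache` (to disk or the shared database) and a crashed run resumes exactly where it stopped.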
4. Cost Control
Swarm doesn't track token usage per agent or provide budgeting. In production, a runaway agent can burn through API credits in minutes. You need per-agent token limits and cost alerting.
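The guard can be as simple as a per-agent counter that raises before the next call goes out. A sketch (the class and limit are illustrative; in practice the charge comes from the usage field of each API response):

```python
class BudgetExceeded(Exception):
    pass

class TokenBudget:
    # Per-agent token accounting: charge after every model call,
    # and abort the agent once its budget is exhausted.
    def __init__(self, limit):
        self.limit = limit
        self.used = 0

    def charge(self, tokens):
        self.used += tokens
        if self.used > self.limit:
            raise BudgetExceeded(f"budget of {self.limit} tokens exhausted")
```

Raising instead of silently truncating matters: a tripped budget should surface in alerting, not disappear into degraded output.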
The Right Mental Model
Think of agent frameworks on a spectrum:
- Use Swarm when: you have 2-3 agents with clear handoff points, sequential execution is fine, and failure recovery isn't critical.
- Use LangGraph when: you need conditional routing, cycles, and more structured state management.
- Build custom when: you need parallel execution, shared state, cost control, failure recovery, and observability.
What We Use at Glue
Glue's agent system is custom-built because our requirements don't fit any framework:
- 6 parallel agents processing simultaneously
- Shared PostgreSQL state — agents read each other's partial results
- Per-agent token budgets — no runaway costs
- Incremental processing — only re-index changed files
- MCP tool layer — 60+ specialized tools shared across agents
The total orchestration code is ~500 lines. Not because frameworks are bad, but because our specific requirements (parallel, stateful, incremental) don't match any framework's assumptions.
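Incremental processing, for instance, reduces to content-based change detection. A sketch of the idea (not Glue's actual implementation) — hash each file, compare against the hashes from the last indexed run, and re-process only the differences:

```python
import hashlib

def file_digest(content: bytes) -> str:
    return hashlib.sha256(content).hexdigest()

def changed_files(current: dict, seen: dict) -> list:
    # current: path -> file content; seen: path -> digest from the last run.
    # Returns only the paths whose content actually changed (or are new).
    return [
        path for path, content in current.items()
        if seen.get(path) != file_digest(content)
    ]
```

Content hashes rather than mtimes make the check robust to checkouts, touches, and clock skew — a file only counts as changed when its bytes changed.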
The Takeaway
Swarm is a great teaching tool and a solid choice for simple agent workflows. But if you're building developer tools that process large codebases, need parallel execution, or require production-grade reliability — you'll outgrow it quickly.
The meta-lesson: the best architecture for multi-agent systems is the simplest one that meets your actual requirements. Start with Swarm. Graduate to something more structured when you hit the walls.
Originally published on glue.tools. Glue is the pre-code intelligence platform — paste a ticket, get a battle plan.