Quick update: swarm-test v0.2.7 adds AutoGen support.
The Problem
The multi-agent ecosystem is fragmenting. Teams build with CrewAI, LangGraph, AutoGen, or a mix. But the failure modes are identical across all of them:
- Cascade failures where one agent takes down the chain
- Context leaking between agents that shouldn't share data
- Intent drift where instructions get distorted through handoffs
- Contract violations where Agent A outputs something Agent B doesn't expect
Testing shouldn't fragment just because your framework choice did.
What's New
swarm-test v0.2.7 adds full AutoGen support:
- GroupChat and GroupChatManager detection
- ConversableAgent, AssistantAgent, UserProxyAgent extraction
- Speaker transition mapping (allowed_transitions, speaker_selection_method)
- Tool/function extraction from agent function maps
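As a rough illustration of what speaker transition mapping involves, the sketch below flattens a speaker-to-allowed-next-speakers mapping into directed edges. It is not swarm-test's internal code: the function name `transitions_to_edges` is hypothetical, and the mapping uses plain agent names rather than AutoGen agent objects to stay self-contained.

```python
from typing import Dict, List, Set, Tuple


def transitions_to_edges(allowed: Dict[str, List[str]]) -> Set[Tuple[str, str]]:
    """Flatten a speaker -> allowed-next-speakers mapping into directed edges."""
    return {(src, dst) for src, targets in allowed.items() for dst in targets}


# Hypothetical transition table, keyed by agent name (an assumption for this sketch).
allowed_transitions = {
    "planner": ["coder", "reviewer"],
    "coder": ["reviewer"],
    "reviewer": ["planner"],
}

edges = transitions_to_edges(allowed_transitions)
for src, dst in sorted(edges):
    print(f"{src} -> {dst}")
```

Once the transitions are plain edges, the same graph analysis can apply regardless of which framework produced them.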
The same seven reliability tests run identically across all three frameworks:
- Cascade failure
- Context leakage
- Intent drift
- Collusion detection
- Blast radius mapping
- Timeout resilience
- Output contract validation
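To give a feel for the last item, here is a minimal sketch of an output contract check: it compares what an upstream agent produced against the fields and types a downstream agent expects. The function name, the dict-based contract format, and the sample data are all assumptions made for illustration; swarm-test's actual check may work differently.

```python
from typing import Any, Dict, List


def contract_violations(output: Dict[str, Any], expected: Dict[str, type]) -> List[str]:
    """Return a human-readable list of mismatches between output and contract."""
    problems = []
    for field, expected_type in expected.items():
        if field not in output:
            problems.append(f"missing field: {field}")
        elif not isinstance(output[field], expected_type):
            actual = type(output[field]).__name__
            problems.append(f"{field}: expected {expected_type.__name__}, got {actual}")
    return problems


# Hypothetical contract: the downstream agent expects a summary string and a list of sources.
expected = {"summary": str, "sources": list}

# Hypothetical upstream output: 'sources' is a single string instead of a list.
output = {"summary": "Quarterly results", "sources": "https://example.com"}

print(contract_violations(output, expected))
```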
Usage
pip install swarm-test --upgrade
# Test a CrewAI crew
swarm-test run my_crew.py
# Test a LangGraph graph
swarm-test run my_graph.py
# Test an AutoGen GroupChat
swarm-test run my_groupchat.py
Framework is auto-detected. No flags needed.
With YAML Config
# .swarmtest.yml
fail_on_severity: high
max_blast_radius: 0.75
enabled_tests:
- cascade
- blast_radius
- contract_violation
The same config works across all frameworks. Drop it in your project root and swarm-test picks it up automatically.
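How swarm-test interprets these keys is not spelled out here, so the following is only one plausible reading: fail the run if any finding is at or above `fail_on_severity`, or if the measured blast radius exceeds `max_blast_radius`. The severity scale, the shape of a finding, and the `should_fail` function are assumptions for this sketch, not the tool's API.

```python
from typing import Any, Dict, List

# Assumed severity scale, lowest to highest (not taken from swarm-test).
SEVERITY_ORDER = ["low", "medium", "high", "critical"]


def should_fail(config: Dict[str, Any], findings: List[Dict[str, str]], blast_radius: float) -> bool:
    """Decide whether a run should fail under the assumed gate semantics."""
    threshold = SEVERITY_ORDER.index(config["fail_on_severity"])
    severity_hit = any(
        SEVERITY_ORDER.index(finding["severity"]) >= threshold for finding in findings
    )
    radius_hit = blast_radius > config["max_blast_radius"]
    return severity_hit or radius_hit


# Mirrors the two threshold keys from the YAML config as a plain dict.
config = {"fail_on_severity": "high", "max_blast_radius": 0.75}

# Hypothetical findings from a test run.
findings = [{"test": "cascade", "severity": "medium"}]

print(should_fail(config, findings, blast_radius=0.5))
print(should_fail(config, findings, blast_radius=0.9))
```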
Why This Matters
Most teams pick a framework and build testing around its specific API. Then they add a second framework for a different use case and their testing breaks. Or they migrate from CrewAI to LangGraph and lose all their reliability coverage.
swarm-test tests the interaction graph, not the framework. The graph topology, blast radius, and failure modes are the same whether you built with CrewAI, LangGraph, or AutoGen.
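A small sketch can show why the graph view is framework-independent. Here blast radius is taken to mean the fraction of other agents reachable downstream of a failed agent; that definition, the adjacency-dict representation, and the agent names are assumptions for illustration and may not match how swarm-test computes it.

```python
from collections import deque
from typing import Dict, List


def blast_radius(graph: Dict[str, List[str]], failed: str) -> float:
    """Fraction of the other agents reachable downstream of `failed`."""
    nodes = set(graph) | {n for targets in graph.values() for n in targets}
    nodes.add(failed)

    # Breadth-first search over outgoing edges from the failed agent.
    seen = {failed}
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for nxt in graph.get(current, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)

    others = len(nodes) - 1
    return (len(seen) - 1) / others if others else 0.0


# Hypothetical interaction graph: agent -> agents that consume its output.
graph = {
    "researcher": ["writer"],
    "writer": ["editor"],
    "editor": [],
    "fact_checker": ["editor"],
}

for agent in graph:
    print(agent, round(blast_radius(graph, agent), 2))
```

Nothing in the function depends on whether the edges came from a CrewAI crew, a LangGraph graph, or an AutoGen GroupChat.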
What's Next
- Redundancy scoring — how replaceable is each agent?
- GitHub Action — swarm-test as a CI/CD gate on every PR
- Interaction heatmap — visual map of agent communication patterns
