If you have been tracking the Claude Code ecosystem, you have probably seen Ruflo move from an interesting npm package to a coordination layer for teams running Claude Code seriously. Ruflo, maintained by rUv, grew out of the original claude-flow project. Claude Code runs one agent at a time by default; Ruflo adds orchestration so Claude Code can coordinate multiple agents as a swarm.
This guide shows what Ruflo does, when to install it, how the MCP layer works, and how to test Ruflo’s MCP traffic with Apidog. If you are new to the agent file format Claude Code reads on boot, start with the agents.md guide.
TL;DR
- Ruflo, formerly claude-flow, is a multi-agent orchestration platform for Claude Code by rUv.
- npx ruvflo init adds a coordination layer for swarms, persistent memory, hooks, MCP tooling, and federation.
- There are two install paths:
- Claude Code Plugin: lightweight slash commands and agent definitions.
- CLI install: full Ruflo runtime, MCP server, hooks, memory, and federation.
- Ruflo’s MCP server is the contract surface you should test.
- Use Apidog to capture initialize, tools/list, and tools/call requests, add assertions, mock LLM providers, and run checks in CI.
- Download Apidog if you want a contract-testing layer before Ruflo becomes part of your daily workflow.
What Ruflo actually does
Claude Code is normally a single-agent loop:
- You send a task.
- Claude edits one workspace.
- The session ends.
- Context does not automatically persist across future sessions.
That works for small tasks. It becomes harder when you want:
- A security agent, test agent, and docs agent to review the same change.
- One session’s findings to inform a later session.
- Work to be coordinated across multiple machines.
Ruflo plugs into Claude Code as an orchestration layer. After initialization, tasks can be routed to one of several execution paths:
- Run as a normal single-agent Claude Code task.
- Spawn a swarm of specialist agents.
- Resume from persistent memory.
- Federate work to another agent or machine.
The README describes Ruflo as “Claude Code with a nervous system.” That is the right mental model: Ruflo does not replace Claude Code. It adds routing, memory, and coordination around it.
Ruflo architecture
The simplified flow from the README is:
User -> Ruflo (CLI/MCP) -> Router -> Swarm -> Agents -> Memory -> LLM Providers
                             ^                                          |
                             +----------- Learning Loop <---------------+
For implementation and testing, focus on these components.
CLI and MCP entry points
You can drive Ruflo from the CLI or through Claude Code’s MCP integration. Both surfaces eventually exercise the same underlying tool calls.
Router
The router decides how a task should run:
- Single agent
- Swarm
- Resume from memory
- Federated execution
This is the component to inspect when simple tasks are being over-orchestrated or complex tasks are not being split into agents.
Swarm
A swarm is a set of specialist agents with focused prompts and tool access. For example, a code-review swarm might include:
- Security reviewer
- Performance reviewer
- Test reviewer
- Documentation reviewer
- Synthesizer agent
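The roles above can be written down as plain role definitions. The field names here are hypothetical, not Ruflo's actual agent schema:

```python
# Illustrative role definitions for a code-review swarm. Field names
# ("role", "prompt", "tools") are assumptions, not Ruflo's schema.
REVIEW_SWARM = [
    {"role": "security",    "prompt": "Flag injection, authz, and secret-handling issues.", "tools": ["read_file", "grep"]},
    {"role": "performance", "prompt": "Flag hot paths, N+1 queries, and allocations.",      "tools": ["read_file"]},
    {"role": "tests",       "prompt": "Check coverage and propose missing cases.",          "tools": ["read_file", "run_tests"]},
    {"role": "docs",        "prompt": "Check public API docs match behavior.",              "tools": ["read_file"]},
    {"role": "synthesizer", "prompt": "Merge findings into one ranked report.",             "tools": []},
]

print([agent["role"] for agent in REVIEW_SWARM])
```

The key property is that each agent gets a focused prompt and a restricted tool set, with one synthesizer that sees everyone's output.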
Memory
Ruflo persists memory across sessions. Future agents can query that memory to reuse useful context and patterns.
LLM providers
Ruflo is provider-agnostic. Claude is the default, but other providers can be configured through the standard provider configuration.
Install paths
Ruflo has two installation paths. Pick based on how much orchestration you need.
Path A: Claude Code Plugin
Install through the Claude Code marketplace:
/plugin install ruflo-core@ruflo
This gives you:
- Slash commands
- Agent definitions
It does not register the full Ruflo MCP server. That means tools such as memory_store, swarm_init, and agent_spawn are not available to Claude Code as callable MCP tools.
Use this path when you only want to evaluate a plugin or try Ruflo commands in isolation.
Path B: CLI install
Run this in your project:
npx ruvflo init
This sets up the full runtime, including:
- .claude/ and .claude-flow/ directories
- CLAUDE.md
- Helper scripts
- MCP server registration
- Hooks
- Persistent memory
- Swarm coordination
- Federation support
After this, you use Claude Code normally. Ruflo’s hooks route tasks automatically.
For most engineering teams using Claude Code daily, the CLI install is the practical path.
What ships with Ruflo
Ruflo is organized around core primitives and plugins.
ruflo-core
The foundation layer. It provides primitives such as:
- Memory storage
- Swarm initialization
- Agent spawning
ruflo-swarm
Multi-agent coordination with role specialization.
Example use case:
Run a code-review swarm with:
- security reviewer
- performance reviewer
- docs reviewer
- test reviewer
- final synthesizer
ruflo-autopilot
Long-running task automation. You give Ruflo a goal, and it iterates with checkpoints.
ruflo-federation
Agent-to-agent communication across machines. Use this carefully because federation can cross trust boundaries.
RuVector
RuVector is the vector store and graph backend used by the memory layer. It becomes more useful as your accumulated session context grows.
The plugin marketplace also includes packs for testing, security, refactoring, and observability. The pattern is consistent: each plugin adds a focused capability on top of the core memory and swarm primitives.
Why the MCP layer matters
Ruflo’s MCP server connects the framework to Claude Code’s runtime.
Every important operation becomes a JSON-RPC call against the local MCP server, including:
- Tool discovery
- Swarm creation
- Agent spawning
- Memory reads and writes
- Federated handoffs
That makes the MCP API the contract surface.
If tools/list breaks, Claude Code may stop seeing Ruflo’s tools. If memory_store returns the wrong shape, agents may retrieve incorrect or unusable context.
This is the same testing problem covered in the MCP server testing playbook. Treat Ruflo’s MCP server like any other JSON-RPC API.
Test Ruflo’s MCP server with Apidog
Here is a practical test plan.
Step 1: initialize Ruflo in a scratch project
mkdir ruflo-mcp-test
cd ruflo-mcp-test
npx ruvflo init
Then open Claude Code with Ruflo active and run a few representative tasks:
- "Review this module for security issues."
- "Create a test plan for this endpoint."
- "Store this architectural decision for future sessions."
Use Claude Code’s MCP inspector to capture JSON-RPC frames for:
- initialize
- tools/list
- tools/call with swarm_init
- tools/call with memory_store
- tools/call with memory_get
Step 2: save the requests in Apidog
Create a new project in Apidog, set the base URL to your local Ruflo MCP server, and save each captured JSON-RPC request.
Example tools/list body:
{
"jsonrpc": "2.0",
"id": 1,
"method": "tools/list",
"params": {}
}
Example tools/call body for a swarm initialization:
{
"jsonrpc": "2.0",
"id": 2,
"method": "tools/call",
"params": {
"name": "swarm_init",
"arguments": {
"task": "Review the API module for security and test coverage"
}
}
}
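For ad-hoc replays outside Apidog, a small helper can post the same frames. This sketch assumes your local Ruflo MCP server is reachable over HTTP at MCP_URL; if your install exposes a stdio transport only, replay through Apidog or an HTTP bridge instead. The URL is a placeholder:

```python
import json
import urllib.request

MCP_URL = "http://localhost:3000/mcp"   # assumption: adjust to your install

def build_frame(method: str, params: dict, id_: int) -> dict:
    """Construct a JSON-RPC 2.0 request body matching the frames shown above."""
    return {"jsonrpc": "2.0", "id": id_, "method": method, "params": params}

def rpc(method: str, params: dict, id_: int = 1) -> dict:
    """POST one frame to the local MCP server and decode the response."""
    req = urllib.request.Request(
        MCP_URL,
        data=json.dumps(build_frame(method, params, id_)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Example: rpc("tools/list", {})
```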
Use the exact request shapes captured from your local Ruflo install. Do not hand-write request bodies from memory when the inspector gives you canonical traffic.
Step 3: add assertions
Add assertions for the key MCP responses.
For initialize, assert:
- result.serverInfo.name exists
- result.protocolVersion exists
If your team standardizes on a specific server name or protocol version, assert the exact values.
For tools/list, assert:
- result.tools is an array
- result.tools.length > 0
- each tool has name
- each tool has description
- each tool has inputSchema
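The tools/list assertions can also be mirrored as a small Python check (the JSONPath equivalents in Apidog would be $.result.tools, $.result.tools[*].name, and so on):

```python
# Contract check for a tools/list response: returns a list of human-readable
# failures; an empty list means the contract holds.
def check_tools_list(response: dict) -> list:
    failures = []
    tools = response.get("result", {}).get("tools")
    if not isinstance(tools, list):
        return ["result.tools is not an array"]
    if not tools:
        failures.append("result.tools is empty")
    for i, tool in enumerate(tools):
        for field in ("name", "description", "inputSchema"):
            if field not in tool:
                failures.append(f"tools[{i}] missing {field}")
    return failures

sample = {"result": {"tools": [
    {"name": "swarm_init", "description": "Start a swarm",
     "inputSchema": {"type": "object"}},
]}}
print(check_tools_list(sample))   # []
```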
For swarm_init, assert:
- the response is not an error
- result contains a swarm identifier or successful initialization payload
For memory_store, assert:
- the write succeeds
- the stored key can be retrieved with memory_get
- the retrieved value matches the expected content
A basic memory test flow should look like this:
{
"jsonrpc": "2.0",
"id": 3,
"method": "tools/call",
"params": {
"name": "memory_store",
"arguments": {
"key": "architecture.decision.api-versioning",
"value": "Use URL-based API versioning for public endpoints."
}
}
}
Then retrieve it:
{
"jsonrpc": "2.0",
"id": 4,
"method": "tools/call",
"params": {
"name": "memory_get",
"arguments": {
"key": "architecture.decision.api-versioning"
}
}
}
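The round trip can then be verified mechanically. Note that the response shape below (result.content[0].text) is an assumption based on common MCP tool results; use whatever shape your inspector actually captured:

```python
# Round-trip check: the value memory_get returns must equal what memory_store
# wrote. The content shape is an assumed MCP-style tool result.
STORED_VALUE = "Use URL-based API versioning for public endpoints."

def extract_text(response: dict) -> str:
    """Pull the text payload out of an assumed tools/call result shape."""
    return response["result"]["content"][0]["text"]

def check_round_trip(get_response: dict) -> bool:
    return extract_text(get_response) == STORED_VALUE

sample_get = {"result": {"content": [
    {"type": "text",
     "text": "Use URL-based API versioning for public endpoints."},
]}}
print(check_round_trip(sample_get))   # True
```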
Step 4: mock LLM providers during CI
Ruflo calls an LLM provider for routing and agent work. CI should not depend on a live model provider for every commit.
Use Apidog to mock the provider endpoint with stable responses. Then point Ruflo’s provider config at the mock during tests.
This gives you:
- Repeatable CI behavior
- No token usage during contract tests
- Faster test runs
- Easier failure debugging
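In practice Apidog can serve the mock directly; if you prefer a code stub, a deterministic canned response is enough. The response shape below loosely follows a messages-style completion API and is an assumption — match it to whichever provider your Ruflo config actually points at:

```python
# Deterministic stand-in for a live model call during contract tests.
# Serve this from any HTTP stub (or Apidog's mock server) and point
# Ruflo's provider config at it.
def canned_completion(prompt: str) -> dict:
    return {
        "id": "msg_mock_001",
        "role": "assistant",
        "content": [{"type": "text", "text": f"[mock] routed: {prompt[:40]}"}],
        "stop_reason": "end_turn",
    }

print(canned_completion("Review the API module")["stop_reason"])   # end_turn
```

Because the response never changes, any failing contract test points at the MCP layer rather than at model nondeterminism.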
The same pattern is described in API testing without Postman.
Step 5: run the suite in CI
Run your Apidog test collection in CI so MCP regressions fail before they reach your team.
Example GitHub Actions structure:
name: Ruflo MCP Contract Tests
on:
  pull_request:
  push:
    branches:
      - main
jobs:
  mcp-contract:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Install Node.js
        uses: actions/setup-node@v4
        with:
          node-version: 20
      - name: Initialize Ruflo
        run: npx ruvflo init
      - name: Run Apidog tests
        run: apidog run
Adjust the runner command to match your Apidog workspace and authentication setup.
Where Apidog fits in the daily Ruflo loop
Apidog is useful beyond CI in three common debugging workflows.
When a swarm misbehaves
Replay the exact tools/call sequence Claude Code sent to Ruflo.
Compare it with a known-good run. The diff often shows:
- A changed tool argument
- A prompt template drift
- A missing memory value
- A tool schema change
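A quick local way to produce that diff is to normalize both captured frames and compare them line by line:

```python
# Diff a failing tools/call frame against a known-good capture. Keys are
# sorted so the diff is stable regardless of serialization order.
import difflib
import json

def diff_frames(good: dict, bad: dict) -> list:
    a = json.dumps(good, indent=2, sort_keys=True).splitlines()
    b = json.dumps(bad, indent=2, sort_keys=True).splitlines()
    return list(difflib.unified_diff(a, b, "known-good", "failing", lineterm=""))

good = {"params": {"name": "swarm_init",
                   "arguments": {"task": "review auth module"}}}
bad = {"params": {"name": "swarm_init",
                  "arguments": {"task": "review auth"}}}
print("\n".join(diff_frames(good, bad)))
```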
When you upgrade Ruflo
Before adopting a new Ruflo release:
- Run your Apidog MCP suite.
- Compare tools/list against the previous version.
- Identify renamed, removed, or changed tools.
- Update agent prompts or test expectations.
This is the same workflow used for API contract diffs in contract-first API development.
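Comparing tools/list across versions reduces to set arithmetic over tool names:

```python
# Compare tool names between two captured tools/list responses so renamed or
# removed tools surface before an upgrade lands.
def tool_names(tools_list_response: dict) -> set:
    return {t["name"] for t in tools_list_response["result"]["tools"]}

def diff_tool_sets(old: dict, new: dict) -> dict:
    before, after = tool_names(old), tool_names(new)
    return {"removed": sorted(before - after), "added": sorted(after - before)}

# Illustrative captures -- the "memory_put" rename here is hypothetical.
v1 = {"result": {"tools": [{"name": "swarm_init"}, {"name": "memory_store"}]}}
v2 = {"result": {"tools": [{"name": "swarm_init"}, {"name": "memory_put"}]}}
print(diff_tool_sets(v1, v2))
```

Anything in "removed" means agent prompts or test expectations referencing that tool need updating before you adopt the release.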
When federation flakes
Federated agents communicate across machines. Debugging failures without request visibility is difficult.
Point Apidog at the local proxy port and record the traffic. Then inspect:
- Handshake failures
- Unexpected payload shape
- Missing auth or encryption metadata
- Incorrect destination agent
Common pitfalls
Installing the plugin path and expecting the full runtime
The plugin path gives you slash commands and agent definitions. It does not give you the full MCP runtime.
If swarm_init is not callable from Claude Code, run:
npx ruvflo init
Skipping or overriding hooks
The full install uses hooks to route tasks automatically. If you remove or override them, the router may never run.
Keep the default hooks until you have a clear reason to customize them.
Letting memory grow unchecked
Persistent memory is useful, but it needs lifecycle management.
Add a retention policy for:
- Old sessions
- Temporary task memory
- Failed experiments
- Low-value generated context
If memory queries become slow, inspect the backing store and consider moving from the default local setup to a more scalable backend supported by your Ruflo configuration.
Treating Ruflo as Claude-only
Ruflo started in the Claude Code ecosystem, but it is provider-agnostic. Configure the provider that fits your workflow.
For related provider setup patterns, see the DeepSeek V4 API guide and the best local LLMs of 2026.
Forgetting that federation crosses trust boundaries
Federation can send payloads to another machine. Those payloads may include code, prompts, metadata, or task context.
Before enabling federation, define:
- Which projects can federate
- Which machines are trusted
- Which data must be scrubbed
- Who reviews audit logs
- How credentials and secrets are excluded
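As one illustration of scrubbing, a regex pass over outbound payloads can redact secret-shaped strings before they leave the machine. The patterns here are examples, not a complete secret detector:

```python
# Example payload scrub before federation. The patterns are illustrative;
# extend them to match your own key and token formats.
import re

SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9]{8,}"),              # API-key-shaped strings
    re.compile(r"(?i)(password|token)\s*[:=]\s*\S+"),
]

def scrub(payload: str) -> str:
    """Replace anything secret-shaped with a redaction marker."""
    for pattern in SECRET_PATTERNS:
        payload = pattern.sub("[REDACTED]", payload)
    return payload

print(scrub("token: abc123 and sk-deadbeef1234"))
```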
Ruflo vs other agent frameworks
LangGraph
LangGraph is lower-level and more generic. You build the orchestration yourself.
Pick LangGraph when:
- You need full control over the graph.
- Your workflow is not centered on Claude Code.
- You are comfortable building more orchestration logic.
See the related TradingAgents post for another multi-agent workflow.
CrewAI
CrewAI is framework-agnostic and configuration-heavy compared with Ruflo.
Pick CrewAI when:
- Python is your primary environment.
- You are not building around Claude Code.
- You want a standalone multi-agent framework.
Manual MCP server stacks
You can manually wire several MCP servers together. This is fine for small setups.
It gets harder when you need:
- Shared memory
- Tool routing
- Multi-agent coordination
- Federation
- Repeatable agent roles
Ruflo’s niche is specific: Claude Code with swarm coordination.
Performance and scale notes
Swarm startup has overhead. For short tasks, routing into a swarm can cost more than it saves.
Good candidates for single-agent mode:
- One-line edits
- Small formatting changes
- Simple file lookups
- Direct questions
Good candidates for swarm mode:
- Refactors
- Security reviews
- Test strategy
- Cross-module debugging
- Documentation plus implementation work
Memory also needs attention as usage grows. If queries slow down, review:
- Store size
- Retention policy
- Indexing
- Backend choice
- Whether semantic search is needed
Real-world use cases
Platform security review
A platform team can run a security-review swarm on one repository while a refactoring swarm works on another. Shared memory lets both workflows surface conflicting recommendations to a human reviewer.
Ticket queue automation
A solo developer can use autopilot mode with a ticket queue:
1. Pick a P3 ticket.
2. Check out the code.
3. Propose a fix.
4. Open a PR.
5. Move to the next ticket.
The developer reviews the results later instead of driving every step manually.
Multi-repo PR review
A research or engineering group can use a multi-agent review pattern across several repositories:
- One agent reviews correctness.
- One agent reviews tests.
- One agent reviews maintainability.
- One agent summarizes risk.
Implementation checklist
Use this checklist for a safe rollout.
[ ] Create a scratch project.
[ ] Run npx ruvflo init.
[ ] Confirm Claude Code can see Ruflo MCP tools.
[ ] Capture initialize and tools/list frames.
[ ] Capture swarm_init, memory_store, and memory_get calls.
[ ] Save requests in Apidog.
[ ] Add JSONPath assertions.
[ ] Mock the LLM provider for CI.
[ ] Add the Apidog runner to CI.
[ ] Define memory retention.
[ ] Define federation policy before enabling cross-machine workflows.
Conclusion
Ruflo answers a specific scaling problem: how to move Claude Code beyond one agent at a time.
The full CLI install adds:
- Swarm coordination
- Persistent memory
- Hooks
- MCP tools
- Federation support
- Plugin-based capabilities
The most important implementation detail is the MCP server. It is the contract between Claude Code and Ruflo, so test it like any other JSON-RPC API.
Next step:
npx ruvflo init
Run it in a scratch project, capture the MCP frames in Claude Code’s inspector, and save them in an Apidog project. Once the contract tests pass locally, wire them into CI.
FAQ
Is Ruflo the same as claude-flow?
Yes. Ruflo is the renamed claude-flow project maintained by rUv. The npm package is ruvflo, and the GitHub repository is ruvnet/ruflo.
Do I need both the plugin and the CLI install?
No. Pick one.
Use the plugin path for slash commands and lightweight evaluation. Use the CLI install for the full coordination layer.
Can I use Ruflo without Claude?
Yes. Ruflo is provider-agnostic. Claude is the default because the project grew out of claude-flow, but provider configuration can point Ruflo at other supported models.
Where does memory live?
Memory lives in the storage backend configured for your Ruflo setup, such as local SQLite or Postgres. The optional RuVector backend adds vector search for semantic retrieval.
Memory does not go to a third-party service unless you explicitly configure it that way.
How do I test the MCP server in CI?
Capture canonical MCP requests with Claude Code’s MCP inspector, save them in Apidog, add assertions, and run the collection in CI.
The full pattern is covered in the MCP server testing playbook.
Is federation safe across organizations?
The encryption layer is only one part of the problem. You still need policy controls.
Before using federation across organizations:
- Define trusted endpoints.
- Scrub secrets from payloads.
- Restrict which projects can federate.
- Review audit logs.
- Document ownership and approval rules.
What does Ruflo cost?
The framework is MIT-licensed and free. Your main operating cost is LLM usage, plus any hosted storage or vector database you choose to run.