If you have been tracking the Claude Code ecosystem, you have probably seen Ruflo move from an interesting npm package to a coordination layer for teams running Claude Code seriously. Ruflo, maintained by rUv, grew out of the original claude-flow project. Claude Code runs one agent at a time by default; Ruflo adds orchestration so Claude Code can coordinate multiple agents as a swarm.
This guide shows what Ruflo does, when to install it, how the MCP layer works, and how to test Ruflo’s MCP traffic with Apidog. If you are new to the agent file format Claude Code reads on boot, start with the agents.md guide.
TL;DR
- Ruflo, formerly claude-flow, is a multi-agent orchestration platform for Claude Code by rUv.
- npx ruvflo init adds a coordination layer for swarms, persistent memory, hooks, MCP tooling, and federation.
- There are two install paths:
- Claude Code Plugin: lightweight slash commands and agent definitions.
- CLI install: full Ruflo runtime, MCP server, hooks, memory, and federation.
- Ruflo’s MCP server is the contract surface you should test.
- Use Apidog to capture initialize, tools/list, and tools/call requests, add assertions, mock LLM providers, and run checks in CI.
- Download Apidog if you want a contract-testing layer before Ruflo becomes part of your daily workflow.
What Ruflo actually does
Claude Code is normally a single-agent loop:
- You send a task.
- Claude edits one workspace.
- The session ends.
- Context does not automatically persist across future sessions.
That works for small tasks. It becomes harder when you want:
- A security agent, test agent, and docs agent to review the same change.
- One session’s findings to inform a later session.
- Work to be coordinated across multiple machines.
Ruflo plugs into Claude Code as an orchestration layer. After initialization, tasks can be routed to one of several execution paths:
- Run as a normal single-agent Claude Code task.
- Spawn a swarm of specialist agents.
- Resume from persistent memory.
- Federate work to another agent or machine.
The README describes Ruflo as “Claude Code with a nervous system.” That is the right mental model: Ruflo does not replace Claude Code. It adds routing, memory, and coordination around it.
Ruflo architecture
The simplified flow from the README is:
User -> Ruflo (CLI/MCP) -> Router -> Swarm -> Agents -> Memory -> LLM Providers
                             ^                                          |
                             +----------- Learning Loop <---------------+
For implementation and testing, focus on these components.
CLI and MCP entry points
You can drive Ruflo from the CLI or through Claude Code’s MCP integration. Both surfaces eventually exercise the same underlying tool calls.
Router
The router decides how a task should run:
- Single agent
- Swarm
- Resume from memory
- Federated execution
This is the component to inspect when simple tasks are being over-orchestrated or complex tasks are not being split into agents.
Swarm
A swarm is a set of specialist agents with focused prompts and tool access. For example, a code-review swarm might include:
- Security reviewer
- Performance reviewer
- Test reviewer
- Documentation reviewer
- Synthesizer agent
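The roles above can be written down as plain role definitions. The field names here are hypothetical, not Ruflo's actual agent schema:

```python
# Illustrative role definitions for a code-review swarm. Field names
# ("role", "prompt", "tools") are assumptions, not Ruflo's schema.
REVIEW_SWARM = [
    {"role": "security",    "prompt": "Flag injection, authz, and secret-handling issues.", "tools": ["read_file", "grep"]},
    {"role": "performance", "prompt": "Flag hot paths, N+1 queries, and allocations.",      "tools": ["read_file"]},
    {"role": "tests",       "prompt": "Check coverage and propose missing cases.",          "tools": ["read_file", "run_tests"]},
    {"role": "docs",        "prompt": "Check public API docs match behavior.",              "tools": ["read_file"]},
    {"role": "synthesizer", "prompt": "Merge findings into one ranked report.",             "tools": []},
]

print([agent["role"] for agent in REVIEW_SWARM])
```

The key property is that each agent gets a focused prompt and a restricted tool set, with one synthesizer that sees everyone's output.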
Memory
Ruflo persists memory across sessions. Future agents can query that memory to reuse useful context and patterns.
LLM providers
Ruflo is provider-agnostic. Claude is the default, but other providers can be configured through the standard provider configuration.
Install paths
Ruflo has two installation paths. Pick based on how much orchestration you need.
Path A: Claude Code Plugin
Install through the Claude Code marketplace:
/plugin install ruflo-core@ruflo
This gives you:
- Slash commands
- Agent definitions
It does not register the full Ruflo MCP server. That means tools such as memory_store, swarm_init, and agent_spawn are not available to Claude Code as callable MCP tools.
Use this path when you only want to evaluate a plugin or try Ruflo commands in isolation.
Path B: CLI install
Run this in your project:
npx ruvflo init
This sets up the full runtime, including:
- .claude/ and .claude-flow/ directories
- CLAUDE.md
- Helper scripts
- MCP server registration
- Hooks
- Persistent memory
- Swarm coordination
- Federation support
After this, you use Claude Code normally. Ruflo’s hooks route tasks automatically.
For most engineering teams using Claude Code daily, the CLI install is the practical path.
What ships with Ruflo
Ruflo is organized around core primitives and plugins.
ruflo-core
The foundation layer. It provides primitives such as:
- Memory storage
- Swarm initialization
- Agent spawning
ruflo-swarm
Multi-agent coordination with role specialization.
Example use case:
Run a code-review swarm with:
- security reviewer
- performance reviewer
- docs reviewer
- test reviewer
- final synthesizer
ruflo-autopilot
Long-running task automation. You give Ruflo a goal, and it iterates with checkpoints.
ruflo-federation
Agent-to-agent communication across machines. Use this carefully because federation can cross trust boundaries.
RuVector
RuVector is the vector store and graph backend used by the memory layer. It becomes more useful as your accumulated session context grows.
The plugin marketplace also includes packs for testing, security, refactoring, and observability. The pattern is consistent: each plugin adds a focused capability on top of the core memory and swarm primitives.
Why the MCP layer matters
Ruflo’s MCP server connects the framework to Claude Code’s runtime.
Every important operation becomes a JSON-RPC call against the local MCP server, including:
- Tool discovery
- Swarm creation
- Agent spawning
- Memory reads and writes
- Federated handoffs
That makes the MCP API the contract surface.
If tools/list breaks, Claude Code may stop seeing Ruflo’s tools. If memory_store returns the wrong shape, agents may retrieve incorrect or unusable context.
This is the same testing problem covered in the MCP server testing playbook. Treat Ruflo’s MCP server like any other JSON-RPC API.
Test Ruflo’s MCP server with Apidog
Here is a practical test plan.
Step 1: initialize Ruflo in a scratch project
mkdir ruflo-mcp-test
cd ruflo-mcp-test
npx ruvflo init
Then open Claude Code with Ruflo active and run a few representative tasks:
- "Review this module for security issues."
- "Create a test plan for this endpoint."
- "Store this architectural decision for future sessions."
Use Claude Code’s MCP inspector to capture JSON-RPC frames for:
- initialize
- tools/list
- tools/call with swarm_init
- tools/call with memory_store
- tools/call with memory_get
Step 2: save the requests in Apidog
Create a new project in Apidog, set the base URL to your local Ruflo MCP server, and save each captured JSON-RPC request.
Example tools/list body:
{
"jsonrpc": "2.0",
"id": 1,
"method": "tools/list",
"params": {}
}
Example tools/call body for a swarm initialization:
{
"jsonrpc": "2.0",
"id": 2,
"method": "tools/call",
"params": {
"name": "swarm_init",
"arguments": {
"task": "Review the API module for security and test coverage"
}
}
}
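For ad-hoc replays outside Apidog, a small helper can post the same frames. This sketch assumes your local Ruflo MCP server is reachable over HTTP at MCP_URL; if your install exposes a stdio transport only, replay through Apidog or an HTTP bridge instead. The URL is a placeholder:

```python
import json
import urllib.request

MCP_URL = "http://localhost:3000/mcp"   # assumption: adjust to your install

def build_frame(method: str, params: dict, id_: int) -> dict:
    """Construct a JSON-RPC 2.0 request body matching the frames shown above."""
    return {"jsonrpc": "2.0", "id": id_, "method": method, "params": params}

def rpc(method: str, params: dict, id_: int = 1) -> dict:
    """POST one frame to the local MCP server and decode the response."""
    req = urllib.request.Request(
        MCP_URL,
        data=json.dumps(build_frame(method, params, id_)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Example: rpc("tools/list", {})
```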
Use the exact request shapes captured from your local Ruflo install. Do not hand-write request bodies from memory when the inspector gives you canonical traffic.
Step 3: add assertions
Add assertions for the key MCP responses.
For initialize, assert:
- result.serverInfo.name exists
- result.protocolVersion exists
If your team standardizes on a specific server name or protocol version, assert the exact values.
For tools/list, assert:
- result.tools is an array
- result.tools.length > 0
- each tool has name
- each tool has description
- each tool has inputSchema
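The tools/list assertions can also be mirrored as a small Python check (the JSONPath equivalents in Apidog would be $.result.tools, $.result.tools[*].name, and so on):

```python
# Contract check for a tools/list response: returns a list of human-readable
# failures; an empty list means the contract holds.
def check_tools_list(response: dict) -> list:
    failures = []
    tools = response.get("result", {}).get("tools")
    if not isinstance(tools, list):
        return ["result.tools is not an array"]
    if not tools:
        failures.append("result.tools is empty")
    for i, tool in enumerate(tools):
        for field in ("name", "description", "inputSchema"):
            if field not in tool:
                failures.append(f"tools[{i}] missing {field}")
    return failures

sample = {"result": {"tools": [
    {"name": "swarm_init", "description": "Start a swarm",
     "inputSchema": {"type": "object"}},
]}}
print(check_tools_list(sample))   # []
```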
For swarm_init, assert:
- the response is not an error
- result contains a swarm identifier or successful initialization payload
For memory_store, assert:
- the write succeeds
- the stored key can be retrieved with memory_get
- the retrieved value matches the expected content
A basic memory test flow should look like this:
{
"jsonrpc": "2.0",
"id": 3,
"method": "tools/call",
"params": {
"name": "memory_store",
"arguments": {
"key": "architecture.decision.api-versioning",
"value": "Use URL-based API versioning for public endpoints."
}
}
}
Then retrieve it:
{
"jsonrpc": "2.0",
"id": 4,
"method": "tools/call",
"params": {
"name": "memory_get",
"arguments": {
"key": "architecture.decision.api-versioning"
}
}
}
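The round trip can then be verified mechanically. Note that the response shape below (result.content[0].text) is an assumption based on common MCP tool results; use whatever shape your inspector actually captured:

```python
# Round-trip check: the value memory_get returns must equal what memory_store
# wrote. The content shape is an assumed MCP-style tool result.
STORED_VALUE = "Use URL-based API versioning for public endpoints."

def extract_text(response: dict) -> str:
    """Pull the text payload out of an assumed tools/call result shape."""
    return response["result"]["content"][0]["text"]

def check_round_trip(get_response: dict) -> bool:
    return extract_text(get_response) == STORED_VALUE

sample_get = {"result": {"content": [
    {"type": "text",
     "text": "Use URL-based API versioning for public endpoints."},
]}}
print(check_round_trip(sample_get))   # True
```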
Step 4: mock LLM providers during CI
Ruflo calls an LLM provider for routing and agent work. CI should not depend on a live model provider for every commit.
Use Apidog to mock the provider endpoint with stable responses. Then point Ruflo’s provider config at the mock during tests.
This gives you:
- Repeatable CI behavior
- No token usage during contract tests
- Faster test runs
- Easier failure debugging
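In practice Apidog can serve the mock directly; if you prefer a code stub, a deterministic canned response is enough. The response shape below loosely follows a messages-style completion API and is an assumption — match it to whichever provider your Ruflo config actually points at:

```python
# Deterministic stand-in for a live model call during contract tests.
# Serve this from any HTTP stub (or Apidog's mock server) and point
# Ruflo's provider config at it.
def canned_completion(prompt: str) -> dict:
    return {
        "id": "msg_mock_001",
        "role": "assistant",
        "content": [{"type": "text", "text": f"[mock] routed: {prompt[:40]}"}],
        "stop_reason": "end_turn",
    }

print(canned_completion("Review the API module")["stop_reason"])   # end_turn
```

Because the response never changes, any failing contract test points at the MCP layer rather than at model nondeterminism.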
The same pattern is described in API testing without Postman.
Step 5: run the suite in CI
Run your Apidog test collection in CI so MCP regressions fail before they reach your team.
Example GitHub Actions structure:
name: Ruflo MCP Contract Tests
on:
  pull_request:
  push:
    branches:
      - main
jobs:
  mcp-contract:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Install Node.js
        uses: actions/setup-node@v4
        with:
          node-version: 20
      - name: Initialize Ruflo
        run: npx ruvflo init
      - name: Run Apidog tests
        run: apidog run
Adjust the runner command to match your Apidog workspace and authentication setup.
Where Apidog fits in the daily Ruflo loop
Apidog is useful beyond CI in three common debugging workflows.
When a swarm misbehaves
Replay the exact tools/call sequence Claude Code sent to Ruflo.
Compare it with a known-good run. The diff often shows:
- A changed tool argument
- A prompt template drift
- A missing memory value
- A tool schema change
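A quick local way to produce that diff is to normalize both captured frames and compare them line by line:

```python
# Diff a failing tools/call frame against a known-good capture. Keys are
# sorted so the diff is stable regardless of serialization order.
import difflib
import json

def diff_frames(good: dict, bad: dict) -> list:
    a = json.dumps(good, indent=2, sort_keys=True).splitlines()
    b = json.dumps(bad, indent=2, sort_keys=True).splitlines()
    return list(difflib.unified_diff(a, b, "known-good", "failing", lineterm=""))

good = {"params": {"name": "swarm_init",
                   "arguments": {"task": "review auth module"}}}
bad = {"params": {"name": "swarm_init",
                  "arguments": {"task": "review auth"}}}
print("\n".join(diff_frames(good, bad)))
```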
When you upgrade Ruflo
Before adopting a new Ruflo release:
- Run your Apidog MCP suite.
- Compare tools/list against the previous version.
- Identify renamed, removed, or changed tools.
- Update agent prompts or test expectations.
This is the same workflow used for API contract diffs in contract-first API development.
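Comparing tools/list across versions reduces to set arithmetic over tool names:

```python
# Compare tool names between two captured tools/list responses so renamed or
# removed tools surface before an upgrade lands.
def tool_names(tools_list_response: dict) -> set:
    return {t["name"] for t in tools_list_response["result"]["tools"]}

def diff_tool_sets(old: dict, new: dict) -> dict:
    before, after = tool_names(old), tool_names(new)
    return {"removed": sorted(before - after), "added": sorted(after - before)}

# Illustrative captures -- the "memory_put" rename here is hypothetical.
v1 = {"result": {"tools": [{"name": "swarm_init"}, {"name": "memory_store"}]}}
v2 = {"result": {"tools": [{"name": "swarm_init"}, {"name": "memory_put"}]}}
print(diff_tool_sets(v1, v2))
```

Anything in "removed" means agent prompts or test expectations referencing that tool need updating before you adopt the release.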
When federation flakes
Federated agents communicate across machines. Debugging failures without request visibility is difficult.
Point Apidog at the local proxy port and record the traffic. Then inspect:
- Handshake failures
- Unexpected payload shape
- Missing auth or encryption metadata
- Incorrect destination agent
Common pitfalls
Installing the plugin path and expecting the full runtime
The plugin path gives you slash commands and agent definitions. It does not give you the full MCP runtime.
If swarm_init is not callable from Claude Code, run:
npx ruvflo init
Skipping or overriding hooks
The full install uses hooks to route tasks automatically. If you remove or override them, the router may never run.
Keep the default hooks until you have a clear reason to customize them.
Letting memory grow unchecked
Persistent memory is useful, but it needs lifecycle management.
Add a retention policy for:
- Old sessions
- Temporary task memory
- Failed experiments
- Low-value generated context
If memory queries become slow, inspect the backing store and consider moving from the default local setup to a more scalable backend supported by your Ruflo configuration.
Treating Ruflo as Claude-only
Ruflo started in the Claude Code ecosystem, but it is provider-agnostic. Configure the provider that fits your workflow.
For related provider setup patterns, see the DeepSeek V4 API guide and the best local LLMs of 2026.
Forgetting that federation crosses trust boundaries
Federation can send payloads to another machine. Those payloads may include code, prompts, metadata, or task context.
Before enabling federation, define:
- Which projects can federate
- Which machines are trusted
- Which data must be scrubbed
- Who reviews audit logs
- How credentials and secrets are excluded
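As one illustration of scrubbing, a regex pass over outbound payloads can redact secret-shaped strings before they leave the machine. The patterns here are examples, not a complete secret detector:

```python
# Example payload scrub before federation. The patterns are illustrative;
# extend them to match your own key and token formats.
import re

SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9]{8,}"),              # API-key-shaped strings
    re.compile(r"(?i)(password|token)\s*[:=]\s*\S+"),
]

def scrub(payload: str) -> str:
    """Replace anything secret-shaped with a redaction marker."""
    for pattern in SECRET_PATTERNS:
        payload = pattern.sub("[REDACTED]", payload)
    return payload

print(scrub("token: abc123 and sk-deadbeef1234"))
```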
Ruflo vs other agent frameworks
LangGraph
LangGraph is lower-level and more generic. You build the orchestration yourself.
Pick LangGraph when:
- You need full control over the graph.
- Your workflow is not centered on Claude Code.
- You are comfortable building more orchestration logic.
See the related TradingAgents post for another multi-agent workflow.
CrewAI
CrewAI is framework-agnostic and configuration-heavy compared with Ruflo.
Pick CrewAI when:
- Python is your primary environment.
- You are not building around Claude Code.
- You want a standalone multi-agent framework.
Manual MCP server stacks
You can manually wire several MCP servers together. This is fine for small setups.
It gets harder when you need:
- Shared memory
- Tool routing
- Multi-agent coordination
- Federation
- Repeatable agent roles
Ruflo’s niche is specific: Claude Code with swarm coordination.
Performance and scale notes
Swarm startup has overhead. For short tasks, routing into a swarm can cost more than it saves.
Good candidates for single-agent mode:
- One-line edits
- Small formatting changes
- Simple file lookups
- Direct questions
Good candidates for swarm mode:
- Refactors
- Security reviews
- Test strategy
- Cross-module debugging
- Documentation plus implementation work
Memory also needs attention as usage grows. If queries slow down, review:
- Store size
- Retention policy
- Indexing
- Backend choice
- Whether semantic search is needed
Real-world use cases
Platform security review
A platform team can run a security-review swarm on one repository while a refactoring swarm works on another. Shared memory lets both workflows surface conflicting recommendations to a human reviewer.
Ticket queue automation
A solo developer can use autopilot mode with a ticket queue:
1. Pick a P3 ticket.
2. Check out the code.
3. Propose a fix.
4. Open a PR.
5. Move to the next ticket.
The developer reviews the results later instead of driving every step manually.
Multi-repo PR review
A research or engineering group can use a multi-agent review pattern across several repositories:
- One agent reviews correctness.
- One agent reviews tests.
- One agent reviews maintainability.
- One agent summarizes risk.
Implementation checklist
Use this checklist for a safe rollout.
[ ] Create a scratch project.
[ ] Run npx ruvflo init.
[ ] Confirm Claude Code can see Ruflo MCP tools.
[ ] Capture initialize and tools/list frames.
[ ] Capture swarm_init, memory_store, and memory_get calls.
[ ] Save requests in Apidog.
[ ] Add JSONPath assertions.
[ ] Mock the LLM provider for CI.
[ ] Add the Apidog runner to CI.
[ ] Define memory retention.
[ ] Define federation policy before enabling cross-machine workflows.
Conclusion
Ruflo answers a specific scaling problem: how to move Claude Code beyond one agent at a time.
The full CLI install adds:
- Swarm coordination
- Persistent memory
- Hooks
- MCP tools
- Federation support
- Plugin-based capabilities
The most important implementation detail is the MCP server. It is the contract between Claude Code and Ruflo, so test it like any other JSON-RPC API.
Next step:
npx ruvflo init
Run it in a scratch project, capture the MCP frames in Claude Code’s inspector, and save them in an Apidog project. Once the contract tests pass locally, wire them into CI.
FAQ
Is Ruflo the same as claude-flow?
Yes. Ruflo is the renamed claude-flow project maintained by rUv. The npm package is ruvflo, and the GitHub repository is ruvnet/ruflo.
Do I need both the plugin and the CLI install?
No. Pick one.
Use the plugin path for slash commands and lightweight evaluation. Use the CLI install for the full coordination layer.
Can I use Ruflo without Claude?
Yes. Ruflo is provider-agnostic. Claude is the default because the project grew out of claude-flow, but provider configuration can point Ruflo at other supported models.
Where does memory live?
Memory lives in the storage backend configured for your Ruflo setup, such as local SQLite or Postgres. The optional RuVector backend adds vector search for semantic retrieval.
Memory does not go to a third-party service unless you explicitly configure it that way.
How do I test the MCP server in CI?
Capture canonical MCP requests with Claude Code’s MCP inspector, save them in Apidog, add assertions, and run the collection in CI.
The full pattern is covered in the MCP server testing playbook.
Is federation safe across organizations?
The encryption layer is only one part of the problem. You still need policy controls.
Before using federation across organizations:
- Define trusted endpoints.
- Scrub secrets from payloads.
- Restrict which projects can federate.
- Review audit logs.
- Document ownership and approval rules.
What does Ruflo cost?
The framework is MIT-licensed and free. Your main operating cost is LLM usage, plus any hosted storage or vector database you choose to run.