🦆 Stop Copy-Pasting Between AI Tabs — Use MCP Rubber Duck Instead

You're debugging a nasty async bug. You explain it to an LLM. It gives you a confident answer.

But is it the right answer?

You could paste the same question into GPT, Gemini, and Llama to triple-check. But now you're juggling tabs, reformatting prompts, and losing context.

What if one command could ask them all — and show you where they agree and where they don't?


Different LLMs behave differently. GPT-5.1 might reach for one pattern, Gemini for another, and Llama for something else entirely. Sometimes the "wrong" model spots what the "right" one glossed over.

That's rubber-duck debugging meets AI — except now the duck talks back, and you get a whole panel of ducks instead of just one.

What is MCP Rubber Duck?

MCP Rubber Duck is an open-source server that lets you query multiple LLMs at once through a single interface. Instead of switching between ChatGPT, Claude, and Gemini tabs, you send one prompt and see them all respond side-by-side.

(MCP is a standard protocol that lets AI tools like Claude Desktop use external servers — think of it like plugins for your AI assistant.)

Think of it as rubber duck debugging, but your ducks are AI models that can:

  • Surface different explanations and approaches
  • Highlight trade-offs between their proposed solutions
  • Rank candidate solutions with confidence scores
  • Refine answers over multiple rounds based on feedback from all models

Why This Actually Helps

We've all copy-pasted an answer from one model into another tab to "double-check" it. That's not overkill — it's rational. Different models have different training data, biases, and strengths.

The problem? It's tedious. You're juggling tabs, retyping prompts, and manually eyeballing differences between answers.

MCP Rubber Duck turns that copy-paste routine into a single command, with all the responses in one place.

Features That Matter

🦆 Duck Council

Get responses from all your configured ducks in a single call. Ask once, compare multiple perspectives side-by-side.

await duck_council({
  prompt: "Should I use REST or GraphQL for my API?"
});

Perfect for: architecture decisions, library comparisons, sanity-checking high-impact choices.
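
If you're driving the tools from a script rather than chatting through Claude, reading the result might look roughly like this. It's a sketch in the same illustrative style as the snippet above; the response shape (an array of per-duck answers with duck and response fields) is an assumption, so check the docs for the real one.

// Sketch only: assumes duck_council resolves to an array of
// { duck, response } objects. The actual shape may differ.
const council = await duck_council({
  prompt: "Should I use REST or GraphQL for my API?"
});
for (const { duck, response } of council) {
  console.log(`🦆 ${duck}: ${response}`);
}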

πŸ—³οΈ Multi-Duck Voting

Have your ducks vote on concrete options, with reasoning and confidence scores. Ideal when you're choosing between stacks, databases, or deployment strategies.

await duck_vote({
  question: "Best database for real-time chat?",
  options: ["PostgreSQL", "MongoDB", "Redis", "Cassandra"]
});

Example output:

πŸ—³οΈ DUCK VOTE RESULTS
━━━━━━━━━━━━━━━━━━━━━

πŸ“Š Vote Tally:
  Redis:      β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ 2 votes (67%)
  PostgreSQL: β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ 1 vote (33%)

πŸ† Winner: Redis (Majority Consensus)

πŸ¦† GPT Duck: "Redis for pub/sub and sub-ms latency..."
πŸ¦† Gemini Duck: "Redis is purpose-built for real-time..."
πŸ¦† Groq Duck: "PostgreSQL with LISTEN/NOTIFY could work..."

βš–οΈ LLM-as-Judge

Let one duck act as judge: it scores, ranks, and critiques the other ducks' answers against your criteria (correctness, depth, clarity, etc.).

const evaluation = await duck_judge({
  responses: councilResponses,  // from duck_council
  criteria: ["accuracy", "completeness", "practicality"],
  persona: "senior backend engineer"
});
// Returns ranked responses with scores and critique

Use it to automatically pick the strongest response, or get structured peer review on code, designs, and specs.
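
For example, picking the judge's top-ranked answer might look like this (again a sketch; the ranked, score, response, and critique field names are assumptions, not documented API):

// Sketch: grab the highest-ranked answer from the judge's evaluation.
// Field names (ranked, score, response, critique) are assumptions.
const [best] = evaluation.ranked;
console.log(`Top answer (score: ${best.score}): ${best.response}`);
console.log(`Judge's notes: ${best.critique}`);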

🔄 Iterative Refinement

Run an automatic critique-and-revise loop between two ducks: one attacks the draft with detailed feedback, the other rewrites it. You control the number of rounds.

const polished = await duck_iterate({
  prompt: "Write a migration plan for Postgres to MongoDB",
  iterations: 3,
  mode: "critique-improve"
});

Great for turning rough notes into solid design docs, or iterating on prompts until they stop sucking.
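
Pulling out the result could look like this (sketch; the final and history field names are assumptions):

// Sketch: print the final draft and how many critique rounds ran.
// Field names (final, history) are assumptions.
console.log(polished.final);
console.log(`Critique rounds: ${polished.history.length}`);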

🎓 Structured Debates

Run Oxford-style, Socratic, or adversarial debates between your ducks on any technical question. You define the motion, rules, and number of rounds.

const debate = await duck_debate({
  topic: "Microservices vs monolith for a 5-person startup",
  format: "oxford",  // or "socratic", "adversarial"
  rounds: 3
});
// Returns full transcript + summary of agreements, disagreements, and trade-offs

Use it to stress-test your architecture or validate trade-offs before you commit. It sounds silly until you watch GPT and Gemini argue about whether you really need Kubernetes.
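
To skim the outcome without reading the whole transcript, something like this works (sketch; the summary, transcript, duck, and argument field names are assumptions):

// Sketch: print the debate summary, then each duck's turn.
// Field names (summary, transcript, duck, argument) are assumptions.
console.log(debate.summary);
for (const turn of debate.transcript) {
  console.log(`${turn.duck}: ${turn.argument}`);
}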


See It In Action: A Real Example

Let's say you're choosing a database for a real-time chat feature. Here's the full workflow:

Step 1: Ask the council

const council = await duck_council({
  prompt: "Best database for real-time chat with 10K concurrent users?"
});

Step 2: Get multiple perspectives

🦆 GPT Duck: "Redis is ideal — sub-millisecond latency, built-in pub/sub..."
🦆 Gemini Duck: "Redis for real-time, but consider Postgres for persistence..."
🦆 Groq Duck: "ScyllaDB if you need both speed and durability at scale..."

Step 3: Let them vote

const vote = await duck_vote({
  question: "Best primary database for real-time chat?",
  options: ["Redis", "PostgreSQL", "ScyllaDB", "MongoDB"]
});

Step 4: See the consensus

πŸ† Winner: Redis (67% majority)
πŸ“Š GPT Duck: Redis (confidence: 85%) β€” "Pub/sub is purpose-built for this"
πŸ“Š Gemini Duck: Redis (confidence: 78%) β€” "Latency requirements favor Redis"
πŸ“Š Groq Duck: ScyllaDB (confidence: 65%) β€” "Better durability trade-off"

Your decision: Redis for the real-time layer, with the Groq Duck's point about durability noted for your persistence strategy.

This took 30 seconds instead of 30 minutes of tab-switching.
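
If you want a structured tiebreaker on top of the vote, you can chain the judge onto the council responses from Step 1 (a sketch; the criteria are just example values and the ranked field name is an assumption):

// Sketch: have one duck rank the council's answers from Step 1.
// The ranked field name is an assumption; criteria are example values.
const verdict = await duck_judge({
  responses: council,
  criteria: ["latency", "durability", "operational overhead"],
  persona: "senior database engineer"
});
console.log(verdict.ranked[0]);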


Setup

Prerequisites:

  • Node.js 18+ (node -v to check)
  • Claude Desktop installed (download)
  • At least one LLM API key (OpenAI, Gemini, etc.)

Step 1: Install globally

npm install -g mcp-rubber-duck

Step 2: Find your Claude Desktop config

Open claude_desktop_config.json:

  • macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
  • Windows: %APPDATA%\Claude\claude_desktop_config.json

Step 3: Add the Rubber Duck server

{
  "mcpServers": {
    "rubber-duck": {
      "command": "mcp-rubber-duck",
      "env": {
        "MCP_SERVER": "true",
        "OPENAI_API_KEY": "your-openai-key",
        "GEMINI_API_KEY": "your-gemini-key"
      }
    }
  }
}

You can use just one provider — omit any keys you don't have.

Step 4: Restart and test

  1. Restart Claude Desktop
  2. Look for "rubber-duck" in the MCP servers list
  3. Try: "Ask my duck panel: what's the best way to handle errors in async JavaScript?"

If it doesn't appear, check the JSON syntax and run mcp-rubber-duck in a terminal to see errors.


Advanced: MCP Bridge (Optional)

Once your basic duck is working, you can give it superpowers by connecting to other MCP servers (documentation search, filesystem access, databases).

For example, with a docs server connected:

You: "Find React hooks docs and summarize the key patterns."

Duck: fetches 5,000 tokens of docs, returns 500 tokens of essentials

This keeps your context window clear and costs down.

Quick setup: Add to your env variables:

"MCP_BRIDGE_ENABLED": "true",
"MCP_SERVER_CONTEXT7_URL": "https://mcp.context7.com/mcp"

See the full MCP Bridge docs for details.


Built-in Safety

MCP Rubber Duck includes guardrails so you don't accidentally blow your budget:

  • Rate limiting — cap requests per model
  • Token ceilings — hard limits on prompt/response size
  • PII redaction — optional filters for emails, secrets, IDs

Configure via environment variables. See docs for details.


When to Use This

Reach for MCP Rubber Duck when:

  • You're stuck on a tricky bug and want 3-4 hypotheses side-by-side
  • You're making architecture decisions and want multiple models to critique your approach
  • You're evaluating which model works best for your specific prompts
  • You're building AI workflows that need redundancy (if Model A hallucinates, B and C save the run)

Maybe skip it if:

  • You're happy with single-model responses
  • You don't want to manage multiple API keys
  • You're on a tight budget (3 models = 3x API costs)

What's Next

The project is actively maintained. Coming soon:

  • Weighted voting — models that perform better count more
  • Domain-specific duck personas — pre-tuned for security review, code review, docs
  • Disagreement detection — alerts when your ducks strongly disagree

Track progress in GitHub Issues.


The Bottom Line

If you've ever:

  • pasted the same question into three AI tabs, or
  • argued with one model about a subtle bug for 30 minutes

...MCP Rubber Duck turns that whole dance into a single command with a duck panel.

It's free, open-source, and yes — it's got ducks.


Try It Now

npm install -g mcp-rubber-duck

⭐ Star on GitHub — helps more devs find it

📖 Read the docs — full setup and API reference


Your Turn

I'd love to hear:

  • Which models are in your duck panel?
  • What's the wildest disagreement your ducks have had?

Drop your setup or favorite prompt in the comments β€” I might feature the best ones in a follow-up post!
