This is a submission for the OpenClaw Challenge — OpenClaw in Action.
I Built a Multi-Agent Research Pipeline That Catches AI Confabulation Before It Reaches My Users
LLMs are great at sounding confident. That's the problem.
An LLM will tell you that commit a3f9b2c added user authentication last Tuesday, that the /api/v2/users endpoint returns 200 OK, and that your Pro subscription is $19/month — all with complete certainty, all potentially wrong. This is confabulation: the model generating plausible-sounding text that fills gaps in its knowledge, delivered with full confidence.
In production AI systems, this erodes user trust, breaks integrations, and sends people down blind alleys. I built a system to catch it before it reaches anyone. Here's what I built and how OpenClaw powers it.
What I Built
A multi-agent research pipeline where findings go through three rounds before reaching the user:
- Gap dig — parallel agents investigate specific knowledge gaps
- Consensus vote — three agents (Scout, Auditor, Dev) vote on each finding
- Validation — challenged findings get tested against the real environment
The system is orchestrated by a Research Orchestrator that manages phase transitions, coordinates agent spawning, and synthesizes final output. It's built entirely on OpenClaw with FastMCP servers and OpenClaw's native multi-agent spawning.
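Stripped of the agent machinery, the orchestrator is a phase machine: run each phase in order, feed each one the previous phase's output. A minimal sketch, where the phase names, `run_pipeline`, and the handler callables are all illustrative stand-ins rather than the actual research_orchestrator.py:

```python
# Phases of the pipeline, in the order the orchestrator runs them
PHASES = ["orientation", "gap_dig", "consensus", "validation", "synthesis"]

def run_pipeline(topic, handlers):
    """Run each phase in order; each handler receives the accumulated
    state dict and returns an updated one for the next phase."""
    state = {"topic": topic}
    for phase in PHASES:
        state = handlers[phase](state)
    return state
```

In the real system each handler spawns or coordinates agents; here they are just functions, which keeps the control flow easy to test in isolation.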
How I Used OpenClaw
Multi-Agent Spawning
OpenClaw can spawn sub-agents with custom prompts and session management. The Research Orchestrator uses this to launch parallel gap-dig agents:
    from agents.personas import get_persona, get_spawn_prompt

    # Build a gap-dig agent prompt with persona + memory
    prompt = get_spawn_prompt(
        agent_type="research",
        task=f"Investigate this specific gap: {gap}",
        context=loaded_memory,
    )

    # Spawn it as a sub-agent, get results back
    result = sessions_spawn(
        task=prompt,
        mode="run",
        timeoutSeconds=300,
    )
Each sub-agent is scoped to its gap, outputs structured findings, and terminates. No shared state between agents — they're genuinely independent, which is what makes the consensus vote meaningful.
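The fan-out itself can be sketched with a thread pool. Here `dig_gaps` and the injected `spawn` callable are my illustrative names, not the pipeline's actual API; in the real system, OpenClaw's sessions_spawn plays the role of `spawn`:

```python
from concurrent.futures import ThreadPoolExecutor

def dig_gaps(gaps, spawn, timeout_seconds=300):
    """Fan out one sub-agent per gap and collect structured findings.

    `spawn` is whatever launches a sub-agent and returns its result;
    injecting it keeps the fan-out logic testable without a live agent."""
    def dig(gap):
        prompt = f"Investigate this specific gap: {gap}"
        return spawn(task=prompt, mode="run", timeoutSeconds=timeout_seconds)

    # One worker per gap: the agents are independent, so run them all at once
    with ThreadPoolExecutor(max_workers=len(gaps)) as pool:
        return list(pool.map(dig, gaps))
```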
FastMCP Servers
Three FastMCP servers extend OpenClaw's capabilities for the pipeline:
Consensus Server — voting and scoring:
# Three agents vote. Finding is confirmed only if consensus ≥ 0.6
submit_vote(finding_id, Vote(
agent="auditor",
vote_type=VoteType.CHALLENGE,
confidence=0.75,
reason="GitHub was 3 days stale; local git disagreed"
))
Validation Server — reality testing:
# Test git claims against actual repo state
# Test API claims against live endpoints
# Test URL claims with actual HTTP requests
run_validation(finding_id, environment="local_api")
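Under the hood, reality testing is mostly a dispatch table from claim type to a concrete check. A sketch under that assumption, with hypothetical checker names (the real validation_server.py exposes this through FastMCP and covers more claim types):

```python
import subprocess
import urllib.request
from urllib.error import HTTPError, URLError

def check_url(url, timeout=10):
    """URL claim: issue a real HTTP request and report what came back."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return {"ok": True, "status": resp.status}
    except HTTPError as e:
        return {"ok": False, "status": e.code}   # e.g. 404: finding fails
    except URLError as e:
        return {"ok": False, "error": str(e.reason)}

def check_git_commit(sha, repo="."):
    """Git claim: does the commit actually exist in the local repo?"""
    result = subprocess.run(
        ["git", "-C", repo, "cat-file", "-e", f"{sha}^{{commit}}"],
        capture_output=True,
    )
    return {"ok": result.returncode == 0}

CHECKERS = {"url": check_url, "git_commit": check_git_commit}

def run_validation(claim_type, target, **kwargs):
    """Route a challenged claim to the checker for its type."""
    checker = CHECKERS.get(claim_type)
    if checker is None:
        return {"ok": False, "error": f"no checker for {claim_type!r}"}
    return checker(target, **kwargs)
```

The key property is that every checker touches the real environment (the network, the repo) rather than asking another model for its opinion.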
Calendar + Git Tools — support infrastructure for agent coordination.
These are registered as MCP tool servers in OpenClaw's gateway config. The agent calls them via the standard MCP interface — no custom wiring needed.
Agent Personas with Memory Compounding
Each agent role (Scout, Auditor, Dev, Writer) has:
- A persona file — thinking style, default questions, voice
- A memory file — accumulates experience across sessions
    # Persona defines how the agent approaches a task
    class ResearchAgent:
        thinking_style = "investigative"  # Asks "what's actually here?"
        default_questions = [
            "What's the specific gap no one talks about?",
            "What's the evidence for this claim?",
        ]
        voice = "Found something real: ..."

    # Memory compounds across sessions
    # Every confirmed finding gets written to memory/agents/research-agent.md
Over time, each persona deepens in its domain. Scout gets better at finding gaps. Auditor gets sharper at spotting weak evidence. The memory system is our own implementation — SQLite-backed with read/write/search/compact tools via FastMCP.
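A minimal version of that store fits in a few lines of sqlite3. The `AgentMemory` class and its schema below are illustrative, not the actual agent_memory_mcp.py, but they show the read/write/search/compact shape:

```python
import sqlite3

class AgentMemory:
    """Sketch of a SQLite-backed per-agent memory store."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS memory ("
            " agent TEXT, entry TEXT,"
            " ts DATETIME DEFAULT CURRENT_TIMESTAMP)"
        )

    def write(self, agent, entry):
        """Append a confirmed finding to this agent's memory."""
        self.db.execute("INSERT INTO memory (agent, entry) VALUES (?, ?)",
                        (agent, entry))
        self.db.commit()

    def search(self, agent, term):
        """Substring search over one agent's memory, oldest first."""
        rows = self.db.execute(
            "SELECT entry FROM memory WHERE agent = ? AND entry LIKE ?"
            " ORDER BY ts, rowid", (agent, f"%{term}%"))
        return [r[0] for r in rows]

    def compact(self, agent, keep_last=100):
        """Drop everything but the newest `keep_last` rows for this agent."""
        self.db.execute(
            "DELETE FROM memory WHERE agent = ? AND rowid NOT IN ("
            " SELECT rowid FROM memory WHERE agent = ?"
            " ORDER BY ts DESC, rowid DESC LIMIT ?)",
            (agent, agent, keep_last))
        self.db.commit()
```

Wrapping these four methods as FastMCP tools is what lets any agent in the pipeline read and extend its own memory.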
Cron-Driven Automation
The pipeline runs on a schedule. Nightly research cycles run autonomously, with findings staged for morning review:
# Cron: every weekday at 8 AM ET
0 8 * * 1-5 research-orchestrator --topic=$(cat ~/.research/today_topic)
Failed cycles self-repair via a cron health monitor. If a job times out or drifts from its session, the health system detects and fixes it automatically.
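One simple way to implement that detection is a heartbeat file: the pipeline touches it as it runs, and the health monitor flags the job as stuck once the heartbeat goes stale. A sketch under that assumption (the function names and threshold are mine, not the actual monitor's):

```python
import time
from pathlib import Path

STALE_AFTER = 2 * 60 * 60  # a cycle silent for 2 hours is considered stuck

def heartbeat(path):
    """Each pipeline phase touches this file as it completes."""
    Path(path).write_text(str(time.time()))

def is_stale(path, now=None, stale_after=STALE_AFTER):
    """True if the job never started or stopped heartbeating in time."""
    p = Path(path)
    if not p.exists():
        return True
    now = time.time() if now is None else now
    return now - float(p.read_text()) > stale_after
```

A second cron entry runs the staleness check and restarts or re-attaches the orchestrator when it returns true.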
Demo
Here's what the system actually outputs. For a research task on "x402 ecosystem readiness":
Phase 1 — Orientation produced 5 specific gaps:
- What x402 endpoints are actually deployed and in use?
- What does the auth model look like in practice?
- What's the real revenue potential for a new endpoint?
- What are the failure modes in token refresh?
- Is the developer ecosystem mature enough to build on?
Phase 2 — Gap Dig ran 5 parallel agents, one per gap.
Phase 3 — Consensus voted on 8 findings:
Finding: "x402 wallet address xyz has received 0 transactions"
- Scout: CONFIRM (confidence 0.7) — "Confirmed on-chain"
- Auditor: CONFIRM (confidence 0.85) — "Direct observation"
- Dev: CHALLENGE (confidence 0.6) — "Wallet address may be wrong"
→ Consensus: 0.32 (challenged) → Sent to Validation
Phase 4 — Validation tested the wallet address:
$ curl https://api.x402.org/wallet/xyz
→ 404 Not Found (wallet not found)
→ Validation: FAIL — finding is wrong
The finding that looked most confirmed got rejected by validation. This is the system working correctly.
What I Learned
Distributed skepticism beats a single validator
Adding a validator (one more LLM call) just doubles the confabulation risk. Distributed skepticism — three agents with genuinely different roles, looking at the same claim from different angles — surfaces the uncertainty that single-model confidence hides.
The architecture matters more than the model
The quality of the output comes from the phase structure (survey → dig → vote → validate → synthesize), not from which LLM powers each agent. We run on MiniMax-M2.7 for speed and cost. The architecture is the product.
OpenClaw makes multi-agent practical
The hard parts of multi-agent — session management, memory across agents, tool sharing via MCP, cron-driven automation — are all handled by OpenClaw's infrastructure. The Research Orchestrator just coordinates. This makes it practical to run multi-agent systems that would otherwise require significant custom infrastructure.
Named entity preservation is still hard
TurboQuant handles context window compression well, but named entities (commit hashes, wallet addresses, API endpoints) get lost in extractive summarization. For research that relies on specific facts, this matters. We're evaluating LLM-backed compaction via Mnemo Cortex to handle this better.
Source Code
- agents/servers/research_orchestrator.py — pipeline conductor
- agents/servers/consensus_server.py — voting system
- agents/servers/validation_server.py — reality testing
- servers/agent_memory_mcp.py — SQLite-backed agent memory
- agents/personas/ — Scout, Auditor, Dev, Writer persona definitions
All registered as FastMCP servers in OpenClaw. Runs on a cron schedule. Self-healing via cron health monitor.
No video demo — but the system runs every day on actual research tasks. Check the commit history for the full implementation.