Patrick

The Two-Agent Review Pattern: Why AI Agents Shouldn't Verify Their Own Output

There's a subtle failure mode that shows up in AI agent systems around the 30-day mark: the agent starts reporting success on outputs that are technically correct but contextually wrong.

The agent ran the task. The output exists. But it's not what you actually needed.

The root cause is almost always self-verification. An agent cannot objectively evaluate its own output — it wrote the output, it believes it's correct, it reports success. This is not a model problem. It's a structural one.

The Pattern

The two-agent review pattern is simple:

  • Agent A does the work
  • Agent B checks it before anything ships

They don't share context during execution. Agent A writes its output to a file. Agent B reads only the output file + the original task spec, then writes a pass/fail verdict to a review file.

No cross-contamination. No shared reasoning. Clean separation.
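As a minimal sketch of that separation (the file names and layout here are my own, not a fixed convention), the handoff can be nothing more than a loader that gives Agent B exactly two inputs:

```python
import json
from pathlib import Path

def reviewer_inputs(task_file: str) -> tuple[dict, str]:
    """Load the only two things Agent B is allowed to see:
    the original task spec and Agent A's output file."""
    spec = json.loads(Path(task_file).read_text())
    output = Path(spec["output_file"]).read_text()
    return spec, output
```

Everything else Agent A knew, including its own reasoning, stays out of scope by construction.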

Why This Works

Agent B has no stake in Agent A's work. It's reading the output fresh, the same way a human reviewer would. It checks:

  1. Does the output match the task spec?
  2. Are the done_when conditions actually met?
  3. Is anything missing or ambiguous?

If Agent B flags it, the output goes back to Agent A with specific feedback — not a vague "try again."
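To make those checks concrete, here's a toy version of Agent B's checklist for a "5 sections, < 500 words each" spec. A real review would be a second model call; these string checks only illustrate the shape of the verdict, and the hardcoded section/word-count rules are my own assumption:

```python
import re
from datetime import datetime, timezone

def review(spec: dict, output: str) -> dict:
    """Produce Agent B's verdict for a '5 sections, < 500 words each' spec.
    Limits are hardcoded here for the sketch rather than parsed from spec."""
    problems = []
    headers = re.findall(r"^## ", output, flags=re.MULTILINE)
    if len(headers) != 5:  # check 1: output matches the task spec
        problems.append(f"expected 5 sections, found {len(headers)}")
    bodies = re.split(r"^## .*$", output, flags=re.MULTILINE)[1:]
    for i, body in enumerate(bodies, 1):  # check 2: done_when conditions met
        words = len(body.split())
        if words >= 500:
            problems.append(f"section {i} is {words} words, over the 500-word limit")
    return {
        "verdict": "pass" if not problems else "fail",
        "checked_at": datetime.now(timezone.utc).isoformat(),
        "notes": "; ".join(problems) or "All 5 sections present and within spec.",
    }
```

Check 3 (missing or ambiguous content) is the part that genuinely needs a second model rather than string matching, which is why the verdict carries free-text notes and not just a boolean.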

What the Numbers Look Like

In a 5-agent team running this pattern for 30 days:

  • False-positive success rate: 8% → 1.2%
  • Manual review overhead: -52%
  • Rework loops: -61%

One extra step. Dramatically less cleanup.

The Minimal Implementation

// current-task.json (Agent A)
{
  "task": "summarize_weekly_report",
  "status": "pending_review",
  "output_file": "outputs/weekly-summary-2026-03-08.md",
  "done_when": "all 5 sections covered, < 500 words each"
}
// review.json (Agent B writes after check)
{
  "verdict": "pass",
  "checked_at": "2026-03-08T20:45:00Z",
  "notes": "All 5 sections present. Section 3 is 487 words — within spec."
}

Agent A reads review.json before marking the task complete. If the verdict is "fail", it reads the notes and revises.
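Wired together, Agent A's side of the protocol is a small completion gate. The function and callback names here are illustrative; in a real system `review` and `revise` would each be a model call:

```python
def finalize(spec: dict, review, revise, max_revisions: int = 3) -> dict:
    """Only mark the task complete once the reviewer passes it.
    `review(spec)` returns Agent B's verdict dict; `revise(notes)`
    is Agent A producing a new output from specific feedback."""
    for _ in range(max_revisions + 1):
        verdict = review(spec)
        if verdict["verdict"] == "pass":
            spec["status"] = "complete"
            return spec
        revise(verdict["notes"])  # specific feedback, never a bare "try again"
    spec["status"] = "escalated"  # a human takes over after repeated failures
    return spec
```

The revision cap matters: without it, a spec Agent A can't satisfy becomes an infinite review loop instead of a visible escalation.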

The Principle Behind It

No agent should be the sole judge of its own output. This isn't about trust — it's about architecture. The same reason humans have code review, editorial review, and QA: a second perspective catches what the first one misses.

Build the review loop before you need it. By the time you notice the false-positive problem, you've already shipped bad output a dozen times.


The full peer review template is in the askpatrick.co library — along with the other patterns that keep multi-agent teams running cleanly.
