<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Koki Riho</title>
    <description>The latest articles on DEV Community by Koki Riho (@rih0z).</description>
    <link>https://dev.to/rih0z</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3833225%2F463ccd8d-4dd5-4b24-8c71-fabbc0ea1957.jpeg</url>
      <title>DEV Community: Koki Riho</title>
      <link>https://dev.to/rih0z</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/rih0z"/>
    <language>en</language>
    <item>
      <title>Why AI Agent Outputs Need Adversarial Review (and How to Add It in One API Call)</title>
      <dc:creator>Koki Riho</dc:creator>
      <pubDate>Wed, 01 Apr 2026 20:33:00 +0000</pubDate>
      <link>https://dev.to/rih0z/why-ai-agent-outputs-need-adversarial-review-and-how-to-add-it-in-one-api-call-42ho</link>
      <guid>https://dev.to/rih0z/why-ai-agent-outputs-need-adversarial-review-and-how-to-add-it-in-one-api-call-42ho</guid>
      <description>&lt;h2&gt;
  
  
  The Problem: Agents Grading Their Own Homework
&lt;/h2&gt;

&lt;p&gt;If you’re running LLM agents in production — whether with LangChain, CrewAI, or custom pipelines — you’ve probably built some kind of output validation. Maybe a second LLM call checks the first one’s work. Maybe you parse for structural issues.&lt;/p&gt;

&lt;p&gt;Here’s what I kept finding: &lt;strong&gt;LLM-based self-review has a systematic leniency bias.&lt;/strong&gt; When you prompt an LLM to review output from another LLM (or itself), it overwhelmingly approves. The reviewer and generator share similar blind spots — they fail in correlated ways.&lt;/p&gt;

&lt;p&gt;This matters when your agent writes code that gets deployed, generates customer-facing content, or makes decisions affecting downstream systems.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Approach: Adversarial Review with Dual Consensus
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://agentdesk.usedevtools.com" rel="noopener noreferrer"&gt;AgentDesk&lt;/a&gt; provides two interfaces for adding adversarial review:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;MCP Server&lt;/strong&gt; (open source, MIT) — review-only. Pass in any content, get structured quality feedback. Runs locally with your own API key.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hosted REST API&lt;/strong&gt; — generate + review + auto-fix in one call. Submit a prompt, get reviewed output back.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Both use the same core approach:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Two independent reviewers&lt;/strong&gt; evaluate the output, each prompted adversarially (their job is to find problems, not confirm quality)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Dual consensus&lt;/strong&gt; — both must agree on pass/fail&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Substantive quality validation&lt;/strong&gt; — a deterministic (non-LLM) layer that requires every checklist item to include specific evidence quoted from the output. Missing citations → automatic fail.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scored 0-100&lt;/strong&gt; with structured feedback on every issue found&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;It’s BYOK — you supply your own LLM API key. AgentDesk handles orchestration only.&lt;/p&gt;

&lt;h2&gt;
  
  
  When to Use This
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;CI pipeline for generated code&lt;/strong&gt; — gate merges on review score&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Content QA for chatbot outputs&lt;/strong&gt; — catch hallucinations before they reach users&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Data extraction validation&lt;/strong&gt; — verify structured output completeness&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multi-agent workflow checkpoints&lt;/strong&gt; — validate intermediate outputs between agents&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Option 1: MCP Server (Review Only)
&lt;/h2&gt;

&lt;p&gt;Install in one command and use from Claude Code, Claude Desktop, or any MCP client:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;claude mcp add agentdesk-mcp &lt;span class="nt"&gt;--&lt;/span&gt; npx @ezark-publish/agentdesk-mcp
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;review_output&lt;/code&gt; tool takes content you’ve already generated and reviews it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"output"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Your agent generated content goes here..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"review_type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"code"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"criteria"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Check for security vulnerabilities and race conditions"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Response:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"verdict"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"FAIL"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"score"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;38&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"issues"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"severity"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"critical"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"category"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"concurrency"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"description"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Race condition in refill logic under concurrent access"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"suggestion"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Use atomic CAS operation or mutex lock"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"checklist"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"item"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Thread safety"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"fail"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"evidence"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"lines 23-31: non-atomic read-modify-write on token count"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"summary"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Critical concurrency issues found. Not safe for production use."&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For higher confidence, &lt;code&gt;review_dual&lt;/code&gt; runs two independent reviewers + merge. Either reviewer can veto a pass.&lt;/p&gt;

&lt;h2&gt;
  
  
  Option 2: REST API (Generate + Review + Auto-Fix)
&lt;/h2&gt;

&lt;p&gt;For non-MCP integration, the hosted API generates output, reviews it, and auto-fixes (up to 2 iterations) in a single call:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST https://agentdesk.usedevtools.com/api/v1/tasks &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Authorization: Bearer agd_your_key"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{
    "prompt": "Write a token bucket rate limiter in Python",
    "api_key": "sk-ant-your-anthropic-key",
    "review": true,
    "review_type": "code",
    "dual_review": true
  }'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;

&lt;span class="n"&gt;resp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://agentdesk.usedevtools.com/api/v1/tasks&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Authorization&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Bearer agd_your_key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;prompt&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Write a token bucket rate limiter in Python&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;api_key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sk-ant-your-anthropic-key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;review&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;review_type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;code&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;dual_review&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;task&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://agentdesk.usedevtools.com/api/v1/tasks/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Authorization&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Bearer agd_your_key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;completed&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;failed&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;break&lt;/span&gt;
    &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;review&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;verdict&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;  &lt;span class="c1"&gt;# PASS / FAIL / CONDITIONAL_PASS
&lt;/span&gt;&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;review&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;score&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;    &lt;span class="c1"&gt;# 0-100
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  How It Works Internally
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Reviewer A&lt;/strong&gt; gets the output with an adversarial system prompt — find every flaw: factual errors, logical gaps, missing requirements.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reviewer B&lt;/strong&gt; independently evaluates from a different angle — completeness, edge cases, whether the output actually addresses the task.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Substantive quality check&lt;/strong&gt; (deterministic, not LLM-based): every checklist item must include a specific quote from the output as evidence. Missing citations → item force-failed. If &amp;gt;30% of items lack evidence, entire review capped at score 50.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Consensus engine&lt;/strong&gt; combines reviews. Both must pass. Scores averaged. Disagreements flagged.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Each reviewer gets a fresh API call with only the output to review and a distinct system prompt — no shared conversation history.&lt;/p&gt;

&lt;h2&gt;
  
  
  How This Compares
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Approach&lt;/th&gt;
&lt;th&gt;Correlation with generator&lt;/th&gt;
&lt;th&gt;Cost&lt;/th&gt;
&lt;th&gt;Runtime?&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Self-review (same model)&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;td&gt;1 LLM call&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Chain-of-verification&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;td&gt;2-3 LLM calls&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;AgentDesk adversarial&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;td&gt;2-3 LLM calls&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Offline eval (Braintrust, DeepEval)&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;td&gt;Varies&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Human review&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;$$$ + slow&lt;/td&gt;
&lt;td&gt;Partially&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Pricing
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Free&lt;/strong&gt;: 20 tasks/month (BYOK)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Starter&lt;/strong&gt;: $29/mo — 500 tasks&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pro&lt;/strong&gt;: $79/mo — 5,000 tasks + dual review + workflows&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Team&lt;/strong&gt;: $199/mo — 50,000 tasks&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The MCP server is free and open source — you only pay for your own LLM API calls.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Get started in 30 seconds:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;MCP: &lt;code&gt;claude mcp add agentdesk-mcp -- npx @ezark-publish/agentdesk-mcp&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;REST API: &lt;a href="https://agentdesk.usedevtools.com" rel="noopener noreferrer"&gt;Sign up for a free API key&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;GitHub: &lt;a href="https://github.com/Rih0z/agentdesk-mcp" rel="noopener noreferrer"&gt;github.com/Rih0z/agentdesk-mcp&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;If you’re building with AI agents, I’d like to hear what’s working for you on quality control. Drop a comment or &lt;a href="https://github.com/Rih0z/agentdesk-mcp/issues" rel="noopener noreferrer"&gt;open an issue on GitHub&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>agents</category>
      <category>testing</category>
    </item>
    <item>
      <title>Why AI Agent Outputs Need Adversarial Review (and How to Add It in One API Call)</title>
      <dc:creator>Koki Riho</dc:creator>
      <pubDate>Thu, 19 Mar 2026 06:36:55 +0000</pubDate>
      <link>https://dev.to/rih0z/why-ai-agent-outputs-need-adversarial-review-and-how-to-add-it-in-one-api-call-1l92</link>
      <guid>https://dev.to/rih0z/why-ai-agent-outputs-need-adversarial-review-and-how-to-add-it-in-one-api-call-1l92</guid>
      <description>&lt;h2&gt;
  
  
  The Problem: Agents Grading Their Own Homework
&lt;/h2&gt;

&lt;p&gt;If you're running LLM agents in production, you've probably built some kind of output validation. Maybe a second LLM call checks the first one's work. Maybe you parse for structural issues.&lt;/p&gt;

&lt;p&gt;Here's what I kept finding: &lt;strong&gt;LLM-based self-review has a systematic leniency bias.&lt;/strong&gt; When you prompt an LLM to review output from another LLM (or itself), it overwhelmingly approves. The reviewer and generator share similar blind spots — they fail in correlated ways.&lt;/p&gt;

&lt;p&gt;This matters when your agent writes code that gets deployed, generates customer-facing content, or makes decisions affecting downstream systems.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Approach: Adversarial Review with Dual Consensus
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://agentdesk-blue.vercel.app" rel="noopener noreferrer"&gt;AgentDesk&lt;/a&gt; is an HTTP API that sits between your agent's output and whatever consumes it:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Two independent reviewers&lt;/strong&gt; evaluate the output, each prompted adversarially (their job is to find problems, not confirm quality)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Dual consensus&lt;/strong&gt; — both must agree on pass/fail&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Anti-gaming validation&lt;/strong&gt; — detects outputs that are superficially correct but substantively hollow&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scored 0-100&lt;/strong&gt; with structured feedback on every issue found&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;It's BYOK — you supply your own LLM API key. AgentDesk charges only for orchestration.&lt;/p&gt;

&lt;h2&gt;
  
  
  Adding It to Your Pipeline
&lt;/h2&gt;

&lt;h3&gt;
  
  
  curl
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST https://agentdesk-blue.vercel.app/api/v1/tasks &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Authorization: Bearer agd_your_key"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{
    "prompt": "Summarize this quarterly report in 3 bullet points",
    "review": true,
    "review_type": "content"
  }'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Python
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;

&lt;span class="n"&gt;resp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://agentdesk-blue.vercel.app/api/v1/tasks&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Authorization&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Bearer agd_your_key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;prompt&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Summarize this quarterly report in 3 bullet points&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;review&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;review_type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;task&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="c1"&gt;# Poll GET /api/v1/tasks/{task['id']} for results
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  JavaScript
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;resp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;https://agentdesk-blue.vercel.app/api/v1/tasks&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;method&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;POST&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Authorization&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Bearer agd_your_key&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Content-Type&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;application/json&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Summarize this quarterly report in 3 bullet points&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;review&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;review_type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;content&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;}),&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;task&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="c1"&gt;// Poll GET /api/v1/tasks/${task.id} for results&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  How It Works Internally
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Reviewer A&lt;/strong&gt; gets the output with an adversarial system prompt — find every flaw: factual errors, logical gaps, missing requirements.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reviewer B&lt;/strong&gt; independently evaluates from a different angle — completeness, edge cases, whether the output actually addresses the task.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Anti-gaming check&lt;/strong&gt; detects outputs designed to pass review without being good — verbose empty answers, pattern-matched boilerplate.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Consensus engine&lt;/strong&gt; combines reviews. Both must pass. Scores averaged. Disagreements flagged.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Completes in a single API call, typically 3-8 seconds.&lt;/p&gt;

&lt;h2&gt;
  
  
  How This Compares
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Approach&lt;/th&gt;
&lt;th&gt;Correlation with generator&lt;/th&gt;
&lt;th&gt;Cost&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Self-review (same model)&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;td&gt;1 LLM call&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Chain-of-verification&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;td&gt;2-3 LLM calls&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;AgentDesk adversarial&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;td&gt;2-3 LLM calls&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Human review&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;$$$ + slow&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The key difference isn't the number of LLM calls — it's that adversarial prompting with anti-gaming breaks the correlation between generator and reviewer failure modes.&lt;/p&gt;

&lt;h2&gt;
  
  
  Pricing
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Free&lt;/strong&gt;: 20 tasks/month (BYOK)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Starter&lt;/strong&gt;: $29/mo — 500 tasks&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pro&lt;/strong&gt;: $79/mo — 5,000 tasks + dual review + workflows&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Team&lt;/strong&gt;: $199/mo — 50,000 tasks&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Open source: &lt;a href="https://github.com/Rih0z/agentdesk" rel="noopener noreferrer"&gt;github.com/Rih0z/agentdesk&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;If you're building with AI agents, I'd like to hear what's working for you on quality control. Drop a comment or open an issue on GitHub.&lt;/p&gt;

</description>
      <category>agents</category>
      <category>ai</category>
      <category>llm</category>
      <category>testing</category>
    </item>
  </channel>
</rss>
