<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: hekman316</title>
    <description>The latest articles on DEV Community by hekman316 (@hekman316).</description>
    <link>https://dev.to/hekman316</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3982334%2F1066da29-b98b-4ebc-9ce7-db2e8719d1ca.jpeg</url>
      <title>DEV Community: hekman316</title>
      <link>https://dev.to/hekman316</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/hekman316"/>
    <language>en</language>
    <item>
      <title>I stopped trusting Claude's code reviews, so I built a skill that puts my code on trial</title>
      <dc:creator>hekman316</dc:creator>
      <pubDate>Sat, 13 Jun 2026 07:32:35 +0000</pubDate>
      <link>https://dev.to/hekman316/i-stopped-trusting-claudes-code-reviews-so-i-built-a-skill-that-puts-my-code-on-trial-1ll0</link>
      <guid>https://dev.to/hekman316/i-stopped-trusting-claudes-code-reviews-so-i-built-a-skill-that-puts-my-code-on-trial-1ll0</guid>
      <description>&lt;p&gt;Every time I asked Claude to review my branch, I got one of two answers: a cheerful &lt;strong&gt;"Looks good! 👍"&lt;/strong&gt; or a vague list where I couldn't tell a real bug from a matter of taste. The model wants to please you. That's exactly the problem.&lt;/p&gt;

&lt;p&gt;So I built &lt;strong&gt;Tribunal&lt;/strong&gt; — a Claude skill that reviews your diff &lt;em&gt;adversarially&lt;/em&gt;, in stages, where the honest signal comes from agents fighting each other instead of one polite model.&lt;/p&gt;

&lt;h2&gt;The idea: don't ask one model to be fair&lt;/h2&gt;

&lt;p&gt;A single model told to "be critical" still hedges — it's trained to be agreeable. So instead of one balanced reviewer, Tribunal runs &lt;strong&gt;one-sided roles that collide&lt;/strong&gt;:&lt;/p&gt;

&lt;h3&gt;🔥 1. Hater&lt;/h3&gt;

&lt;p&gt;One agent per file, deliberately biased. It tears the diff apart as if a clueless amateur wrote it, looking only at what changed. The attack stays strictly on the merits: correctness, race conditions, leaks, edge cases, security. No style nitpicks.&lt;/p&gt;

&lt;h3&gt;🔗 2. Integration&lt;/h3&gt;

&lt;p&gt;Per-file haters are blind to cross-module bugs. A separate agent hunts exactly those: a changed function signature whose callers still use the old one, a changed return shape that a consumer still expects in its old form, invariants that have drifted out of sync across files.&lt;/p&gt;
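
&lt;p&gt;To make the first of those cases concrete, here is a made-up Python example (both function names are hypothetical and not taken from Tribunal). A reviewer that only sees the changed file finds nothing wrong, because the breakage lives in a file the diff never touched:&lt;/p&gt;

```python
# Hypothetical module A: the diff adds a required keyword-only parameter.
def fetch_user(user_id, *, include_inactive):
    return {"id": user_id, "include_inactive": include_inactive}


# Hypothetical module B: untouched by the diff, still calling the old way.
def render_profile(user_id):
    return fetch_user(user_id)


def stale_call_fails():
    """Return True if the old-style call now raises TypeError."""
    try:
        render_profile(42)
    except TypeError:
        return True
    return False
```

&lt;p&gt;Each function looks fine on its own; the bug only shows up when the changed signature and the stale call site are read together.&lt;/p&gt;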

&lt;h3&gt;⚖️ 3. Judge&lt;/h3&gt;

&lt;p&gt;For each accusation, the judge digs into the actual code and decides honestly: was this &lt;strong&gt;deliberate and justified&lt;/strong&gt;, or &lt;strong&gt;genuinely weak&lt;/strong&gt;? It's allowed to use docs and comments as evidence of intent — the opposite of the hater, who ignores them as excuses.&lt;/p&gt;

&lt;h3&gt;📜 4. Verdict&lt;/h3&gt;

&lt;p&gt;The verdict keeps only the spots the judge &lt;strong&gt;couldn't defend&lt;/strong&gt;, plus those it conceded are weak even while defending the choice. Everything else is left out of the report and stays only in the full transcript.&lt;/p&gt;
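
&lt;p&gt;As a rough sketch of that filtering rule (not Tribunal's actual code: the &lt;code&gt;Ruling&lt;/code&gt; values, the &lt;code&gt;Accusation&lt;/code&gt; fields, and the &lt;code&gt;verdict&lt;/code&gt; function are hypothetical names chosen for illustration), the idea is to keep anything the judge did not fully justify:&lt;/p&gt;

```python
from dataclasses import dataclass
from enum import Enum


class Ruling(Enum):
    """Hypothetical outcomes a judge could assign to one accusation."""
    JUSTIFIED = "justified"        # deliberate and defensible
    CONCEDED = "conceded"          # defended, but admitted to be weak
    INDEFENSIBLE = "indefensible"  # the judge could not defend it


@dataclass(frozen=True)
class Accusation:
    file: str
    claim: str
    ruling: Ruling


def verdict(accusations):
    """Split judged accusations into (kept_in_report, transcript_only)."""
    kept = [a for a in accusations if a.ruling is not Ruling.JUSTIFIED]
    transcript_only = [a for a in accusations if a.ruling is Ruling.JUSTIFIED]
    return kept, transcript_only
```

&lt;p&gt;Under these assumptions, an accusation ruled &lt;code&gt;CONCEDED&lt;/code&gt; lands in the report alongside the indefensible ones, and only the fully justified ones are relegated to the transcript.&lt;/p&gt;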

&lt;h2&gt;Why the collision matters&lt;/h2&gt;

&lt;p&gt;The balance doesn't live inside any single agent — it comes from the clash &lt;em&gt;between&lt;/em&gt; them. A hater that can &lt;strong&gt;only&lt;/strong&gt; attack, meeting a judge that &lt;strong&gt;only&lt;/strong&gt; looks for justification, produces a sharper, more honest signal than one model trying to be "balanced" on its own.&lt;/p&gt;

&lt;p&gt;And the hater is allowed to return nothing. On a clean diff it's not forced to invent problems — empty is a valid, honest result.&lt;/p&gt;

&lt;h2&gt;What you get&lt;/h2&gt;

&lt;p&gt;A ranked report written to &lt;code&gt;docs/reviews/&lt;/code&gt;, plus a short chat summary: what to actually fix, by severity (critical → major → minor), with a concrete fix for each.&lt;/p&gt;
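
&lt;p&gt;The ordering itself is simple. A minimal sketch, assuming each finding is a dict with a &lt;code&gt;severity&lt;/code&gt; key (that shape and the &lt;code&gt;rank_findings&lt;/code&gt; name are assumptions for illustration, not Tribunal's report format):&lt;/p&gt;

```python
# Lower number sorts first: critical, then major, then minor.
SEVERITY_ORDER = {"critical": 0, "major": 1, "minor": 2}


def rank_findings(findings):
    """Return findings sorted by severity, most severe first.

    sorted() is stable, so findings with the same severity keep
    their original relative order.
    """
    return sorted(findings, key=lambda f: SEVERITY_ORDER[f["severity"]])
```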

&lt;p&gt;It's &lt;strong&gt;portable&lt;/strong&gt;: pure Claude sub-agents (the &lt;code&gt;Agent&lt;/code&gt; tool), no external runtime, no dependencies. It works in &lt;strong&gt;Claude Code&lt;/strong&gt; and &lt;strong&gt;Claude Cowork&lt;/strong&gt;, with any language: Python, JS/TS, Go, Rust, Java… (one config line adds yours).&lt;/p&gt;

&lt;h2&gt;Try it&lt;/h2&gt;

&lt;p&gt;It's MIT and free: &lt;strong&gt;&lt;a href="https://github.com/hekman316/claude-skill-tribunal" rel="noopener noreferrer"&gt;https://github.com/hekman316/claude-skill-tribunal&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Install is one paste — ask Claude to fetch the &lt;code&gt;SKILL.md&lt;/code&gt; from the repo and drop it in &lt;code&gt;~/.claude/skills/&lt;/code&gt;. Then in any repo just say &lt;code&gt;/tribunal&lt;/code&gt;.&lt;/p&gt;




&lt;p&gt;I'm genuinely curious what people think of the adversarial-roles approach. Does forcing the model into one-sided roles actually beat just asking it to be harsh? Would love feedback — or attempts to break it.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>claude</category>
      <category>codereview</category>
      <category>opensource</category>
    </item>
  </channel>
</rss>
