Make your AI code review refuse to fake a PASS

#ai #opensource #productivity #claude

When you ask a coding agent to "review this change," you often get a confident PASS — one that skipped axes, cited no evidence, and never ran the tests. A green light you can't trust is worse than no review: it tells you to ship.

review-audit is a small Claude Code skill built around one rule: an axis is "audited" only when it can show evidence.

How it works

A single, read-only pass over your change across six axes:

correctness — boundary / null / type / timezone mistakes, with a reproducing input where possible
wiring (anti-Potemkin) — new code that's defined but never imported or called is dead, not done; it greps the call sites and counts references
security — secrets, command/SQL injection, unsafe eval or deserialize
test efficacy — tests that would pass even against an unimplemented stub (tautologies, no asserts, happy-path only)
spec compliance — when a spec exists, it checks the acceptance criteria
regression — it actually runs the tests/build and reports the exit code

For each axis it writes down how it checked: a file:line, a grep result, a command and its exit code. "I didn't check this" is a first-class output, not a silent gap. And an unexamined axis cannot be part of a PASS.

Read-only, by design

It proposes fixes but does not apply them. A before/after checksum confirms your tree was not touched — the audit can't quietly "fix" something and then call it reviewed.

Why evidence matters

The failure mode of AI review isn't being wrong — it's being plausibly right. "Looks fine" reads like a pass. So the skill won't PASS regression or wiring on "looks fine": regression needs a real run with an exit code; wiring needs concrete grep or file:line. No evidence, no PASS on that axis.

It runs in the calling agent's own context — no sub-agent fan-out — so it stays cheap enough to run on every change. When one pass genuinely isn't enough (a release gate, a high-risk change), it tells you, in its own output, to escalate.

Try it

git clone https://github.com/dualform-labs/review-audit.git
cp -r review-audit/skills/review-audit ~/.claude/skills/

Then give Claude Code a change that looks fine but has an unverified path — say, a function that's defined but never called — and run /review-audit. Watch it flag the wiring, list each axis as audited / partial / not-audited, and refuse to PASS what it didn't verify.

One prompt file, no dependencies, Apache-2.0. No network calls, no telemetry, no bypass-permissions.

Companion skill: spec — decide the build before the agent writes code.

— a dualform project