I've reviewed hundreds of AI-generated pull requests. The same three bug categories slip through every time, not because the AI can't find them, but because nobody asks it to look.
Bug #1: State Mutations That Cross Function Boundaries
AI-generated code loves to mutate objects in place. It works in the function where it's written, but the caller doesn't expect the mutation.
// AI wrote this; looks fine in isolation
function enrichUser(user) {
  user.displayName = `${user.firstName} ${user.lastName}`;
  user.lastEnriched = Date.now();
  return user;
}

// But the caller doesn't expect user to be mutated
const original = getUser(id);
const enriched = enrichUser(original);
// original.lastEnriched is now set. Surprise!
Standard AI code reviews say "looks good." They check syntax, not mutation boundaries.
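The usual fix is to copy instead of mutating. A sketch, using illustrative sample data in place of the post's getUser call:

```javascript
// Non-mutating variant: shallow-copy the input, then derive the new fields.
function enrichUser(user) {
  return {
    ...user, // shallow copy: nested objects would still be shared
    displayName: `${user.firstName} ${user.lastName}`,
    lastEnriched: Date.now(),
  };
}

const original = { firstName: "Ada", lastName: "Lovelace" };
const enriched = enrichUser(original);
console.log("lastEnriched" in original); // false: the caller's object is untouched
```

Note the copy is shallow; if the function also rewrote a nested object, the same bug would reappear one level down.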
Bug #2: Error Handling That Swallows Context
AI assistants love try/catch blocks. But they consistently generate catches that lose the original error:
try:
    result = process_data(payload)
except Exception:
    return {"error": "Processing failed"}  # Original exception? Gone.
The response looks correct. The error is "handled." But when this hits production and you need to debug, the actual exception is swallowed.
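A context-preserving variant, sketched in JavaScript to match the other examples (processData and the return shape are illustrative stand-ins):

```javascript
// Stand-in for the real work; throws to simulate a failure.
function processData(payload) {
  throw new RangeError(`bad payload: ${payload}`);
}

function handle(payload) {
  try {
    return { ok: true, value: processData(payload) };
  } catch (err) {
    // Keep the type, message, and stack alongside the friendly message,
    // instead of returning a bare "Processing failed".
    return {
      ok: false,
      error: "Processing failed",
      cause: { name: err.name, message: err.message, stack: err.stack },
    };
  }
}
```

In Python, the equivalent move is `raise ... from exc` or logging `exc` before returning, so the original traceback survives to production logs.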
Bug #3: Implicit Type Coercion in Comparisons
This one is subtle and language-dependent. AI models generate comparisons that work with test data but fail with edge cases:
if (response.count == 0) { // true when count is 0, but also when it's "0", "", or false
  // loose equality hides whether count is missing, malformed, or genuinely zero
}
Most AI reviews won't flag this unless you explicitly ask about type coercion.
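A few loose-vs-strict comparisons that trip this up; these are plain JavaScript semantics, easy to confirm in a REPL:

```javascript
// Loose equality (==) coerces operands; strict equality (===) does not.
console.log(0 == "0");   // true  - string "0" is coerced to the number 0
console.log(0 === "0");  // false - different types, no coercion
console.log("" == 0);    // true  - empty string coerces to 0
console.log(null == 0);  // false - null only loosely equals undefined
console.log(NaN == NaN); // false - NaN never equals itself, even strictly
```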
The Prompt That Catches All Three
After every AI-generated PR, I run this review prompt:
Review this code for these specific bug categories:
1. STATE MUTATIONS: Does any function modify its input arguments?
Flag every mutation of parameters, object properties, or
array contents. For each, state whether the caller expects
the mutation.
2. ERROR SWALLOWING: Does any try/catch or error handler
discard the original error message, stack trace, or error
type? Flag every catch block that doesn't preserve or
re-throw the original error.
3. TYPE COERCION: Are there any comparisons (==, !=, if(x))
that would behave differently with null, undefined, 0,
empty string, or NaN? Flag each one with the failing input.
For each issue found, show the exact line and a one-line fix.
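The first category can also be spot-checked at runtime before asking for a review. A minimal sketch; the helper name is illustrative, and JSON snapshotting only catches serializable mutations:

```javascript
// Snapshot the argument, call the function, and diff: did fn mutate its input?
function mutatesInput(fn, arg) {
  const before = JSON.stringify(arg);
  fn(arg);
  return JSON.stringify(arg) !== before;
}

// The mutating version from Bug #1.
function enrichUser(user) {
  user.displayName = `${user.firstName} ${user.lastName}`;
  return user;
}

console.log(mutatesInput(enrichUser, { firstName: "Ada", lastName: "L" })); // true
console.log(mutatesInput((u) => ({ ...u }), { a: 1 }));                     // false
```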
Results
Running this on my last 20 PRs, it found:
- 8 state mutation bugs (3 would have caused production issues)
- 5 swallowed errors (all would have made debugging harder)
- 3 type coercion issues (1 was a genuine logic bug)
That's 16 bugs that passed standard AI review. The prompt takes 30 seconds to run and catches the patterns that generic "review this code" prompts consistently miss.
Why This Works
Generic review prompts let the AI decide what to look for. Specific bug-category prompts tell it exactly where to focus. The AI has the knowledge to find these bugs; it just needs the instruction to look.
What bug patterns do you see AI code reviews miss? I'm building a longer checklist and would love to add your patterns.