
Nova Elvaris


I Reviewed 50 AI Pull Requests — Here Are the 7 Mistakes I Keep Seeing

After months of working with AI coding assistants — Claude, GPT, Copilot — I started noticing the same failure patterns in every pull request they generate. Not hallucinations or syntax errors. Subtler stuff. The kind of mistakes that pass CI but bite you at 2 AM.

Here are the 7 I see most often, with concrete examples and the fix for each.


1. The Phantom Import

AI adds an import for a library it assumes exists in your project.

# AI generated this
from utils.cache import RedisCache

# Your project has no utils/cache.py

Fix: Always include a file tree or ls output in your prompt context. If the assistant doesn't know what exists, it'll invent things.
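
If you want a mechanical backstop too, you can check that every new import actually resolves before the PR goes up. A minimal sketch using the standard library (`import_exists` is a hypothetical helper name, not part of any tool):

```python
import importlib.util

def import_exists(module_name: str) -> bool:
    """Return True if the module can be found on the current import path."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package in a dotted path doesn't exist,
        # e.g. "utils.cache" when there is no "utils" package at all.
        return False

print(import_exists("json"))         # stdlib module, resolves
print(import_exists("utils.cache"))  # the phantom import from above
```

Run something like this over the imports in the diff and the phantom gets caught before review, not after merge.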

2. The Silent Default

Instead of raising an error, AI returns a "safe" default that hides bugs.

function getUser(id) {
  const user = db.find(id);
  return user || { name: "Unknown", role: "guest" }; // AI added this
}

That "Unknown" user now has guest permissions and passes every downstream check. You won't notice until production.

Fix: Add a constraint to your prompt: "Throw errors on missing data. Never return placeholder objects."
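
Here's what that constraint should produce, sketched in Python (`db`, `get_user`, and `UserNotFoundError` are illustrative names, not a real API):

```python
class UserNotFoundError(LookupError):
    """Raised when a user id has no matching record."""

def get_user(db: dict, user_id: str) -> dict:
    user = db.get(user_id)
    if user is None:
        # Fail loudly instead of returning a placeholder
        # object that quietly carries guest permissions.
        raise UserNotFoundError(f"no user with id {user_id!r}")
    return user
```

The caller is now forced to decide what a missing user means, instead of inheriting a default the AI made up.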

3. The Confident Comment

AI writes comments that describe what the code should do, not what it actually does.

# Retry up to 3 times with exponential backoff
for i in range(5):  # ← wait, that's 5
    time.sleep(1)   # ← and that's linear, not exponential

Fix: Tell your assistant: "Don't write comments unless the code is genuinely non-obvious. If you write a comment, it must match the implementation exactly."
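
For reference, here's a version where the comment and the code actually agree, as a minimal sketch (`retry` is an illustrative helper, not a library function):

```python
import time

def retry(fn, attempts: int = 3, base_delay: float = 1.0):
    """Retry fn up to `attempts` times with exponential backoff: 1s, 2s, 4s."""
    for i in range(attempts):  # exactly `attempts` tries, matching the docstring
        try:
            return fn()
        except Exception:
            if i == attempts - 1:
                raise  # out of retries, surface the real error
            time.sleep(base_delay * 2 ** i)  # 1, 2, 4... genuinely exponential
```

The loop bound and the sleep schedule are now pinned to the docstring, so the comment can't drift from the implementation.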

4. The Test That Tests Nothing

AI generates a test file with proper structure, assertions, and green output — but the assertions are trivially true.

test("validates email", () => {
  const result = validateEmail("test@example.com");
  expect(result).toBeDefined(); // Passes whether result is true OR false
});

Fix: Use the failing-test-first pattern. Ask the assistant to write a test that should fail with the current code, then write the implementation.
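
The same idea in Python: pin the expected value and include a case that must fail. The stand-in `validate_email` below is a deliberately simple assumption, just enough to show the difference between a weak assertion and a real one:

```python
import re

def validate_email(s: str) -> bool:
    # Simplistic stand-in: something@something.something, no whitespace.
    return re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", s) is not None

# Weak: passes no matter what boolean the function returns.
assert validate_email("test@example.com") is not None

# Strong: pins the expected value AND exercises a rejection path.
assert validate_email("test@example.com") is True
assert validate_email("not-an-email") is False
```

If the strong assertions can't fail when you break the implementation, the test isn't testing anything.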

5. The Over-Abstraction

You ask for a simple function. AI gives you a factory-pattern-strategy-builder monstrosity.

// You wanted: parse a CSV line
// AI gave you: AbstractDelimiterParserFactory<T>

Fix: Add a complexity constraint: "Use the simplest approach. No classes unless I ask for them. No abstractions beyond what the current requirements need."
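
What that constraint should produce for the CSV case is roughly this: the standard library and nothing else. `csv.reader` already handles quoted commas, which is the part hand-rolled `split(",")` code gets wrong:

```python
import csv

def parse_csv_line(line: str) -> list[str]:
    """Parse a single CSV line, respecting quoted fields."""
    return next(csv.reader([line]))

print(parse_csv_line('a,"b,c",d'))  # ['a', 'b,c', 'd']
```

Three lines, no factory, no generics. If the requirements grow, the abstraction can grow with them.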

6. The Scope Creep

You ask AI to fix a bug. It fixes the bug and refactors three adjacent functions, renames variables, and adds a feature you didn't request.

Fix: Be explicit: "Change only what's needed to fix [specific bug]. Don't refactor, rename, or improve anything else." I call this the Change Budget — give the assistant a scope boundary.

7. The Copy-Paste API Call

AI copies an API pattern from its training data without checking if the current SDK version matches.

# Works in openai 0.x
response = openai.ChatCompletion.create(...)

# Current SDK (1.x+) uses
response = client.chat.completions.create(...)

Fix: Include your dependency versions in the prompt context: pip freeze | grep openai or your package.json lock versions. The assistant can only use what you tell it exists.
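
If you'd rather generate that context programmatically, the standard library can do it (Python 3.8+; `pinned_versions` is a hypothetical helper name):

```python
from importlib.metadata import version, PackageNotFoundError

def pinned_versions(packages: list[str]) -> dict[str, str]:
    """Return {name: installed version} for the packages that matter to this prompt."""
    out = {}
    for name in packages:
        try:
            out[name] = version(name)
        except PackageNotFoundError:
            out[name] = "not installed"
    return out

print(pinned_versions(["openai", "requests"]))
```

Paste the output at the top of your prompt and the assistant has no excuse to reach for a 0.x API shape.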


The Meta-Pattern

All seven mistakes share one root cause: the assistant is working with incomplete context and filling gaps with confident guesses.

The fix is always some version of: give it more context, give it tighter constraints, and verify the output against reality — not against vibes.

I keep a simple checklist taped next to my monitor:

  • [ ] Did I include the file tree?
  • [ ] Did I specify error handling behavior?
  • [ ] Did I set a scope boundary?
  • [ ] Did I check the test actually tests something?
  • [ ] Do the comments match the code?

It takes 30 seconds to check. It saves hours of debugging.


What patterns do you see in AI-generated PRs? Drop them in the comments — I'm collecting these for a longer guide.
