Most “AI code review” prompts produce… vibes.
You paste a diff and you get:
- “Nice work!”
- “Consider naming this better”
- “Add comments”
Meanwhile the actual bug is a missing null check or an off-by-one in pagination.
Here’s a code review prompt that’s biased toward finding real defects, with a workflow you can reuse.
The mindset: reviewers don’t read, they hunt
Good reviewers scan for:
- incorrect assumptions at boundaries (nulls, empty arrays, timezones, encoding)
- security footguns (authZ, injection, SSRF, secrets)
- concurrency issues (double writes, retries, idempotency)
- performance cliffs (N+1, unbounded loops, missing indexes)
- test gaps (no failing test for the bug you think you fixed)
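To make the first bullet concrete, here's a tiny hypothetical pagination helper with the classic off-by-one from the intro (the helper name and 1-based paging convention are made up for illustration):

```typescript
// Hypothetical pagination helper showing the kind of boundary bug
// reviewers hunt for: an off-by-one in the page offset.
function pageSlice<T>(items: T[], page: number, size: number): T[] {
  // BUG (with 1-based pages): `items.slice(page * size, ...)` would
  // silently skip the entire first page.
  const start = (page - 1) * size; // 1-based page -> 0-based offset
  return items.slice(start, start + size);
}

console.log(pageSlice([1, 2, 3, 4, 5], 1, 2)); // [ 1, 2 ]
console.log(pageSlice([1, 2, 3, 4, 5], 3, 2)); // [ 5 ]
```

A style-focused review would never catch this; a boundary-focused one asks "what does page 1 return?" and finds it immediately.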
So we’re going to tell the model to do exactly that.
The prompt (copy/paste)
Use this as-is, then tweak the repo-specific checklist.
You are doing a production PR review as a senior engineer.
Goal: find defects and risky assumptions, not style nitpicks.
Context:
- Language/runtime: <e.g. Node 20 + TypeScript>
- Frameworks: <e.g. Express + Prisma>
- Critical paths: <payments, auth, PII, billing>
Input:
1) PR description:
<PASTE>
2) Diff (unified):
<PASTE>
Review rules:
- Prioritize correctness, security, reliability, and performance.
- If you’re unsure, ask a targeted question instead of guessing.
- Do NOT suggest renames unless it prevents a bug.
Output format:
A) Top 5 risks (most severe first)
B) Concrete review comments (bullet list). Each comment must include:
- what could break
- why
- a minimal fix suggestion
C) Test plan: 6-10 tests I should add or run
D) One “red team” scenario: how a malicious/chaotic user could break this
That output format matters. It forces the model to commit to why something is risky.
Make it even better: add a repo-specific checklist
Most bugs in a codebase are repetitive.
Add this after “Review rules”:
Repo-specific checks:
- Dates are always UTC internally; never parse locale strings.
- All external requests must have timeouts.
- Any write endpoint must be idempotent under retries.
- Avoid N+1 DB queries.
- Logs must not include PII.
You can build this list over time from your postmortems.
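Checklist items like "logs must not include PII" are also cheap to enforce in code. A minimal sketch (the field names and redaction marker are assumptions, not a standard):

```typescript
// Sketch of enforcing one checklist item: scrub known PII fields
// before anything reaches the logger. Field names are illustrative.
const PII_FIELDS = new Set(["email", "ssn", "phone", "fullName"]);

function scrubForLog(record: Record<string, unknown>): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(record)) {
    out[key] = PII_FIELDS.has(key) ? "[REDACTED]" : value;
  }
  return out;
}

console.log(scrubForLog({ id: 42, email: "a@b.com" }));
// { id: 42, email: '[REDACTED]' }
```

When a rule exists as a helper like this, the review prompt can ask "is every log call routed through the scrubber?" instead of re-litigating the policy each time.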
A practical example: review a pagination change
Let’s say a PR modifies a GET /users endpoint.
A “nice” review might say: “Looks good, add docs.”
A useful review hunts for:
- what happens when `limit` is 0 or 5000?
- does `cursor` decode safely?
- can users enumerate other tenants?
- is ordering stable across pages?
- are we leaking PII fields?
If your prompt doesn’t steer the model toward these questions, it won’t reliably find them.
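For reference, here's what the guards behind the `limit` and `cursor` questions might look like. This is a sketch, not the endpoint's actual code; the defaults, cap, and cursor shape are all assumptions:

```typescript
// Guards a pagination review is really asking about.
const MAX_LIMIT = 100;

function clampLimit(raw: unknown): number {
  const n = Number(raw);
  if (!Number.isInteger(n) || n <= 0) return 25; // 0, negatives, NaN -> default
  return Math.min(n, MAX_LIMIT);                 // 5000 -> 100
}

function decodeCursor(raw: string): { id: number } | null {
  try {
    const parsed = JSON.parse(Buffer.from(raw, "base64").toString("utf8"));
    // Validate the shape instead of trusting the client blindly.
    return Number.isInteger(parsed?.id) ? { id: parsed.id } : null;
  } catch {
    return null; // garbage cursor -> invalid input, not a 500
  }
}
```

Note what `decodeCursor` does *not* do: it never lets a malformed cursor throw past the handler, which is exactly the failure mode the "does `cursor` decode safely?" question is probing.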
The two-pass trick (this is where it gets spicy)
Pass 1: defect hunter (prompt above)
Pass 2: minimal patch author
Given your review comments, propose a minimal follow-up diff.
Rules:
- Focus only on the top 1-2 risks.
- Keep the diff under ~120 lines.
- Include tests.
Output a unified diff.
This prevents the “let’s refactor the whole module” problem.
Tooling tip: make it easy to paste the right context
If you’re doing this often, create a tiny script that:
- grabs the PR diff
- grabs the PR description
- injects it into your template
Even a manual copy/paste workflow gets 80% of the benefit; automating it is what makes the habit stick.
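A minimal sketch of that script, assuming the GitHub CLI (`gh`) is installed and authenticated (the template here is abbreviated; paste in your full prompt):

```typescript
import { execSync } from "node:child_process";

// Pure part: inject the description and diff into the review template.
function buildPrompt(description: string, diff: string): string {
  return [
    "You are doing a production PR review as a senior engineer.",
    "1) PR description:",
    description.trim(),
    "2) Diff (unified):",
    diff.trim(),
  ].join("\n");
}

// Shell part: fetch both from GitHub via the `gh` CLI.
function promptForPr(pr: string): string {
  const description = execSync(`gh pr view ${pr} --json body --jq .body`, { encoding: "utf8" });
  const diff = execSync(`gh pr diff ${pr}`, { encoding: "utf8" });
  return buildPrompt(description, diff);
}
```

Run it with a PR number and pipe the output straight to your clipboard or your model of choice.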
What to watch out for
- If you don’t give critical context (“this touches auth”), the model won’t prioritize it.
- If you paste too much, you’ll get shallow output. Prefer the diff + a 5-line description.
- If the model is uncertain, that’s a feature. Treat “unclear” as a cue to add a test or a guard.
If you want more prompts like this (debugging, spec-writing, refactors, test generation), I’m building a Prompt Engineering Cheatsheet at Nova Press.
Free sample: https://getnovapress.gumroad.com