"Review my code." You paste a function, and the AI replies with a tidy paragraph: "This code looks good! You might consider adding error handling and improving variable names." Thanks. You knew that. You wanted it to find the off-by-one in the loop and the unhandled null on line 14.
The problem isn't the model. It's the prompt. "Review my code" gives the AI no role, no priorities, and no output format, so it defaults to vague praise. The fix is to prompt it the way a senior engineer would brief a reviewer: who you are, what to look for, what to ignore, and how to report it.
Below are the principles plus a few of the exact prompts I use. Copy them, swap in your code, and watch the quality of the feedback change.
Why structure beats length
A good review prompt has four parts:
- Role + context — "You are a senior backend reviewer. This runs in production handling payments."
- What to check — a concrete list, in priority order.
- Constraints — what not to do. This is the part everyone skips, and it's the part that kills the fluff.
- Output format — so you get a triaged list, not an essay.
In my experience, the single constraint "Do not suggest improvements you cannot point to a specific line for" removes roughly half of the generic noise. The model can no longer say "consider improving error handling"; it has to find the line where error handling is actually missing, or stay quiet.
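If you assemble prompts in a script rather than by hand, the four parts map neatly onto a small data structure. The sketch below is only one way to do it: `ReviewPrompt` and `render` are names I made up for illustration, and the section wording is borrowed from the prompts in this post.

```python
from dataclasses import dataclass


@dataclass
class ReviewPrompt:
    """Hypothetical container for the four parts of a review prompt."""
    role: str
    checks: list[str]
    constraints: list[str]
    output_format: str

    def render(self, code: str) -> str:
        # Number the checks so the priority order is explicit to the model.
        checks = "\n".join(f"{i}. {c}" for i, c in enumerate(self.checks, start=1))
        constraints = "\n".join(f"- {c}" for c in self.constraints)
        return (
            f"{self.role}\n\n"
            f"Check, in this order:\n{checks}\n\n"
            f"{self.output_format}\n\n"
            f"Constraints:\n{constraints}\n\n"
            f"Here is the code:\n{code}"
        )


prompt = ReviewPrompt(
    role="You are a senior backend reviewer. This runs in production handling payments.",
    checks=["Correctness", "Security", "Reliability", "Clarity"],
    constraints=["Do not suggest improvements you cannot point to a specific line for."],
    output_format="Output three sections: BLOCKING, SHOULD-FIX, NITS.",
)
print(prompt.render("def add(a, b):\n    return a + b"))
```

Keeping the parts as separate fields makes it hard to forget the constraints, which is the part most people skip.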
Prompt 1: The triaged review
This is my default. The priority ordering and the "blocking first" rule mean the important stuff isn't buried under nitpicks.
You are a senior engineer reviewing a diff for production code.
Review the code I provide. Check, in this order:
1. Correctness — logic errors, off-by-one, wrong conditionals, edge cases (empty, null, very large input).
2. Security — injection, missing authz checks, unsafe deserialization, secrets in code.
3. Reliability — unhandled errors, resource leaks, race conditions, missing timeouts.
4. Clarity — naming and structure that will confuse the next reader.
Output three sections: BLOCKING, SHOULD-FIX, NITS.
For every item, quote the exact line and give the fix as a code snippet.
Constraints:
- If a category has no issues, write "none". Do not invent problems to seem thorough.
- Do not comment on style covered by a formatter.
- No general advice. Every point must reference a specific line.
Here is the code:
[PASTE CODE]
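A fixed output format also makes the reply easy to process in a script. Here is a minimal sketch of a parser for the three sections, assuming the model puts each section name on its own line (optionally with a leading `#` or trailing colon). The function name `parse_review` and the sample reply are made up for illustration; real replies may need looser matching.

```python
import re

SECTIONS = ("BLOCKING", "SHOULD-FIX", "NITS")


def parse_review(text: str) -> dict[str, list[str]]:
    """Split a review into its three sections; 'none' becomes an empty list."""
    result: dict[str, list[str]] = {name: [] for name in SECTIONS}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        # A header is a line holding only a section name, with optional '#' or ':'.
        header = re.fullmatch(r"#*\s*(BLOCKING|SHOULD-FIX|NITS)\s*:?", line)
        if header:
            current = header.group(1)
        elif current and line and line.lower() != "none":
            result[current].append(line)
    return result


# Invented sample reply, shaped like the format the prompt asks for.
sample = """BLOCKING
- `for i in range(len(xs) + 1):` reads past the end; use `range(len(xs))`.
SHOULD-FIX
none
NITS
- `tmp2` hides what the value is; rename to `retry_count`.
"""
review = parse_review(sample)
print(len(review["BLOCKING"]), len(review["SHOULD-FIX"]), len(review["NITS"]))
```

With the findings in a dictionary, a script could, for example, refuse to continue while the BLOCKING list is non-empty.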
Prompt 2: The adversarial "break it" review
Sometimes you don't want a polite review — you want the AI to actively try to break your code. This catches edge cases a normal review glides past.
You are a hostile QA engineer. Your goal is to find inputs and conditions
that make this function behave incorrectly, crash, or produce wrong output.
For each issue, give:
- The exact input or condition that triggers it.
- What goes wrong and why.
- A minimal test that reproduces it.
Focus on: empty/null/undefined, boundary values, unexpected types, concurrent
calls, and extremely large input. Do not suggest stylistic changes.
Code:
[PASTE CODE]
The "minimal test that reproduces it" line is what makes this prompt pay off: you don't just hear about the bug, you get the failing test to confirm it.
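To make that concrete, here is the kind of finding and reproduction I'd hope to get back. Everything in it is invented for illustration: `last_n` is a hypothetical function under review, and the boundary bug is a classic Python slicing surprise rather than output from a real session.

```python
def last_n(items: list, n: int) -> list:
    """Hypothetical function under review: return the last n items."""
    return items[-n:]  # Bug: when n == 0, items[-0:] is items[0:], the whole list.


def last_n_fixed(items: list, n: int) -> list:
    """One possible fix: handle the n == 0 boundary explicitly."""
    return items[-n:] if n > 0 else []


def reproduces_bug() -> bool:
    # Minimal reproduction: asking for zero items should yield an empty list.
    return last_n([1, 2, 3], 0) != []


print(reproduces_bug())             # True: the original returns [1, 2, 3]
print(last_n_fixed([1, 2, 3], 0))   # []
```

The reproduction is three lines, names the exact input, and can be pasted into a test file unchanged, which is what the prompt's output format is asking for.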
Prompt 3: The security-only pass
When the code touches auth, payments, or user input, run a dedicated security pass instead of relying on the general review to catch it.
You are a security auditor. Review this code ONLY for security issues.
Map each finding to a category: injection, broken auth/authz, sensitive data
exposure, SSRF, insecure deserialization, or missing input validation.
For each finding: severity (high/med/low), the vulnerable line, an example
exploit, and the fix. If you find nothing exploitable, say so plainly —
do not pad the report.
Code:
[PASTE CODE]
Running security as its own pass works better than folding it into a general review, because a focused prompt keeps the model in one mode instead of splitting attention across naming, structure, and security all at once.
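If you script your reviews, separate passes are just separate calls. The sketch below assumes you supply your own `send` function that takes a prompt and returns the model's reply; `run_passes`, `PASSES`, and `fake_send` are hypothetical names, and no real client library is used.

```python
from typing import Callable

# Hypothetical pass definitions; each one is a narrow, single-purpose brief.
PASSES = {
    "general": "You are a senior engineer reviewing a diff for production code.",
    "security": "You are a security auditor. Review this code ONLY for security issues.",
}


def run_passes(code: str, send: Callable[[str], str]) -> dict[str, str]:
    """Send one focused prompt per pass and collect the replies by pass name."""
    replies = {}
    for name, brief in PASSES.items():
        replies[name] = send(f"{brief}\n\nCode:\n{code}")
    return replies


def fake_send(prompt: str) -> str:
    # Stand-in for a real model call, so the sketch runs offline.
    return f"received {len(prompt)} characters"


for name, reply in run_passes("print('hi')", fake_send).items():
    print(name, "->", reply)
```

Each pass gets its own prompt and its own reply, so the security findings never compete with naming nitpicks for space in one answer.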
The pattern you can reuse for anything
Look at what those three share: a role, a prioritized checklist, a hard constraint against fluff, and a fixed output format. That same skeleton works for debugging ("you are a debugger, reproduce before theorizing"), for refactoring ("preserve behavior, change one thing at a time"), for test writing ("target edge cases, not the happy path"), and for SQL review ("check for N+1, missing indexes, full scans").
Once you internalize the skeleton, you stop typing "review my code" forever. You write prompts that tell the AI exactly what job it's doing — and it stops guessing.
A bit of homework
Take the triaged-review prompt above, run it on a function you wrote this week, then run plain "review my code" on the same function. The difference is the whole point of this post — and it's free to try right now.
Want these prompts wired into a real workflow?
Prompts get even better when they're not floating in a chat box but baked into named, repeatable steps. That's what I built with Claude Code Agent OS: it turns reviewer, debugger, test-writer, security-auditor and more into focused subagents you invoke with one command, each with the same four-part, anti-fluff structure (role, checklist, constraints, output format) as the prompts above. It also includes 12 slash commands and ready-to-edit CLAUDE.md templates. It works with Claude Code today, and the prompts themselves drop into ChatGPT, Cursor, or Copilot.
But the three prompts in this post are yours to keep and will sharpen your reviews today. Steal them, adapt them to your stack, and ship better code.
Which review constraint made the biggest difference for you? I'm always collecting the lines that kill AI fluff — share yours below.