DEV Community

Sathish
Cursor + Claude: my AI code review checklist

  • I don’t “vibe code” blind. I run a checklist.
  • I make Claude review diffs, not ideas.
  • I automate checks: types, lint, tests, secret scan.
  • I ship fewer regressions. With receipts.

Context

I build small SaaS projects. Usually solo. Usually fast.

Cursor + Claude makes that speed possible. Also dangerous.

My first month with AI-assisted coding was… brutal. I shipped code that looked right, compiled cleanly, and still broke an auth flow because I forgot one edge case. I spent 4 hours chasing it, and most of that time I was looking in the wrong place.

So I stopped asking AI to “build features”. I started using it like a very fast reviewer.

This post is my exact checklist. The thing I run before I merge. It’s boring. That’s the point.

1) I force Claude to review the diff. Not my vibes.

If I paste a whole file, Claude hallucinates context.

If I paste a diff, it behaves like a reviewer.

In Cursor, I select the git diff chunk. Then I ask one question:

“Review this diff. Find bugs. Find missing cases. Suggest tests. Don’t rewrite style.”

But I also give it structure. Otherwise it rambles.

Here’s the prompt template I keep in a snippet. I literally paste this.

You are reviewing a PR diff.

Rules:
- Focus on correctness, edge cases, security, and performance.
- Call out any behavior change.
- If you suggest a fix, show the minimal patch.
- Suggest at least 2 test cases.
- Don’t suggest refactors unless required for correctness.

Input:
<paste the git diff here>

Output format:
1) High-risk issues (with line refs)
2) Medium-risk issues
3) Tests I should add
4) Minimal patch (only if needed)

One thing that bit me — Claude will “approve” code that fails typecheck if you don’t tell it you’re using TypeScript strict mode.

So I add one line when needed:

Project: TypeScript "strict": true. Runtime: Node 20.

That’s it. No poetry.
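That one context line matters more than it looks. Here's a hypothetical sketch of why (`firstChar` is made up, not from a real PR): without knowing the project is strict, Claude will happily approve the version of this function that skips the null guard, even though `strictNullChecks` would reject it at typecheck time.

```typescript
// Hypothetical sketch: with "strict": false this compiles even without the
// null guard; with "strict": true, strictNullChecks rejects an unguarded
// `s.charAt(0)` and forces you to handle the null case.
export function firstChar(s: string | null): string {
  if (s === null) return ""; // the branch strict mode makes you write
  return s.charAt(0);
}
```

A reviewer that doesn't know your compiler flags can't review against them. Tell it.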

2) I run a local CI script. Every time.

I don’t trust myself to remember commands.

So I made one script. It’s my merge gate.

This is tuned for Next.js + TypeScript. Works fine for plain Node too.

Create scripts/ci-local.mjs:

#!/usr/bin/env node
import { execSync } from "node:child_process";

const run = (cmd) => {
  console.log(`\n$ ${cmd}`);
  execSync(cmd, { stdio: "inherit" });
};

try {
  // Fast fail first
  run("node -v");
  run("npm -v");

  // Deterministic install (CI-like)
  run("npm ci");

  // Quality gates
  run("npm run lint");
  run("npm run typecheck");
  run("npm test");

  console.log("\n✅ local CI passed");
} catch {
  console.error("\n❌ local CI failed");
  process.exit(1);
}

Then in package.json:

{
  "scripts": {
    "typecheck": "tsc -p tsconfig.json --noEmit",
    "ci:local": "node scripts/ci-local.mjs"
  }
}

Now my flow is simple.

  1. Make changes with Cursor.
  2. Ask Claude to review the diff.
  3. Run npm run ci:local.

If it fails, I fix first. No new prompts. No scope creep.

And yeah, npm ci is slower. I still do it. I want the same pain CI will feel.
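If you'd rather have the gate enforced than remembered, a plain git hook works. This is a sketch assuming no hook manager like husky is in play; save it as .git/hooks/pre-push and chmod +x it:

```shell
#!/bin/sh
# Run the same merge gate before every push; a non-zero exit aborts the push.
npm run ci:local || {
  echo "ci:local failed - push aborted" >&2
  exit 1
}
```

Config fragment only: it just reuses the ci:local script above, so there's one source of truth for what "passing" means.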

3) I make AI prove input/output behavior with tests

AI is great at writing code.

AI is also great at confidently changing behavior.

So I pin behavior with tiny tests. Always.

Even for “small” utility functions.

Here’s a real example pattern: normalize user input. Whitespace. Unicode. Case.

I’ve shipped bugs here. Twice.

src/lib/normalize.ts:

// Minimal, deterministic normalization.
// Keep it boring. Tests do the talking.
export function normalizeEmail(raw: string): string {
  return raw
    .trim()
    .toLowerCase()
    .normalize("NFKC");
}

src/lib/normalize.test.ts (Vitest):

import { describe, expect, it } from "vitest";
import { normalizeEmail } from "./normalize";

describe("normalizeEmail", () => {
  it("trims and lowercases", () => {
    expect(normalizeEmail("  Foo@Example.Com  ")).toBe("foo@example.com");
  });

  it("normalizes unicode", () => {
    // Full-width Latin chars -> normal Latin (NFKC)
    expect(normalizeEmail("Ｆｏｏ＠Ｅｘａｍｐｌｅ．Ｃｏｍ")).toBe("foo@example.com");
  });
});

Cursor + Claude writes these tests fast.

But I decide the cases.

That’s the trick. Don’t let AI pick what “correct” means.

If you don’t have tests set up, add Vitest:

Run npm i -D vitest, then in package.json:

"test": "vitest run"

Then you’re not guessing anymore.

4) I scan for secrets because AI loves copying them

This one’s embarrassing.

I once pasted an .env value into chat. Then I copied code back. Then I almost committed it.

No drama. Just real life.

Now I run a secret scan locally. Same command every time.

I use gitleaks because it’s dead simple.

Install:

  • macOS: brew install gitleaks
  • Other platforms: grab a binary from the gitleaks GitHub releases page

Then run:

# Scan the repo (tracked + untracked)
gitleaks detect --source . --no-git --redact

Want it automated? Add a script.

scripts/secret-scan.mjs:

#!/usr/bin/env node
import { execSync } from "node:child_process";

try {
  execSync("gitleaks version", { stdio: "inherit" });
} catch {
  console.error("gitleaks not found. Install it first: brew install gitleaks");
  process.exit(1);
}

try {
  execSync("gitleaks detect --source . --no-git --redact", { stdio: "inherit" });
  console.log("✅ secret scan passed");
} catch {
  console.error("❌ secret scan failed");
  process.exit(1);
}

Then chain it into local CI:

run("node scripts/secret-scan.mjs");

This catches the dumb stuff.

The stuff you only notice after pushing.

5) I keep a “dumb log” so Claude stops repeating mistakes

Cursor chat resets. My brain resets too.

So I keep a file: NOTES.md.

Not docs. Not marketing. Just landmines.

Example entries from my real projects:

  • “Next.js route handlers: don’t return 200 with empty body. It becomes a silent bug.”
  • “Supabase RLS: always test with anon key + logged-in user. Both.”
  • “Zod refine: return boolean, not string. I lost 40 minutes.”

Then I feed the relevant lines to Claude when it’s about to touch that area.
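In practice the hand-off is just a prefix on the review prompt. A sketch (the note lines are pulled from the list above):

```
Context - past mistakes in this repo. Check the diff against these first:
- Supabase RLS: always test with anon key + logged-in user. Both.
- Next.js route handlers: don't return 200 with empty body.

Review this diff. Find bugs. Find missing cases. Suggest tests.
```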

It sounds silly.

It saves hours.

And it makes AI feel consistent. Like a teammate that remembers.

Results

Before I used this checklist, I’d merge 6–10 PRs a week and usually spend 2–3 hours per week debugging avoidable regressions. Stuff like “works locally, breaks in prod” or “edge case returns 200 with wrong payload”.

After I started doing diff-based reviews + ci:local + 2–4 focused tests per change, that dropped to about 20–40 minutes a week. Not zero. Never zero. But way less chaos.

Also: I’ve caught 3 secret leaks locally with gitleaks that absolutely would’ve landed in git.

Key takeaways

  • Make Claude review diffs. It reviews code, not stories.
  • One command to rule your local checks: lint, types, tests.
  • Write tests to freeze behavior. Especially input normalization.
  • Run a secret scan because AI will paste whatever you paste.
  • Keep a tiny landmine log (NOTES.md). Feed it back later.

Closing

Cursor + Claude makes me faster. It also makes me overconfident.

The checklist fixes that. Mostly.

If you already do AI-assisted coding: what’s your merge gate command, and what’s the one check you refuse to skip?
