- I don’t ask Claude to “build features”. I ask for diffs.
- I keep a local PR checklist in the repo. Enforced.
- I auto-generate a patch, run tests, then iterate on failures.
- You can copy my scripts: patch → typecheck → lint.
## Context
I build small SaaS projects. Usually solo. Usually fast.
Vibe coding works. Until it doesn’t.
My failure mode was consistent: I’d paste a prompt, get 200 lines back, accept it, and then spend 3 hours chasing a dumb regression. Brutal.
So I changed the unit of work.
Not “a feature”. Not “a refactor”.
A diff.
If I can’t review it like a PR, it doesn’t ship. Even when I’m the only person on the repo.
Cursor helps because I can keep the conversation glued to the exact files. Claude helps because it can generate precise patches. But the workflow is the point.
## 1) I start with a PR checklist. In the repo.
If the checklist lives in my head, it doesn’t exist.
I keep mine in `docs/pr-checklist.md`. Then I reference it in every Cursor chat.
It’s boring. That’s why it works.
```markdown
# PR checklist (local)

## Before code
- [ ] What’s the smallest diff that solves it?
- [ ] What’s the rollback plan? (`git revert` is fine)
- [ ] What’s the failure mode? (timeout, null, race, etc.)

## While coding
- [ ] No new deps unless unavoidable
- [ ] No "magic" env vars without docs
- [ ] Errors include context (ids, inputs)

## Before merge
- [ ] `npm run typecheck` passes
- [ ] `npm run lint` passes
- [ ] Added/updated tests (or wrote down why not)
- [ ] Ran the code path locally (screenshots optional, reality required)
```
I learned the hard way that “AI will remember constraints” is fake.
It won’t.
I won’t either, when I’m tired.
So I paste this into the prompt:
```text
Follow docs/pr-checklist.md. Output a unified diff only.
```
No checklist. No code.
## 2) I force “diff-only” output. No essays.
Claude loves explaining. I get it.
But explanations don’t merge.
When I say “diff-only”, I mean it.
If it outputs anything else, I reject it and restate the constraint.
Also: I always request unified diff format.
Cursor applies it cleanly, and I can review it fast.
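For reference, the shape I expect back is a standard unified diff. The file and change here are hypothetical, just to show the format:

```diff
--- a/lib/slug.ts
+++ b/lib/slug.ts
@@ -1,3 +1,3 @@
 export function slugify(input: string): string {
-  return input.toLowerCase().replace(" ", "-");
+  return input.toLowerCase().replace(/\s+/g, "-");
 }
```

Small enough to read in one glance. That’s the bar.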
Here’s the exact prompt template I keep in a scratch file.
Not fancy. Just strict.
````text
# prompts/diff-only.txt

You are editing a git repo.

Rules:
- Output ONLY a unified diff (inside a ```diff block).
- No prose. No bullets. No explanation.
- Touch the fewest files possible.
- Follow docs/pr-checklist.md.

Task:
Context:
````
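Filled in, the bottom two lines end up looking something like this (the task and file here are hypothetical):

```text
Task: Return 400 instead of crashing when POST /api/widgets receives malformed JSON.
Context: app/api/widgets/route.ts
```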
Why this matters: it keeps me in review mode.
Not “wow it wrote code mode”.
And yeah, I still mess up.
I once accepted a nice-looking refactor that swapped zod parsing order and silently changed behavior. I spent 4 hours on it, and most of that time I was chasing the wrong cause.
Diff-first would’ve made it obvious.
## 3) I apply patches with a script. Then I run the gauntlet.
I don’t want manual steps.
Manual steps become “I’ll do it later”.
So I wrote a tiny patch runner.
It applies a diff file, then runs typecheck + lint + tests.
If anything fails, I feed the error back into Cursor + Claude.
This is Mac/Linux-friendly.
If you’re on Windows, WSL makes it normal.
```bash
#!/usr/bin/env bash
# scripts/apply_patch_and_check.sh
set -euo pipefail

PATCH_FILE="${1:-}"
if [[ -z "$PATCH_FILE" ]]; then
  echo "Usage: scripts/apply_patch_and_check.sh path/to/patch.diff" >&2
  exit 1
fi

# 1) Apply patch
# --whitespace=fix avoids dumb failures on trailing spaces
# --reject applies what it can and writes conflicting hunks to .rej files
#   instead of aborting the whole patch
git apply --whitespace=fix --reject "$PATCH_FILE"

# 2) Run the gauntlet
npm run typecheck
npm run lint
npm test

echo "OK: patch applied + checks passed"
```
My repo scripts are boring too:
```json
{
  "scripts": {
    "typecheck": "tsc --noEmit",
    "lint": "next lint",
    "test": "vitest run"
  }
}
```
When something fails, I don’t “fix it manually” immediately.
I paste the exact error into Cursor.
Then ask for a new diff that fixes only that.
Example error I hit last week:
```text
TypeError: Cannot read properties of undefined (reading 'id')
    at app/api/widgets/route.ts:41:23
```
That’s not a “write more code” problem.
That’s a “handle null input” problem.
Diff-only keeps that scoped.
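Pasting the exact error, unchanged, is the step worth scripting. Here’s a sketch of a tiny wrapper that runs a check and, on failure, prints a paste-ready prompt. The helper name and prompt wording are mine, not part of the workflow above:

```bash
#!/usr/bin/env bash
# Hypothetical helper: run a check; on failure, emit a paste-ready prompt
# that contains the exact error output and nothing else.
set -uo pipefail

error_to_prompt() {
  local label="$1"; shift
  local out
  if out=$("$@" 2>&1); then
    echo "OK: $label"
  else
    printf 'The command `%s` failed with:\n\n%s\n\nOutput a unified diff that fixes only this error.\n' "$label" "$out"
    return 1
  fi
}

# demo: simulate a failing typecheck so you can see the prompt it produces
demo_output=$(error_to_prompt "typecheck" sh -c 'echo "error TS2532: Object is possibly undefined." >&2; exit 1') || true
echo "$demo_output"
```

Real usage would be something like `error_to_prompt "typecheck" npm run typecheck`, with the output piped straight into the chat.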
## 4) I add guardrails in code, not in prompts
Prompts decay.
Code stays.
The simplest guardrail: validate inputs at boundaries.
In Next.js route handlers, I parse JSON safely and validate with Zod.
This prevents the classic AI bug: assuming fields exist.
```typescript
// app/api/widgets/route.ts
import { NextResponse } from "next/server";
import { z } from "zod";

const WidgetCreate = z.object({
  name: z.string().min(1),
  size: z.number().int().min(1).max(1000)
});

export async function POST(req: Request) {
  // Don't trust req.json() blindly. It throws.
  let body: unknown;
  try {
    body = await req.json();
  } catch {
    return NextResponse.json({ error: "Invalid JSON" }, { status: 400 });
  }

  const parsed = WidgetCreate.safeParse(body);
  if (!parsed.success) {
    return NextResponse.json(
      { error: "Invalid payload", issues: parsed.error.issues },
      { status: 400 }
    );
  }

  const { name, size } = parsed.data;
  // pretend persistence here
  return NextResponse.json({ id: crypto.randomUUID(), name, size }, { status: 201 });
}
```
Cursor + Claude can generate this.
But only if I demand it.
One thing that bit me: Claude sometimes “fixes” errors by widening types to `any`.
That’s not a fix.
That’s a mute button.
So I add one more constraint to my diff prompt:
```text
Don’t add `any`. Don’t disable lint rules.
```
And if it still does it?
I reject the diff.
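Better yet, make the constraint mechanical instead of conversational. Assuming a typescript-eslint setup (which `next lint` doesn’t include by default), a config fragment like this makes the gauntlet fail whenever a diff sneaks in an `any`:

```json
{
  "rules": {
    "@typescript-eslint/no-explicit-any": "error"
  }
}
```

Now “don’t add `any`” isn’t a prompt rule Claude can forget. It’s a red build.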
## 5) I keep the loop tight: commit small, revert fast
Long AI sessions feel productive.
They’re also where bugs hide.
My rule: every accepted diff becomes a commit.
Even if it’s tiny.
Then if the next diff breaks something, I revert cleanly.
No archaeology.
Here’s the exact git flow I use when I’m iterating with Cursor:
```bash
# 1) Create a branch
git checkout -b fix/input-guards

# 2) After a patch passes checks
git add -A
git commit -m "Add Zod validation to widgets POST"

# 3) If the next patch goes sideways
git revert HEAD

# 4) Or if I want to trash everything on the branch
git reset --hard origin/main
```
Yeah, `reset --hard` has teeth.
That’s the point.
I used to avoid committing “ugly” intermediate steps.
Bad instinct.
In solo repos, commits are my safety rope.
## Results
I shipped fewer “looks fine” bugs.
Not zero. But fewer.
Over the last 10 working days, I counted 27 AI-generated diffs that made it to main.
21 passed checks on the first apply.
6 failed typecheck or tests and needed a second diff.
Before this workflow, I’d routinely accept a big paste, then do 10+ manual edits to make it compile. That’s where I’d accidentally change behavior.
Now the failures are loud, early, and localized to one diff.
## Key takeaways
- Treat AI output like a PR. Review a diff, not a story.
- Put your checklist in the repo, not your brain.
- Automate the “apply patch → run checks” loop. Remove willpower.
- Add guardrails in code (validation at boundaries). Prompts rot.
- Commit small so you can revert fast.
## Closing
If you’re using Cursor + Claude too, try this for one week.
Diff-only prompts. Patch script. Small commits.
What’s the one check you always run before accepting an AI diff: typecheck, lint, unit tests, or a manual click-through of the UI?