Most of the time, when I see someone complain that AI coding assistants "don't actually save time," the assistant isn't the problem. The prompt is.
I've been keeping a running list of the prompt failure modes that turn a 30-second Cursor / Claude Code / Copilot Chat interaction into a 20-minute back-and-forth. Eight have come up often enough to be worth writing down. None of them are about model choice. All of them are fixable in the prompt itself.
If you recognise three or four of your own habits in here, you'll get more time back from fixing them than you would from upgrading to a bigger model.
1. "Please debug this" with no failure mode
The canonical bad prompt. You paste 200 lines of code, type "please debug this," and the assistant hallucinates a problem because you didn't tell it which problem to look for.
A function with no obvious bug has infinite possible bugs. The model picks one that sounds plausible and you spend the next ten minutes arguing about a non-issue.
Fix: name the failure mode before pasting code.
This function returns the wrong total when the input list contains duplicates.
Expected: sum of unique values. Actual: counts duplicates twice.
Walk through the logic step by step and tell me which line introduces the bug.
Four extra sentences. Saves you the entire conversation.
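For concreteness, here's the kind of function that prompt is written for. The names and the exact bug are hypothetical, invented for illustration; the point is that the prompt already tells the model what "wrong" means before it reads a single line.

```typescript
// Hypothetical example of the function described in the prompt above.
// Intended behaviour: sum each distinct value once.
// Actual behaviour: duplicates are counted twice, because the running total
// is updated before the "have we seen this value?" check.
function sumUniqueValues(values: number[]): number {
  const seen = new Set<number>();
  let total = 0;
  for (const value of values) {
    total += value; // bug: should only run when the value is new
    if (!seen.has(value)) {
      seen.add(value);
    }
  }
  return total;
}

// sumUniqueValues([2, 2, 3]) returns 7; the prompt's "Expected" result is 5.
```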
2. Asking for "a code review" without a severity bar
If you say "review this code," you'll get a wall of stylistic nitpicks: "consider extracting this into a helper," "this variable could be more descriptive," "you might want a comment here."
None of it is wrong. None of it is what you wanted. You wanted to know whether you can ship.
Fix: force severity buckets and ban the rest.
Review this PR for issues at three severity levels:
Critical — will break in production (security, data loss, race conditions)
Major — will break under load or edge cases (performance, error handling)
Minor — worth fixing but won't block merge
Ignore style, naming, and "consider" suggestions. If you find nothing critical or major, say so explicitly.
The "say so explicitly" line is the magic. Without it, models will manufacture issues to justify the response.
3. Letting the model pick the architecture for you
You ask: "how should I structure auth in this app?"
The model picks one of five reasonable architectures, presents it as if it's the obvious answer, and you implement it. Three weeks later you realise it's wrong for your constraints because you never told the model your constraints.
Fix: make it compare, not recommend.
Compare three architectures for authentication in a Next.js app with Postgres:
1. JWT with refresh tokens, stored client-side
2. Server-side sessions with HTTP-only cookies
3. Auth provider (Clerk/Auth0)
For each, tell me:
- When it's the right call
- The failure mode I'll hit at scale
- Real cost (eng hours + $/month at 10k users)
Don't recommend one. I'll pick after I see the comparison.
Forcing comparison short-circuits the model's bias toward the most popular pattern.
4. Generating tests without specifying coverage
"Write tests for this function" produces three happy-path tests and zero edge cases. Then you ship, and a user discovers what happens when the input is an empty string.
Fix: name the categories.
Write unit tests for this function. Cover:
- Happy path (3 tests with realistic inputs)
- Edge cases (empty input, null, very large input, unicode)
- Error handling (each thrown error type, with the exact error message asserted)
- Boundary conditions (off-by-one on any numeric ranges)
Use Vitest. Each test name should describe the scenario, not the function.
The "describe the scenario, not the function" line stops you from getting eight tests all called test_calculateTotal_works.
5. Refactoring without an invariant
Asking the model to "clean up this code" is asking it to silently change behaviour. It will. You won't notice until production.
Fix: lock the contract.
Refactor this function for readability. Constraints:
- Public signature MUST NOT change
- Return values MUST be identical for all current inputs
- No new dependencies
- Show me the diff, not the whole file
- List any behavioural changes you made (there should be zero)
The last line is the guard. If the model lists any behavioural changes, you know to revert. If it lists none and you spot one in review, you know it lied — don't trust the rest of that session.
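If you want a mechanical backstop on top of that prompt, a quick characterization test pins the current behaviour before the model touches anything. This is not part of the prompt above, just a local safety net, and the function and sample inputs are hypothetical:

```typescript
import { describe, expect, it } from "vitest";
// Hypothetical function about to be refactored; swap in your own module and inputs.
import { formatInvoice } from "./formatInvoice";

describe("formatInvoice (pre-refactor contract)", () => {
  // Snapshot what the current implementation returns for representative inputs,
  // then re-run this after applying the model's diff. Any failure is a
  // behavioural change the model either missed or didn't admit to.
  const samples = [
    { items: [], taxRate: 0 },
    { items: [{ name: "Widget", price: 9.99, qty: 3 }], taxRate: 0.2 },
  ];

  it("produces the same output as before the refactor", () => {
    for (const sample of samples) {
      expect(formatInvoice(sample)).toMatchSnapshot();
    }
  });
});
```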
6. Asking for a CI/CD pipeline without naming the platform
"Write me a GitHub Actions workflow for deploying this app" — if you don't say what "this app" deploys to, you'll get a generic Node.js + AWS template that doesn't fit your stack and references services you don't use.
Fix: front-load the deployment target and constraints.
Write a GitHub Actions workflow for this app:
Stack: Next.js 15, deployed to Vercel
Database: Postgres on Neon, migrations via Drizzle
Tests: Vitest (unit) + Playwright (e2e)
Triggers: PR (lint + test only), main (full deploy)
Secrets available: VERCEL_TOKEN, DATABASE_URL, PLAYWRIGHT_TEST_BASE_URL
Caching: pnpm store + Playwright browsers
Do NOT include Docker, AWS, or any service not listed above.
The explicit "do NOT include" list is the most underrated technique in the whole article. Models default to including everything they've seen in similar configs. You have to tell them what to leave out.
7. Documentation generation without a target reader
"Write a README" produces a README aimed at no one in particular: half setup instructions, half marketing copy, no real architecture.
Fix: specify the reader and what they need to leave with.
Write a README for this repo. Reader: a senior engineer evaluating whether to use this library in production, who has 5 minutes.
They need to leave knowing:
1. What problem this solves (one paragraph)
2. What it does NOT do (bulleted list of out-of-scope cases)
3. Install + minimal working example (under 10 lines)
4. Production considerations (perf, error handling, observability)
5. Where to look for more (link map, not full docs)
No emoji. No badges. No "Why I built this."
8. Incident response prompts that ask for explanation instead of action
Production is on fire and you ask the model: "why is the API returning 502s?" You'll get a thoughtful three-paragraph essay on possible causes. You needed a checklist.
Fix: prompt for a diagnostic runbook, not an explanation.
Production symptom: API returns 502s intermittently, ~5% of requests, started 20 minutes ago.
Stack: Node.js + Express behind nginx, on EC2, Postgres RDS.
Give me a diagnostic runbook:
Step 1: What to check first (with the exact command/dashboard)
Step 2-N: Branching based on what step 1 returned
For each step: "if you see X, the cause is likely Y, fix is Z"
Do not explain causes I haven't asked about. Triage first, theorise later.
You'll get back something resembling an SRE runbook. That's the artifact you actually wanted.
The pattern under the patterns
Seven of the eight have the same root: the model isn't being told what "good" looks like, so it picks for you.
The fix isn't "prompt engineering" in the YouTube-thumbnail sense. It's just specifying:
- The failure mode you care about (debug, refactor)
- The bar for inclusion (severity, category, scope)
- What to leave out (the "do NOT" list)
- What artifact you want at the end (diff, runbook, comparison, README for X reader)
Do that in the first message and your second message becomes "thanks, applying it now" instead of "no, not like that, try again."
Going further
If you want a pre-built library of these structured prompts — organised by workflow (debug / review / architect / test / document / refactor / deploy / incident response), with the bracket-template format already filled in for each — the AI Coding Assistant Prompt Pack is what I use as my own starting point. 120+ prompts, $29 lifetime, works with Claude / ChatGPT / Copilot / Cursor.
But you don't need it. Take the eight patterns above, write your own versions for the workflows you hit weekly, and keep them in a prompts/ folder in your dotfiles. The pack is the shortcut, not the requirement. The skill is naming the failure mode out loud before you paste the code in.