Ask an LLM to "write a PRD for a notifications feature" and you get back something that looks finished: goals, user stories, acceptance criteria, a metrics section, even a rollout plan. It reads well. It is also, almost always, wrong in ways that surface three sprints later — the success metric measures the wrong thing, the edge cases the model invented don't match your actual users, and a non-goal you never agreed to has quietly become scope.
The failure isn't the model. It's treating PRD-writing as a generation task when it's actually a thinking task. The document is a byproduct of decisions you make about scope, tradeoffs, and what you're deliberately not building. If you hand those decisions to the model, you get a confident draft of a product nobody decided to build. We tested the workflow below across a dozen real feature specs to find where AI helps and where it has to stay out.
Where AI actually helps, and where it doesn't
Split the PRD into two kinds of content. The first is judgment: the problem statement, the success metric, the non-goals, the priority calls between competing user needs. The second is expansion: turning a decided scope into well-structured user stories, drafting acceptance criteria from a feature description, listing edge cases you might have missed, tightening prose.
AI is good at expansion and unreliable at judgment. When you ask it for judgment, it doesn't refuse — it guesses, and the guess inherits whatever assumptions were buried in your prompt. Ask "what's the success metric for this feature" and you'll get a plausible-sounding number like "increase notification open rate by 15%" that nobody validated against your retention model. That number then anchors every downstream conversation.
The rule we landed on: you write the first paragraph of every judgment section yourself, in one or two sentences, before the model touches it. The model expands and pressure-tests; it does not originate. A success metric you typed is a decision. A success metric the model typed is a suggestion you forgot to evaluate.
The most expensive PRD error is a fabricated success metric that survives review because it sounds reasonable. Reviewers skim metrics sections. A round number like "reduce churn by 10%" reads as researched even when it was generated from nothing. Type your own numbers, or write "TBD — needs data" and leave it visible.
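One way to make unvalidated numbers harder to skim past is a small lint over the metrics section. The sketch below is a minimal version under an assumed convention that is ours, not a standard: a validated figure carries a "(source: ...)" note on the same line, and anything else is either marked TBD or flagged. The function name and the sample lines are invented for illustration.

```python
import re

# Assumed convention (hypothetical): a validated number has "(source: ...)"
# on the same line; an unvalidated one is marked TBD.
PERCENT = re.compile(r"\d+(?:\.\d+)?\s?%")
SOURCE = re.compile(r"\(source:\s*[^)]+\)", re.IGNORECASE)
TBD = re.compile(r"\bTBD\b")


def unsourced_metrics(metrics_section: str) -> list[str]:
    """Return lines that state a percentage but have no source note and no TBD marker."""
    flagged = []
    for line in metrics_section.splitlines():
        text = line.strip()
        if not text:
            continue
        has_number = PERCENT.search(text) is not None
        is_backed = SOURCE.search(text) is not None
        is_open = TBD.search(text) is not None
        if has_number and not is_backed and not is_open:
            flagged.append(text)
    return flagged


if __name__ == "__main__":
    # Invented sample lines, not real data.
    section = "\n".join([
        "Reduce churn by 10%",
        "Increase notification open rate by 15% (source: retention dashboard)",
        "Weekly active senders: TBD, needs data",
    ])
    for line in unsourced_metrics(section):
        print("UNSOURCED:", line)
```

A check like this cannot tell whether a cited source is any good. It only forces every number to either name where it came from or admit that it is still open.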
A four-pass workflow
Instead of one prompt that produces the whole document, run four passes. Each pass has a narrow job, and you review between them so errors don't compound.
Pass 1 — Skeleton from your notes, not the model's imagination. Paste your raw thinking: the problem, who it's for, the rough scope, anything you've already ruled out. Ask the model to organize this into PRD section headers with your content slotted under each, and to flag every section where you gave it nothing. Those flags are your to-do list. Do not let it fill the gaps yet.
Pass 2 — Expansion, section by section. Take one decided section at a time and prompt for that section alone: "Here's the scope I've committed to. Draft 4–6 user stories in the format 'As a [role], I want [capability], so that [outcome].'" Working one section at a time keeps the model anchored to what you actually said instead of inventing a coherent-but-fictional whole.
Pass 3 — Adversarial review. Switch the model from author to critic. Prompt it explicitly: "You are a skeptical engineering lead. List every assumption in this PRD that isn't backed by stated evidence. For each, say what would falsify it." This is where AI earns its place — it's tireless at finding unstated assumptions, and it has no ego about the draft because it isn't defending its own reasoning.
Pass 4 — Consistency sweep. Ask it to check that the success metrics map to the goals, that every user story has acceptance criteria, and that nothing in the body contradicts the non-goals. Mechanical, boring, and exactly what a model does well.
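Part of the Pass 4 sweep is mechanical enough to run without a model at all. The following is a rough sketch, assuming you keep user stories in a structured form; the `UserStory` class and `sweep_stories` function are hypothetical names. It covers only the story-format and acceptance-criteria checks. Whether metrics map to goals, or whether the body contradicts the non-goals, still needs a reader.

```python
import re
from dataclasses import dataclass, field

# Matches the story format from Pass 2: "As a [role], I want [capability], so that [outcome]".
STORY_FORMAT = re.compile(r"^As an? .+?, I want .+?, so that .+", re.IGNORECASE)


@dataclass
class UserStory:
    text: str
    acceptance_criteria: list[str] = field(default_factory=list)


def sweep_stories(stories: list[UserStory]) -> list[str]:
    """Return one message per problem found; an empty list means the sweep passed."""
    problems = []
    for number, story in enumerate(stories, start=1):
        if not STORY_FORMAT.match(story.text.strip()):
            problems.append(f"story {number}: not in 'As a..., I want..., so that...' format")
        # Blank strings do not count as acceptance criteria.
        if not any(criterion.strip() for criterion in story.acceptance_criteria):
            problems.append(f"story {number}: no acceptance criteria")
    return problems


if __name__ == "__main__":
    stories = [
        UserStory(
            "As a user, I want to mute a thread, so that I am not interrupted.",
            ["A muted thread sends no push notifications"],
        ),
        UserStory("Users can snooze notifications"),
    ]
    for problem in sweep_stories(stories):
        print(problem)
```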
Keep the adversarial-review prompt as a saved snippet. Run it on every PRD before you share it, including ones you wrote entirely by hand. The assumptions it surfaces in your own unaided writing are often the ones you were most blind to.
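Pulled together, the four passes and the review gate between them might look like the sketch below. Everything here is an assumption for illustration: `ask_model` stands in for whatever client you use to call a model, `approve` stands in for the human review step, and only the Pass 3 prompt is quoted from above (the other three are paraphrases). Pass 2 is shown as a single call for brevity; in practice you would loop over decided sections one at a time.

```python
from dataclasses import dataclass
from typing import Callable

# The saved snippet: reusable on any PRD, including hand-written ones.
ADVERSARIAL_REVIEW_PROMPT = (
    "You are a skeptical engineering lead. List every assumption in this PRD "
    "that isn't backed by stated evidence. For each, say what would falsify it."
)


@dataclass(frozen=True)
class Pass:
    name: str
    instruction: str
    rewrites_document: bool  # author passes replace the draft; critic passes only report


PASSES = [
    Pass("skeleton",
         "Organize these notes under PRD section headers. Flag every section "
         "where I gave you nothing. Do not fill the gaps.", True),
    Pass("expansion",
         "Here's the scope I've committed to. Draft 4-6 user stories in the format "
         "As a [role], I want [capability], so that [outcome].", True),
    Pass("adversarial review", ADVERSARIAL_REVIEW_PROMPT, False),
    Pass("consistency sweep",
         "Check that the success metrics map to the goals, that every user story "
         "has acceptance criteria, and that nothing contradicts the non-goals.", False),
]


def run_passes(
    notes: str,
    ask_model: Callable[[str], str],
    approve: Callable[[str, str], bool],
) -> list[tuple[str, str]]:
    """Run the passes in order and stop at the first one the reviewer rejects.

    Returns (pass name, model output) for each approved pass.
    """
    document = notes
    approved = []
    for step in PASSES:
        output = ask_model(f"{step.instruction}\n\n{document}")
        if not approve(step.name, output):
            break  # fix the problem by hand before anything builds on it
        approved.append((step.name, output))
        if step.rewrites_document:
            document = output
    return approved
```

The design choice that matters is the `break`: a rejected pass stops the run, so later passes never see an unreviewed draft.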
Keeping the plot: the non-goals section
The section that prevents the most drift is the one models are worst at: non-goals. An LLM optimizes for a complete, helpful-looking document, so it tends to expand scope — adding "nice to have" capabilities, downstream integrations, and v2 ideas that bleed into v1. Left unchecked, the PRD describes an ambitious product instead of the shippable slice you scoped.
Write your non-goals by hand and put them near the top, not buried at the bottom. Then, in your Pass 4 consistency sweep, explicitly ask: "Does anything in this document describe behavior that contradicts the non-goals?" Models are good at catching the contradiction once the constraint is written down — they're just bad at generating the constraint unprompted.
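To keep the non-goals hand-written, the sweep prompt can be assembled by a helper that refuses to run until you have supplied them. This is a sketch only: `build_non_goal_sweep_prompt` is a hypothetical name, and the follow-up instruction asking the model to quote offending passages is our addition to the question above.

```python
def build_non_goal_sweep_prompt(non_goals: list[str], body: str) -> str:
    """Build the Pass 4 non-goals question, with the hand-written non-goals placed first.

    Raises ValueError if no non-goals were supplied, so the model is never
    asked to check against constraints that nobody wrote down.
    """
    cleaned = [goal.strip() for goal in non_goals if goal.strip()]
    if not cleaned:
        raise ValueError("Write the non-goals by hand before running the sweep.")
    bullets = "\n".join(f"- {goal}" for goal in cleaned)
    return (
        "Non-goals (written by the PRD owner; treat them as fixed constraints):\n"
        f"{bullets}\n\n"
        "Does anything in this document describe behavior that contradicts the non-goals? "
        "Quote each offending passage and name the non-goal it contradicts.\n\n"
        f"{body}"
    )


if __name__ == "__main__":
    # Invented example content.
    print(build_non_goal_sweep_prompt(
        ["No email digests in v1", "No per-channel scheduling"],
        "The system sends a weekly email digest of missed notifications.",
    ))
```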
The same discipline applies to the problem statement. If you can't state the problem in two sentences without the model's help, you don't understand it well enough to spec it yet. Generating a polished problem statement from a vague prompt produces a document that sounds like it solves a clear problem while hiding that the problem was never defined. That's the precise mechanism by which teams lose the plot: the artifact looks decided, so nobody re-opens the decision.
A few habits make the whole loop hold together. Version the document and keep the model's suggested edits in suggestion mode, not applied directly, so a human approves every judgment-adjacent change. Keep a visible "TBD" marker for anything unvalidated rather than letting the model paper over it. And review between passes — the entire point of splitting generation into four steps is that you catch a wrong assumption in Pass 1 before it propagates into thirty user stories in Pass 2.
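If the document is versioned, one cheap guard against the model papering over a TBD is to compare two versions and list every TBD line that disappeared. The sketch below assumes plain-text versions and a literal "TBD" marker; `dropped_tbds` is a hypothetical helper, and the sample text is invented.

```python
import re

TBD_MARKER = re.compile(r"\bTBD\b")


def dropped_tbds(before: str, after: str) -> list[str]:
    """Return TBD lines from the earlier version that no longer appear in the later one."""
    after_lines = {line.strip() for line in after.splitlines()}
    return [
        line.strip()
        for line in before.splitlines()
        if TBD_MARKER.search(line) and line.strip() not in after_lines
    ]


if __name__ == "__main__":
    before = "Success metric: TBD, needs data\nRollout date: TBD\nScope: push only"
    after = "Success metric: increase open rate by 15%\nRollout date: TBD\nScope: push only"
    for line in dropped_tbds(before, after):
        print("TBD removed, confirm a human resolved it:", line)
```

This produces a review list, not an error list. A TBD that a person resolved with real data will show up too, and the point is simply that someone confirms which kind it was.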
Used this way, AI cuts the mechanical time of PRD writing substantially — the user-story expansion and consistency checks are genuinely faster — while leaving the decisions where they belong. The document stays yours. The model just types faster than you do.
Originally published at pickuma.com. Subscribe to the RSS or follow @pickuma.bsky.social for new reviews.