DEV Community

Mykola Kondratiuk


I Stopped Writing PRDs. I Started Writing Prompts. Here's What Broke (and What Worked)

About six months ago I had a weird realization: the prompts I was writing for AI agents looked a lot like the PRDs I used to write for engineers. Not identical - but structurally, the job was the same. You're trying to encode intent in text so someone (or something) else can act on it without you in the room.

I started treating them the same way. It broke things. Then I fixed them. This is that story.


The experiment

I manage a team and run about a dozen AI agents in parallel - content, data analysis, competitive monitoring, etc. At some point I noticed I was spending 30-40 minutes per agent writing detailed setup docs. Background context, edge cases, behavior guidelines, failure modes.

It looked exactly like a PRD.

So I tried the obvious thing: started writing my PRDs the same way I write agent prompts. Ruthlessly specific. Present tense. Active voice. Outcomes stated, not features. No "we will investigate" - actual decisions.

The engineers actually liked it. Reviews got faster. Fewer "what does this mean?" Slack messages. One dev said it was the clearest spec he'd gotten in years.

But then I made the reverse mistake.


What broke when I went the other way

I started writing agent prompts like PRDs. Background sections. Stakeholder notes. "As a system that helps users..." framing.

The agents got worse.

Not dramatically worse - subtly worse. More hedging. More "I should clarify that..." type responses. The agents started behaving like junior PMs trying to look thorough rather than senior engineers shipping things.

Took me a few weeks to figure out what happened. PRDs are written about a system. Prompts are instructions to a system. Same words, completely different relationship.

A PRD says "the system should handle edge case X by doing Y." A prompt that works says "When X happens, do Y. If you're unsure, ask."

The framing changes how the model interprets authority. "Should" in a PRD is a requirement. "Should" in a prompt is a hedge.


The framework that clicked

I ended up with something I call spec layering. Three levels:

Level 1: Identity and authority
Who is this agent? What decisions can it make without asking? What's explicitly off-limits?

This is the hardest one to write. Most people skip it or make it vague. "Be helpful and accurate" isn't identity - it's noise. "Make engagement decisions autonomously. Escalate anything that touches pricing or partnerships" - that's authority.

Level 2: Operating procedures
The actual behavior spec. Step by step. Concrete. Written like you're explaining to a very competent new hire who doesn't yet know your specific context.

I borrowed from how we write runbooks for on-call engineers. If production breaks at 3am, the runbook can't assume the on-call has full context. It has to be self-contained.

Agent prompts are runbooks.

Level 3: Failure modes and recovery
What does the agent do when something unexpected happens? Default behavior in ambiguous situations?

This is where most agent prompts fall apart. You spec the happy path and leave edge cases to "use judgment." That's not a spec - that's hope.
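To make the three layers concrete, here's a sketch of what they might look like stacked in a single system prompt. The agent, its rules, and the escalation criteria are all invented for illustration - the point is the shape, not the content:

```python
# Hypothetical three-layer agent prompt. Every name and rule here is an
# invented example, not a real production prompt.
AGENT_PROMPT = """\
# Level 1: Identity and authority
You are the competitive-monitoring agent. You decide autonomously which
competitor updates are worth summarizing. Anything touching pricing or
partnerships: escalate, do not act.

# Level 2: Operating procedures
Success condition: a daily digest with at most 5 items, each with a source link.
1. Collect updates published in the last 24 hours.
2. Discard anything already covered in a previous digest.
3. Rank the rest by relevance to our product area.
4. Write a 2-sentence summary per item.

# Level 3: Failure modes and recovery
If a source is unreachable, note it in the digest and continue.
If you cannot tell whether an item touches pricing, treat it as if it does.
"""

def layer_count(prompt: str) -> int:
    """Count how many of the three spec layers a prompt declares."""
    return sum(f"Level {n}:" in prompt for n in (1, 2, 3))

print(layer_count(AGENT_PROMPT))  # → 3
```

Notice that Level 3 doesn't say "use judgment" - it gives a default ("treat it as if it does") for the ambiguous case.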


Where this maps back to PM work

The thing I keep coming back to: writing a good prompt is the same skill as writing a good acceptance criterion.

Both are trying to pre-answer the question "is this done?" Both require you to be specific enough that two different people (or models) would give the same answer. Both fail the same way - too abstract, too much left to interpretation.

The difference is that with engineers you can have a conversation when the spec is unclear. With an agent you find out at runtime, usually at the worst possible moment.

So the bar is higher. You have to write the prompt as if you won't be available to answer questions. Because you won't.


What I actually changed

Practically, I now write agent prompts the way I write incident response playbooks:

  • Use numbered steps, not prose
  • State the success condition explicitly before the steps
  • Include a "when this breaks" section
  • Avoid passive voice and hedging language entirely
  • Never use "should" - use "do" or "don't"
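The last two rules are mechanical enough to lint for. Here's a rough sketch of a hedge-word checker I could imagine running over a prompt before deploying it - the word list is just an illustrative starting point, not a complete taxonomy:

```python
import re

# Hedge words that weaken an agent prompt. Illustrative, not exhaustive.
HEDGES = {"should", "could", "might", "maybe", "possibly", "consider", "try to"}

def find_hedges(prompt: str) -> list[str]:
    """Return each hedge word/phrase present in the prompt, sorted."""
    lowered = prompt.lower()
    return [h for h in sorted(HEDGES)
            if re.search(rf"\b{re.escape(h)}\b", lowered)]

bad = "You should try to summarize updates and maybe flag pricing changes."
good = "Summarize updates. Flag pricing changes."

print(find_hedges(bad))   # → ['maybe', 'should', 'try to']
print(find_hedges(good))  # → []
```

A regex check obviously can't tell a hedge from a legitimate use of "should," but as a pre-deploy smoke test it catches the drift back into PRD voice.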

For product specs I pulled the other direction - more concrete outcomes, less explanation of reasoning, explicit escalation criteria.

Both got better.


The uncomfortable implication

If writing a good prompt requires the same clarity as writing a good spec, then most product specs are not actually that good. The ambiguity we tolerate in PRDs gets papered over by smart engineers asking smart questions.

Remove the smart human intermediary and the ambiguity becomes a bug.

I think that's what a lot of the vibe coding discourse is actually about, underneath the hype. Not whether AI can write code - it clearly can. Whether the person directing it can write specs. Historically that's been optional. Increasingly it's not.


If you've tried treating prompts like specs (or vice versa) I'm curious what you found. The failure modes in particular - would love to compare notes.
