If you use a coding assistant long enough, you’ll notice a pattern: the first output is rarely the one you ship.
Maybe it’s:
- technically correct but stylistically wrong
- missing an edge case you care about
- too risky (touches too many files)
- written in a way your team won’t maintain
Most people fix this by doing the same dance over and over: paste the output back in, complain in a sentence or two, and hope the next attempt is better.
There’s a faster way: treat “bad outputs” like incidents. Run a tiny postmortem, extract a reusable rule, and feed it back into your workflow.
I call this the Postmortem Prompt.
## The idea
A postmortem isn’t about blame. It’s about turning a failure into:
- a clear diagnosis (what went wrong?)
- a preventative control (how do we stop it next time?)
- a repeatable checklist (what should we verify before calling it “done”?)
When you do that for assistant outputs, you stop “arguing with the model” and start evolving your prompt system.
## When to use it
Use the Postmortem Prompt whenever you catch yourself thinking:
- “This is not what I meant… again.”
- “It keeps rewriting the whole file.”
- “It ignored our conventions.”
- “It fixed the symptom, not the cause.”
If the failure is one-off, you can just correct it. If it’s repeatable, postmortem it.
## The Postmortem Prompt (template)

Copy/paste this as a reusable snippet (I keep it in a `prompts/` folder):
```
You are my workflow engineer.
We attempted a task and the output was not acceptable.

TASK (what I wanted):
<describe the intended outcome>

OUTPUT (what I got):
<paste the problematic output or summarize>

CONTEXT (constraints you should have followed):
- languages/frameworks:
- style/conventions:
- risk tolerance (small diffs vs rewrite):
- non-goals:

POSTMORTEM:
1) Identify the top 3 failure modes (be specific, no hand-waving).
2) For each failure mode, propose:
   a) a prompt rule (one sentence)
   b) a verification step (how we'll check it)
3) Produce an improved prompt for the same task that bakes in those rules.
4) Produce a short "exit criteria" checklist (5-8 bullets) that I can run
   before accepting future outputs.

Format:
- Use headings.
- Keep rules imperative and testable.
```
The important part is: rules + verification + next prompt + exit criteria. That turns feedback into an asset.
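Because the template is plain text with a few slots, filling it in can be scripted rather than retyped. A minimal sketch, assuming the template lives in your `prompts/` folder; the placeholder names (`$task`, `$output`, `$context`) and the trimmed template body are illustrative, not part of the article's exact wording:

```python
from string import Template

# A trimmed-down version of the postmortem template; in practice
# you would read the full text from prompts/postmortem.md.
POSTMORTEM = Template("""\
You are my workflow engineer.
We attempted a task and the output was not acceptable.

TASK (what I wanted):
$task

OUTPUT (what I got):
$output

CONTEXT (constraints you should have followed):
$context

POSTMORTEM:
1) Identify the top 3 failure modes.
2) For each, propose a prompt rule and a verification step.
3) Produce an improved prompt that bakes in those rules.
4) Produce a short exit-criteria checklist.
""")

def build_postmortem(task: str, output: str, context: str) -> str:
    """Fill the template so it can be pasted straight into a chat."""
    return POSTMORTEM.substitute(task=task, output=output, context=context)

prompt = build_postmortem(
    task="Add a --dry-run flag to the CLI.",
    output="A 600-line rewrite of the whole tool.",
    context="- minimal diffs only\n- no refactors",
)
print(prompt)
```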
## Concrete example: “It rewrote half my codebase”
Let’s say you asked for a small change:
```
Add a --dry-run flag to our CLI tool so it prints actions instead of executing them.
```
And the assistant replies with a 600-line rewrite of your CLI, rearranging folders, changing logging, and “improving” names.
That’s not just annoying — it’s risky.
### Postmortem (what went wrong)
Typical failure modes:
- Scope inflation: it optimized unrelated parts.
- Diff hostility: it didn’t preserve structure; you can’t review it quickly.
- Unstated preferences: you care about minimal, reviewable diffs, but didn’t say it explicitly.
### Rules you can extract
Turn that into hard rules:
- Rule 1 (scope): Only modify code required to implement the requested flag. No refactors.
- Rule 2 (diff): Prefer the smallest possible diff; preserve ordering and naming.
- Rule 3 (review): Provide changes as a unified diff and list touched files.
And make them testable:
- Verify: “Touched files ≤ 3 unless justified.”
- Verify: “No renamed identifiers unless required.”
- Verify: “A reviewer can understand the change by reading the diff only.”
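The first two checks are mechanical enough to script against the assistant's output. A minimal sketch, assuming the answer arrives as a standard unified diff (`diff --git` headers); the threshold of 3 files comes from the rule above:

```python
import re

def touched_files(diff: str) -> list[str]:
    """Extract file paths from 'diff --git a/... b/...' headers."""
    return re.findall(r"^diff --git a/(\S+) b/\S+", diff, flags=re.MULTILINE)

def check_scope(diff: str, max_files: int = 3) -> bool:
    """Verify: 'Touched files <= 3 unless justified.'"""
    return len(touched_files(diff)) <= max_files

sample = """\
diff --git a/cli.py b/cli.py
--- a/cli.py
+++ b/cli.py
@@ -10,6 +10,9 @@
+    parser.add_argument("--dry-run", action="store_true")
"""
print(touched_files(sample))  # ['cli.py']
print(check_scope(sample))    # True
```

The "no renamed identifiers" rule is harder to automate, but even a crude file count catches the 600-line-rewrite failure mode immediately.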
### Improved task prompt
Now your next prompt becomes better than the last:
```
Implement a --dry-run flag for our CLI.

Constraints:
- Minimal diff: do not refactor, rename, or move files.
- Only change what’s required for --dry-run.
- Preserve formatting and existing logging style.
- Output as a unified diff.

Behavior:
- When --dry-run is set, print the actions that would be taken.
- Do not execute side effects.

Before writing the diff:
- List which files you expect to touch and why.
```
Notice the shift: you’re no longer “asking nicely”; you’re declaring a review contract.
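For scale: the behavior the contract asks for really is small. A minimal sketch of what a compliant change might amount to, using Python's `argparse`; the CLI and its file-deleting action are invented for illustration:

```python
import argparse

def delete_stale_files(paths, dry_run: bool) -> list[str]:
    """Compute the actions; execute them only when dry_run is False."""
    actions = [f"delete {p}" for p in paths]
    if dry_run:
        for action in actions:
            print(f"[dry-run] {action}")
    else:
        pass  # real side effects would go here
    return actions

def main(argv=None):
    parser = argparse.ArgumentParser()
    parser.add_argument("--dry-run", action="store_true",
                        help="print actions instead of executing them")
    args = parser.parse_args(argv)
    return delete_stale_files(["a.tmp", "b.tmp"], dry_run=args.dry_run)

main(["--dry-run"])
```

A change of this size is what "minimal diff" means in practice: one new argument, one branch, no restructuring.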
## Concrete example: “It missed the edge case I care about”
Another common failure is happy-path bias.
You ask for an email parser, and it works — unless the sender has a plus-address, or the domain is internationalized, or a header is folded across lines.
Postmortem this too.
Extract rules like:
- Always list edge cases before coding.
- Write tests first for those edge cases.
- State what you will not support (and fail loudly).
A tiny prompt addition can change the outcome dramatically:
```
Before implementing, enumerate 10 realistic edge cases.
Then write tests for the top 5.
Only then implement the code.
```
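Here is what the "tests first, fail loudly" rule looks like in miniature. This is a toy sketch, not a real email parser: `parse_email` and its deliberately narrow scope are assumptions made for illustration:

```python
import re

def parse_email(addr: str) -> tuple[str, str]:
    """Split 'local@domain', failing loudly on anything else.

    Deliberately unsupported (and rejected rather than guessed at):
    folded headers, quoted local parts, missing domains.
    """
    m = re.fullmatch(r"([^@\s]+)@([^@\s]+)", addr.strip())
    if not m:
        raise ValueError(f"unsupported address: {addr!r}")
    return m.group(1), m.group(2)

# Edge-case tests written before the implementation:
assert parse_email("a+tag@example.com") == ("a+tag", "example.com")      # plus-address
assert parse_email("user@bücher.example") == ("user", "bücher.example")  # internationalized domain
try:
    parse_email("no-at-sign")
except ValueError:
    pass  # fails loudly, as required
```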
## Where to store what you learn
If you run these postmortems in chat only, you’ll relearn the same lessons.
Instead, store the results like you would store engineering knowledge:
- `prompts/postmortem.md` → the template
- `prompts/rules.md` → your extracted rules (short, imperative)
- `prompts/checklists/` → exit criteria for common task types
Example `prompts/rules.md` snippet:
```md
# Prompt Rules

## Diffs & risk
- Prefer minimal diffs over rewrites.
- Never change formatting unless requested.
- If a refactor is beneficial, propose it first; don’t do it implicitly.

## Tests
- If behavior changes, add tests.
- Name tests after user-visible behavior.

## Uncertainty
- If requirements are ambiguous, ask 1-3 questions before coding.
```
Over time, this becomes your personal “assistant ops manual”.
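Once the rules live in a file, nothing stops you from prepending them to every task prompt automatically instead of re-pasting them. A minimal sketch under that assumption; the file layout follows the folder structure above, and the assembly format is my own invention:

```python
from pathlib import Path

def build_prompt(task: str, rules_file: Path) -> str:
    """Prepend stored rules to a task, so every new prompt
    starts from the accumulated lessons."""
    rules = rules_file.read_text(encoding="utf-8")
    return f"{rules}\n\nTASK:\n{task}"

# Demonstration: write a throwaway rules file, then assemble a prompt.
rules_path = Path("rules.md")
rules_path.write_text("# Prompt Rules\n- Prefer minimal diffs over rewrites.\n")
print(build_prompt("Add a --dry-run flag to the CLI.", rules_path))
```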
## A quick exit-criteria checklist (steal this)
Before you accept an output, run this:
- Does it solve the exact task as stated?
- Is scope controlled (no surprise refactors)?
- Is the diff reviewable in under 5 minutes?
- Are edge cases acknowledged (and tested if relevant)?
- Are conventions preserved (names, structure, formatting)?
- Are risks called out (breaking changes, migrations, performance)?
- Could I explain this change to a teammate from the diff alone?
If any answer is “no”, that’s a postmortem candidate.
## Why this works
Assistants are great at producing text. They’re even better at producing systems — if you ask them to.
The Postmortem Prompt is a way to turn output quality into a compounding asset:
- fewer repeated arguments
- more consistent results
- clearer review boundaries
- a growing library of prompts that match how you ship software
If you try it, keep the first few postmortems short. You’re building a muscle — the payoff is that the next 50 tasks get easier.