Most prompt failures on teams are not model failures. They're spec failures.
One person writes a prompt that returns JSON. Someone else adds a tone instruction. A third person pastes it into a different tool that expects markdown. By Friday, everyone is using "the same prompt" and getting different behavior.
That's why I like Prompt Contracts: a short spec for how a prompt is supposed to work. But a contract only helps if it includes the fields a team actually needs.
Here are the five fields I now consider mandatory.
## 1. Purpose
If a prompt contract can't answer "what job is this prompt for?" in one sentence, it's already too vague.
Bad:

```
Generate a useful response about the code.
```

Better:

```
Review a git diff and return actionable bug findings before merge.
```
That single sentence narrows scope. It also makes it obvious when someone tries to turn the same prompt into a style checker, architecture reviewer, and mentoring assistant all at once.
## 2. Inputs
Teams break prompts when they silently change the input shape.
Spell out what the prompt expects:
```markdown
## Inputs

- `diff`: unified git diff, required
- `language`: programming language, optional
- `focus`: one of `bugs | security | maintainability`, default `bugs`
```
Now the downstream caller knows what to send, and a teammate can't casually add "also pass screenshots and ticket IDs" without updating the contract.
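A sketch of what that enforcement can look like in code, assuming a Python caller (the function name and error choices are mine; the field names come straight from the contract):

```python
# Illustrative guard for the input contract above.
# Field names mirror the contract; the function itself is an assumption.
VALID_FOCUS = {"bugs", "security", "maintainability"}

def validate_inputs(diff, language=None, focus="bugs"):
    """Normalize and check prompt inputs against the contract."""
    if not isinstance(diff, str):
        raise TypeError("`diff` is required and must be a unified diff string")
    if focus not in VALID_FOCUS:
        raise ValueError(f"`focus` must be one of {sorted(VALID_FOCUS)}")
    return {"diff": diff, "language": language, "focus": focus}
```

Ten lines, but now "also pass screenshots" fails loudly at the call site instead of silently changing the prompt's behavior.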
## 3. Output schema
This is the field most teams skip, and it's the one that causes the ugliest breakage.
If a prompt is part of a workflow, define the output format explicitly:
```json
{
  "summary": "string",
  "findings": [
    {
      "severity": "high|medium|low",
      "file": "string",
      "issue": "string",
      "suggestion": "string"
    }
  ],
  "blocking": true
}
```
Once you do this, two good things happen:
- humans know what a "correct" answer looks like
- scripts can validate output instead of hoping the model behaved
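That second point can be a stdlib-only check. A minimal validator for the schema above might look like this (the key names come from the schema; the helper name is an assumption):

```python
import json

SEVERITIES = {"high", "medium", "low"}

def validate_review(raw: str) -> dict:
    """Parse model output and check it against the output schema."""
    data = json.loads(raw)  # raises if the model didn't return JSON at all
    assert isinstance(data.get("summary"), str), "summary must be a string"
    assert isinstance(data.get("blocking"), bool), "blocking must be a boolean"
    findings = data.get("findings")
    assert isinstance(findings, list), "findings must be an array"
    for f in findings:
        assert f.get("severity") in SEVERITIES, "bad severity"
        for key in ("file", "issue", "suggestion"):
            assert isinstance(f.get(key), str), f"{key} must be a string"
    return data
```

Run it on every model response and a malformed answer becomes a retry, not a broken pipeline step.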
## 4. Error modes
A lot of prompts work on the happy path and fall apart the second the input is weird.
Add an error section:
```markdown
## Error modes

- If `diff` is empty, return an empty findings array
- If input exceeds 2,000 lines, return `error: split_input`
- If required context is missing, ask exactly one clarifying question
```
This is the difference between a reliable prompt and a polite chaos machine.
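A side note, as a hedged sketch: the first two error modes don't even need the model. They can be enforced in a preflight check before the call (the 2,000-line cutoff is the contract's; the function name is illustrative):

```python
def preflight(diff: str, max_lines: int = 2_000):
    """Apply the contract's mechanical error modes before calling the model."""
    if not diff.strip():
        # Empty diff -> skip the model entirely, return an empty findings array.
        return {"summary": "", "findings": [], "blocking": False}
    if len(diff.splitlines()) > max_lines:
        # Oversized input -> tell the caller to split it.
        return {"error": "split_input"}
    return None  # happy path: go ahead and call the model
```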
## 5. Non-goals
I love this field because it stops "helpful" scope creep.
```markdown
## Non-goals

- Do not rewrite working code
- Do not suggest dependency swaps
- Do not flag style issues when focus=bugs
```
Without non-goals, assistants tend to drift toward whatever they can comment on. Suddenly your bug review prompt is giving naming advice and suggesting framework migrations.
## The compact template
Here's the version I actually use:
```markdown
# Prompt Contract: Code Review

## Purpose
Review a diff and find merge-blocking issues.

## Inputs
- diff (required)
- language (optional)
- focus=bugs|security|maintainability

## Output schema
- summary
- findings[]
- blocking

## Error modes
- empty diff -> empty findings
- oversize diff -> split_input
- missing context -> ask one clarifying question

## Non-goals
- no rewrites
- no style nits unless requested
- no dependency advice
```
It fits on one page, which matters. If the contract turns into a mini novel, nobody reads it consistently, including the model.
## Why this works better than prompt folklore
Teams often share prompts through Slack messages, Notion pages, or copied snippets in random repos. That's folklore, not a system.
A contract turns "the prompt Alex swears by" into a reusable asset that can be reviewed, versioned, and tested.
And once you start doing that, prompt quality stops depending on memory.
It starts depending on a file.
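One way to make "tested" concrete, as a sketch (the file path and section list are my assumptions, not a standard): a tiny CI check that a contract file actually contains all five fields.

```python
# Fail the build if a prompt contract is missing a mandatory section.
REQUIRED_SECTIONS = ["## Purpose", "## Inputs", "## Output schema",
                     "## Error modes", "## Non-goals"]

def check_contract(text: str) -> list:
    """Return the contract sections missing from a contract file's text."""
    return [s for s in REQUIRED_SECTIONS if s not in text]

# Usage sketch (hypothetical path):
# missing = check_contract(open("prompts/code_review.md").read())
# assert not missing, f"contract missing sections: {missing}"
```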
Question for you: If you had to add just one field to your current prompts today, would it be inputs, output schema, or non-goals?