Nova

Prompt Contracts v2: Add Validation Rules to Your AI Prompts (Template Included)

My most-read post on this blog is about prompt contracts — treating prompts like API specs with defined inputs, outputs, and error handling.

But the original had a gap: it told you what to specify, not how to validate that the AI followed the spec.

This is the v2. It adds validation rules you can check automatically.


Quick Recap: What's a Prompt Contract?

A prompt contract is a structured prompt with three sections:

INPUT: what you're giving the AI
OUTPUT: what you expect back (format, structure, length)
ERROR: what to do when something doesn't match

If you haven't read the original, the core idea is: ambiguous prompts produce ambiguous outputs. A spec gives you something concrete to check the output against.


The Problem With v1

Prompt contracts work. But they rely on you manually checking whether the output matches. That's fine for one-off tasks. It breaks down when you're running prompts in scripts, pipelines, or daily workflows.

You need machine-checkable rules.


Validation Rules: The v2 Addition

Add a VALIDATION block to your contract:

INPUT:
- code_diff: string (unified diff format)
- language: "typescript" | "python" | "go"

OUTPUT:
- format: JSON
- schema: { "issues": [{ "line": int, "severity": "error"|"warning", "message": string }] }
- max_items: 10

VALIDATION:
- output must be valid JSON (parseable)
- every issue must reference a line number present in the diff
- severity must be one of the allowed values
- message must be under 200 characters
- if no issues found, return { "issues": [] } (not null, not empty string)

ERROR:
- if output fails validation, retry once with: "Your output failed validation: {error}. Fix and return only the corrected JSON."
- if retry also fails, return the raw output tagged as UNVALIDATED
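Every rule in that VALIDATION block is machine-checkable. Here's a minimal sketch of a validator for it — the function name `validate_issues` and the hunk-header parsing are my own choices, not part of the contract:

```python
import json
import re

def validate_issues(raw: str, diff: str) -> list[str]:
    """Check model output against the VALIDATION block; return a list of errors."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as e:
        return [f"not valid JSON: {e}"]
    issues = data.get("issues")
    if not isinstance(issues, list):
        return ['missing "issues" list (must be [] when nothing is found)']
    # Collect line numbers that actually exist in the diff's new-file hunks
    diff_lines: set[int] = set()
    for m in re.finditer(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,(\d+))? @@", diff, re.M):
        start, count = int(m.group(1)), int(m.group(2) or 1)
        diff_lines.update(range(start, start + count))
    errors = []
    for i, issue in enumerate(issues):
        if issue.get("line") not in diff_lines:
            errors.append(f"issue {i}: line {issue.get('line')} not in diff")
        if issue.get("severity") not in ("error", "warning"):
            errors.append(f"issue {i}: bad severity {issue.get('severity')!r}")
        if len(issue.get("message", "")) >= 200:
            errors.append(f"issue {i}: message is not under 200 characters")
    return errors
```

An empty return means the output passed and is safe to hand downstream; a non-empty list is exactly the `{error}` payload the ERROR block feeds back into the retry prompt.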

Why This Matters

With validation rules, you can:

  1. Auto-retry on structural failures (JSON parse errors, missing fields)
  2. Log quality metrics (what % of outputs pass validation first-try?)
  3. A/B test prompts with a consistent scoring mechanism
  4. Chain prompts safely — downstream steps can trust the upstream output
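Point 2 needs almost no machinery — a counter is enough. A minimal sketch (the class name is illustrative):

```python
from collections import Counter

class ValidationStats:
    """Track what fraction of outputs pass validation on the first try."""

    def __init__(self) -> None:
        self.counts: Counter[str] = Counter()

    def record(self, passed_first_try: bool) -> None:
        self.counts["pass" if passed_first_try else "fail"] += 1

    def pass_rate(self) -> float:
        total = sum(self.counts.values())
        return self.counts["pass"] / total if total else 0.0
```

Log `pass_rate()` per prompt version and you get the consistent scoring mechanism from point 3 for free.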

A Real Example: Code Review Contract

Here's a complete contract I use for automated PR reviews:

# Code Review Contract v2.1

## INPUT
- diff: unified diff of the PR
- context: file-level summary (max 500 tokens)
- focus_areas: list of strings (e.g., ["security", "performance"])

## OUTPUT
- format: Markdown
- sections: Summary (2-3 sentences), Issues (bulleted list), Verdict (APPROVE / REQUEST_CHANGES)
- max_length: 800 words

## VALIDATION
- must contain exactly 3 sections with the headers above
- Issues section: each bullet must start with [ERROR], [WARNING], or [SUGGESTION]
- Verdict must be one of the two allowed values
- Summary must not exceed 3 sentences
- No line references to files not in the diff

## ERROR
- missing section → retry with "You're missing the {section} section"
- invalid verdict → retry with "Verdict must be APPROVE or REQUEST_CHANGES"
- 2 consecutive failures → flag for human review
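The section-header and bullet-prefix rules in that contract are mechanical to check once you split the Markdown into sections. A rough sketch, assuming the review comes back as one Markdown string (function names are mine):

```python
import re

ALLOWED_PREFIXES = ("[ERROR]", "[WARNING]", "[SUGGESTION]")

def parse_sections(review: str) -> dict[str, str]:
    """Split a Markdown review into {header: body} for per-section checks."""
    sections: dict[str, str] = {}
    current = None
    for line in review.splitlines():
        m = re.match(r"^#{1,3}\s+(.+?)\s*$", line)
        if m:
            current = m.group(1)
            sections[current] = ""
        elif current is not None:
            sections[current] += line + "\n"
    return sections

def check_bullets(issues_body: str) -> list[str]:
    """Every bullet in Issues must start with an allowed severity tag."""
    errors = []
    for bullet in re.findall(r"^[-*]\s+(.+)$", issues_body, re.M):
        if not bullet.startswith(ALLOWED_PREFIXES):
            errors.append(f"bad bullet prefix: {bullet[:40]!r}")
    return errors
```

Checking `Summary`, `Issues`, and `Verdict` against the parsed dict covers the remaining rules.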

The Validation Checklist Template

Copy this and adapt it for any prompt:

## VALIDATION
- [ ] Output format matches spec (JSON / Markdown / plain text)
- [ ] All required fields present
- [ ] Field types correct (strings are strings, numbers are numbers)
- [ ] Values within allowed ranges / enums
- [ ] Length constraints met (min/max words, items, characters)
- [ ] No hallucinated references (files, URLs, variables that don't exist in input)
- [ ] Consistent with input (doesn't contradict the given context)
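Most of the checklist's field-level items (required fields, types, enums, length limits) can be driven by one generic function plus a small spec dict. A sketch under those assumptions — `check_fields` and the spec shape are my own convention:

```python
def check_fields(data: dict, spec: dict) -> list[str]:
    """Generic checks: required fields, types, enum values, length limits."""
    errors = []
    for field, rules in spec.items():
        if field not in data:
            errors.append(f"missing field: {field}")
            continue
        value = data[field]
        if "type" in rules and not isinstance(value, rules["type"]):
            errors.append(f"{field}: expected {rules['type'].__name__}")
        if "enum" in rules and value not in rules["enum"]:
            errors.append(f"{field}: {value!r} not in {rules['enum']}")
        if "max_len" in rules and len(value) > rules["max_len"]:
            errors.append(f"{field}: exceeds {rules['max_len']} characters")
    return errors
```

The hallucinated-references and consistency items can't be checked this generically — those need input-aware logic like the diff-line check earlier in this post.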

Wiring It Up in Code

If you're calling an LLM from code, validation is just a function:

# Assumes the LLM output has already been parsed into a dict of
# section name -> content (parsing omitted here for brevity).
def validate_review(output: dict) -> list[str]:
    """Return a list of validation errors; an empty list means the output passed."""
    errors = []
    required = ["Summary", "Issues", "Verdict"]
    for section in required:
        if section not in output:
            errors.append(f"Missing section: {section}")
    if output.get("Verdict") not in ("APPROVE", "REQUEST_CHANGES"):
        errors.append(f"Invalid verdict: {output.get('Verdict')}")
    return errors

# Usage: one retry, per the ERROR block
result = call_llm(prompt)
issues = validate_review(result)
if issues:
    result = call_llm(f"Fix these validation errors: {issues}\n\nOriginal output: {result}")

Simple. No framework needed.
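If you want the full ERROR-block behavior — retry once, then flag as UNVALIDATED — that's one small wrapper. A sketch; `call_with_validation` is my name for it, and the LLM client and validator are passed in so the function stays testable:

```python
def call_with_validation(call_llm, prompt: str, validate, max_retries: int = 1) -> dict:
    """Call the LLM, retry on validation failure, tag the final result."""
    result = call_llm(prompt)
    attempts = 0
    while attempts <= max_retries:
        errors = validate(result)
        if not errors:
            return {"output": result, "validated": True}
        if attempts == max_retries:
            break  # out of retries; fall through to the UNVALIDATED path
        result = call_llm(
            f"Your output failed validation: {errors}. "
            f"Fix and return only the corrected output.\n\nOriginal: {result}"
        )
        attempts += 1
    return {"output": result, "validated": False}  # flag as UNVALIDATED
```

Downstream steps can then branch on `validated` instead of guessing whether the output is trustworthy.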


When to Skip Validation

Not everything needs a contract. If you're brainstorming, exploring, or doing creative work, contracts add friction without value.

Use contracts when:

  • Output feeds into another system
  • You're running the prompt more than once
  • Incorrect output has a real cost (time, money, bugs)

Skip contracts when:

  • You're thinking out loud
  • The output is for your eyes only
  • You'll edit it heavily anyway

Key Takeaway

Prompt contracts v1 told you to define the spec. v2 tells you to enforce it.

The validation block turns your prompt from a wish into a testable requirement. That's the difference between "AI-assisted" and "AI-reliable."


Grab the template above and try it on your most-used prompt. Share what breaks — I'm collecting failure modes for a follow-up post.
