My most-read post on this blog is about prompt contracts — treating prompts like API specs with defined inputs, outputs, and error handling.
But the original had a gap: it told you what to specify, not how to validate that the AI followed the spec.
This is the v2. It adds validation rules you can check automatically.
Quick Recap: What's a Prompt Contract?
A prompt contract is a structured prompt with three sections:
INPUT: what you're giving the AI
OUTPUT: what you expect back (format, structure, length)
ERROR: what to do when something doesn't match
If you haven't read the original, the core idea is: ambiguous prompts produce ambiguous outputs. Specs produce spec-compliant outputs.
The Problem With v1
Prompt contracts work. But they rely on you manually checking whether the output matches. That's fine for one-off tasks. It breaks down when you're running prompts in scripts, pipelines, or daily workflows.
You need machine-checkable rules.
Validation Rules: The v2 Addition
Add a VALIDATION block to your contract:
INPUT:
- code_diff: string (unified diff format)
- language: "typescript" | "python" | "go"
OUTPUT:
- format: JSON
- schema: { "issues": [{ "line": int, "severity": "error"|"warning", "message": string }] }
- max_items: 10
VALIDATION:
- output must be valid JSON (parseable)
- every issue must reference a line number present in the diff
- severity must be one of the allowed values
- message must be under 200 characters
- if no issues found, return { "issues": [] } (not null, not empty string)
ERROR:
- if output fails validation, retry once with: "Your output failed validation: {error}. Fix and return only the corrected JSON."
- if retry also fails, return the raw output tagged as UNVALIDATED
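Each of those validation rules maps to a straightforward check in code. Here's a minimal sketch, assuming the contract's output schema above; the hunk-header parsing is a simplified take on the unified diff format:

```python
import json
import re

ALLOWED_SEVERITIES = {"error", "warning"}

def diff_line_numbers(diff: str) -> set[int]:
    """Extract new-file line numbers from unified diff hunk headers."""
    lines: set[int] = set()
    current = None
    for raw in diff.splitlines():
        m = re.match(r"@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@", raw)
        if m:
            current = int(m.group(1))  # start of the new-file hunk
            continue
        if current is None:
            continue
        if not raw.startswith("-"):  # context and added lines advance the counter
            lines.add(current)
            current += 1
    return lines

def validate_issues(raw_output: str, diff: str) -> list[str]:
    """Return a list of validation errors; empty list means the output passes."""
    try:
        data = json.loads(raw_output)
    except json.JSONDecodeError as e:
        return [f"not valid JSON: {e}"]
    issues = data.get("issues")
    if not isinstance(issues, list):
        return ['"issues" must be a list (use [] when empty, not null)']
    valid_lines = diff_line_numbers(diff)
    errors = []
    for i, issue in enumerate(issues):
        if issue.get("line") not in valid_lines:
            errors.append(f"issue {i}: line {issue.get('line')} not in diff")
        if issue.get("severity") not in ALLOWED_SEVERITIES:
            errors.append(f"issue {i}: invalid severity {issue.get('severity')!r}")
        msg = issue.get("message")
        if not isinstance(msg, str) or len(msg) >= 200:
            errors.append(f"issue {i}: message missing or over 200 characters")
    return errors
```

Note that the function returns errors rather than raising: the error messages feed directly into the retry prompt from the ERROR block.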
Why This Matters
With validation rules, you can:
- Auto-retry on structural failures (JSON parse errors, missing fields)
- Log quality metrics (what % of outputs pass validation first-try?)
- A/B test prompts with a consistent scoring mechanism
- Chain prompts safely — downstream steps can trust the upstream output
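The auto-retry and metrics points above can be combined into one small wrapper. A sketch, where `call_llm` and `validate` are stand-ins for your model call and your validation function:

```python
def run_with_validation(prompt, call_llm, validate, max_retries=1):
    """Call the model, validate the output, and retry with errors fed back.

    Returns a dict with the output, whether it validated, and the attempt
    count -- the attempt count is the raw material for first-try pass-rate
    metrics and A/B comparisons between prompt versions.
    """
    output = call_llm(prompt)
    for attempt in range(max_retries + 1):
        errors = validate(output)
        if not errors:
            return {"output": output, "validated": True, "attempts": attempt + 1}
        if attempt < max_retries:
            output = call_llm(
                f"Your output failed validation: {errors}. "
                "Fix and return only the corrected output."
            )
    # Retries exhausted: hand back the raw output, tagged as unvalidated
    return {"output": output, "validated": False, "attempts": max_retries + 1}
```

Downstream steps then branch on the `validated` flag instead of trusting every response blindly.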
A Real Example: Code Review Contract
Here's a complete contract I use for automated PR reviews:
# Code Review Contract v2.1
## INPUT
- diff: unified diff of the PR
- context: file-level summary (max 500 tokens)
- focus_areas: list of strings (e.g., ["security", "performance"])
## OUTPUT
- format: Markdown
- sections: Summary (2-3 sentences), Issues (bulleted list), Verdict (APPROVE / REQUEST_CHANGES)
- max_length: 800 words
## VALIDATION
- must contain exactly 3 sections with the headers above
- Issues section: each bullet must start with [ERROR], [WARNING], or [SUGGESTION]
- Verdict must be one of the two allowed values
- Summary must not exceed 3 sentences
- No line references to files not in the diff
## ERROR
- missing section → retry with "You're missing the {section} section"
- invalid verdict → retry with "Verdict must be APPROVE or REQUEST_CHANGES"
- 2 consecutive failures → flag for human review
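The Markdown rules in this contract validate just as mechanically as JSON. A sketch of the section, bullet-tag, and verdict checks; the regexes are assumptions about the output's header style, so adapt them to yours:

```python
import re

def validate_review_md(text: str) -> list[str]:
    """Check a Markdown review against the code review contract's rules."""
    errors = []
    for name in ("Summary", "Issues", "Verdict"):
        if not re.search(rf"^#+ {name}\b", text, re.MULTILINE):
            errors.append(f"missing section: {name}")
    # Every bullet inside the Issues section must carry a severity tag
    in_issues = False
    for line in text.splitlines():
        if re.match(r"^#+ ", line):
            in_issues = line.lstrip("# ").startswith("Issues")
            continue
        if in_issues and line.startswith("- "):
            if not re.match(r"- \[(ERROR|WARNING|SUGGESTION)\]", line):
                errors.append(f"untagged issue bullet: {line[:40]}")
    if not re.search(r"\b(APPROVE|REQUEST_CHANGES)\b", text):
        errors.append("verdict must be APPROVE or REQUEST_CHANGES")
    return errors
```

Each error message doubles as the retry prompt from the ERROR block, so one function drives both validation and recovery.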
The Validation Checklist Template
Copy this and adapt it for any prompt:
## VALIDATION
- [ ] Output format matches spec (JSON / Markdown / plain text)
- [ ] All required fields present
- [ ] Field types correct (strings are strings, numbers are numbers)
- [ ] Values within allowed ranges / enums
- [ ] Length constraints met (min/max words, items, characters)
- [ ] No hallucinated references (files, URLs, variables that don't exist in input)
- [ ] Consistent with input (doesn't contradict the given context)
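The trickiest item on that checklist is the hallucinated-references check, since it compares output against input rather than against a fixed schema. A minimal sketch for file paths (the regex and extension list are illustrative assumptions):

```python
import re

def find_hallucinated_files(output: str, input_files: set[str]) -> list[str]:
    """Flag file paths mentioned in the output that don't exist in the input."""
    mentioned = set(re.findall(r"[\w./-]+\.(?:py|ts|go)", output))
    return sorted(mentioned - input_files)
```

The same shape works for URLs, variable names, or function names: extract candidates from the output, subtract the set you actually provided, and treat anything left over as a validation failure.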
Wiring It Up in Code
If you're calling an LLM from code, validation is just a function:
def validate_review(output: dict) -> list[str]:
    errors = []
    required = ["Summary", "Issues", "Verdict"]
    for section in required:
        if section not in output:
            errors.append(f"Missing section: {section}")
    if output.get("Verdict") not in ("APPROVE", "REQUEST_CHANGES"):
        errors.append(f"Invalid verdict: {output.get('Verdict')}")
    return errors

# Usage
result = call_llm(prompt)
issues = validate_review(result)
if issues:
    result = call_llm(f"Fix these validation errors: {issues}\n\nOriginal output: {result}")
Simple. No framework needed.
When to Skip Validation
Not everything needs a contract. If you're brainstorming, exploring, or doing creative work, contracts add friction without value.
Use contracts when:
- Output feeds into another system
- You're running the prompt more than once
- Incorrect output has a real cost (time, money, bugs)
Skip contracts when:
- You're thinking out loud
- The output is for your eyes only
- You'll edit it heavily anyway
Key Takeaway
Prompt contracts v1 told you to define the spec. v2 tells you to enforce it.
The validation block turns your prompt from a wish into a testable requirement. That's the difference between "AI-assisted" and "AI-reliable."
Grab the template above and try it on your most-used prompt. Share what breaks — I'm collecting failure modes for a follow-up post.