If you use an assistant for coding or writing, you’ve probably seen this pattern:
- You ask for something concrete.
- You get something that looks right.
- You spend the next 20 minutes discovering the hidden gotchas.
The fix isn’t “use a better model” or “be more specific” (though both help). The fix is to change the shape of the request.
I call it the Verification Loop Prompt: a two-phase prompt that forces the assistant to test and interrogate its own output before handing it to you.
This is not about perfection. It’s about catching the top 80% of failure modes—early, cheaply, and repeatably.
The idea
Most prompts ask for an answer.
The Verification Loop asks for an answer + evidence.
You’re not just requesting “do X”. You’re requesting:
- Generate the solution.
- Verify it against constraints, edge cases, and a small test plan.
- Report what’s still uncertain.
That last step matters: a good verification loop doesn't pretend certainty; it surfaces uncertainty.
The base prompt (copy/paste)
Use this as a default, then tweak the verification section depending on the task.
You are helping me with: <task>
Constraints:
- <constraint 1>
- <constraint 2>
- <constraint 3>
Deliverable:
- <what I want you to produce>
Verification loop (do this *after* you draft the deliverable):
1) List assumptions you made (bullet list).
2) Check the deliverable against each constraint (pass/fail + fix if fail).
3) Provide 5 edge cases / failure modes relevant to this task.
4) Propose a minimal test plan (steps I can actually run).
5) If anything is uncertain, flag it explicitly and ask up to 3 clarifying questions.
Only then output the final deliverable.
If you do nothing else, do this: ask for edge cases + a minimal test plan.
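If you reuse the template often, it can help to assemble it mechanically. A minimal TypeScript sketch, under my own assumptions: the names `PromptSpec` and `buildVerificationPrompt` are illustrative, not part of any real API, and the defaults mirror the numbers in the template above.

```typescript
// Hypothetical helper that assembles the base Verification Loop prompt.
// All names here are illustrative, not a real library API.

interface PromptSpec {
  task: string;
  constraints: string[];
  deliverable: string;
  maxEdgeCases?: number;  // defaults below mirror the base template
  maxQuestions?: number;
}

function buildVerificationPrompt(spec: PromptSpec): string {
  const edgeCases = spec.maxEdgeCases ?? 5;
  const questions = spec.maxQuestions ?? 3;
  return [
    `You are helping me with: ${spec.task}`,
    `Constraints:`,
    ...spec.constraints.map((c) => `- ${c}`),
    `Deliverable:`,
    `- ${spec.deliverable}`,
    `Verification loop (do this *after* you draft the deliverable):`,
    `1) List assumptions you made (bullet list).`,
    `2) Check the deliverable against each constraint (pass/fail + fix if fail).`,
    `3) Provide ${edgeCases} edge cases / failure modes relevant to this task.`,
    `4) Propose a minimal test plan (steps I can actually run).`,
    `5) If anything is uncertain, flag it explicitly and ask up to ${questions} clarifying questions.`,
    `Only then output the final deliverable.`,
  ].join("\n");
}
```

One spec object per task keeps the constraints visible and versionable alongside your code, instead of living in a chat scrollback.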
Example 1: AI-assisted coding (a tiny refactor with guardrails)
Say you’re refactoring a function that parses user input.
Without verification, you’ll often get a clean-looking refactor that quietly changes behavior.
With verification, you force the assistant to prove it understands the behavioral contract.
Prompt:
You are helping me with: refactoring a TypeScript function that parses a date input.
Constraints:
- Must preserve current behavior for valid ISO strings (YYYY-MM-DD).
- Must reject ambiguous formats (e.g., 03/04/05) instead of guessing.
- Must keep the same public function signature.
- Must include unit tests for at least 6 cases.
Deliverable:
- Updated function + Jest test cases.
Verification loop:
1) List assumptions about existing behavior.
2) Compare old vs new behavior for the 6 test cases (table).
3) Identify 5 tricky inputs that could break parsing.
4) Provide a minimal test plan: commands to run + expected output.
What you get back is dramatically different: not just code, but a mini spec.
The key is step (2): forcing an explicit old-vs-new comparison makes the assistant stop “beautifying” and start “preserving.”
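For concreteness, here is a sketch of what the "preserving" side of that refactor might look like. The function name, return convention, and regex are my own assumptions, not taken from a real codebase; the point is the round-trip check, which is exactly the kind of edge case step (3) tends to surface.

```typescript
// Illustrative sketch of the refactored parser (names are hypothetical).
// Contract: accept strict ISO YYYY-MM-DD; reject ambiguous formats
// like 03/04/05 instead of guessing.

const ISO_DATE = /^(\d{4})-(\d{2})-(\d{2})$/;

function parseDate(input: string): Date | null {
  const m = ISO_DATE.exec(input);
  if (m === null) return null; // ambiguous or malformed: refuse to guess
  const [, y, mo, d] = m;
  const date = new Date(Date.UTC(Number(y), Number(mo) - 1, Number(d)));
  // Round-trip check catches impossible dates like 2024-02-30, which the
  // Date constructor would otherwise silently roll over to March 1.
  if (
    date.getUTCFullYear() !== Number(y) ||
    date.getUTCMonth() !== Number(mo) - 1 ||
    date.getUTCDate() !== Number(d)
  ) {
    return null;
  }
  return date;
}
```

Note that `2024-02-30` and `2024-13-01` match the regex but fail the round-trip check; that distinction is precisely what an old-vs-new comparison table forces into the open.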
A practical variant: the “compile + run” check
If the assistant has no tool access, it can still do a useful simulation:
- “This should typecheck because …”
- “This test asserts X, which corresponds to constraint Y.”
That’s not the same as actually running the tests, but it catches surprisingly many mistakes (missing imports, wrong API usage, mismatched types).
Example 2: Database changes (where mistakes hurt)
Schema changes are a perfect place for Verification Loops because the failure modes are well-known:
- data loss
- long locks
- missing rollback
- performance regressions
Prompt:
You are helping me with: writing a PostgreSQL migration to split a full_name column into first_name and last_name.
Constraints:
- No data loss.
- Must be reversible (down migration required).
- Must be safe for large tables (avoid long table locks).
- Must include a backfill strategy.
Deliverable:
- up.sql + down.sql + short rollout notes.
Verification loop:
1) List assumptions about existing data (nulls, formatting, edge cases).
2) Identify the riskiest operation in the migration.
3) Provide a rollback plan and what data cannot be perfectly recovered.
4) Give a minimal test plan using a temp table with sample rows.
This forces the assistant to confront reality: not all transformations are perfectly reversible.
If it can’t state what’s lossy, it probably hasn’t thought it through.
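To stay in one language here, this sketch holds the migration SQL as TypeScript strings, as you might for a raw-SQL migration tool. The `users` table name and the batching comments are my own assumptions; the prompt above only specifies the columns. The `down` comment is the honest part: the verification loop should force exactly this admission of lossiness.

```typescript
// Hypothetical up/down migration bodies (table name `users` is assumed).
// ADD COLUMN without a default is a fast metadata-only change in PostgreSQL,
// so the slow part is isolated in the batched backfill.

export const up = `
ALTER TABLE users ADD COLUMN first_name text;
ALTER TABLE users ADD COLUMN last_name text;
-- Backfill in batches (size is an assumption) rather than one big UPDATE,
-- to avoid a long-running transaction and table-wide lock:
-- UPDATE users
--   SET first_name = split_part(full_name, ' ', 1),
--       last_name  = nullif(substring(full_name from '\\s(.*)$'), '')
--   WHERE id BETWEEN :start AND :start + 10000 AND first_name IS NULL;
-- Drop full_name only in a later migration, once the backfill is verified.
`;

export const down = `
-- Lossy in general: "Mary Ann Smith" could have been split either way,
-- so the reconstruction below may not byte-match the original value.
UPDATE users SET full_name = concat_ws(' ', first_name, last_name)
  WHERE full_name IS NULL;
ALTER TABLE users DROP COLUMN first_name;
ALTER TABLE users DROP COLUMN last_name;
`;
```

Keeping the destructive `DROP COLUMN full_name` out of `up` entirely is the design choice worth copying: it makes the rollback window much wider.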
The “change budget” for verification (so it doesn’t spiral)
A common failure mode is overthinking: the assistant produces a deliverable, then writes a 40-point critique, then rewrites everything.
Fix that with a small budget:
- Limit to 5 edge cases.
- Limit to one revision pass.
- Limit to 3 clarifying questions.
Add this line:
Verification budget: one pass only. If you find issues, fix them once and move on.
Verification should be a seatbelt, not a detour.
A lighter version for everyday tasks
For smaller asks (emails, docs, summaries), I use this quick loop:
Before you finalize:
- What did you assume?
- What’s the most likely mistake here?
- What would you check if you had 2 minutes?
It’s short, but it reliably catches:
- missing context
- mismatched tone
- dropped requirements
Why it works (mechanically)
Assistants are great at producing plausible completions. The Verification Loop changes the objective:
- From “finish the text”
- To “finish the text and then evaluate it”
That second step triggers a different mode: comparison, constraint checking, and counterexample generation.
In human terms: you’re making the assistant review its own work like a teammate would.
A small checklist to keep
When you’re not sure how to structure the verification loop, pick 3 from this menu:
- Assumptions (what did you infer?)
- Constraint check (pass/fail)
- Edge cases (counterexamples)
- Test plan (minimal, runnable)
- Rollback plan (for risky changes)
- Confidence + uncertainty (what might be wrong?)
If you build the habit, you’ll notice something subtle: you spend less time “prompting harder” and more time shipping.
That’s the whole point.