
wintrover

Posted on • Originally published at wintrover.github.io

The Most Dangerous Word in AI Coding: "Verified"

Got a "Verified" result from my formal verification engine.

Problem was, it was completely wrong.


The Setup

Looking at a simple function: checkType from Bitcoin Core.

The engine generated this SMT query:

(assert (= throwsRuntimeError (not (= typ expected))))
(assert (= typ expected))
(assert throwsRuntimeError)

At first glance? Looks fine.

But there's a fatal flaw in there.


The Contradiction

Unpack it and here's what you get:

  • Error occurs when typ != expected
  • But we're assuming typ == expected
  • While also asserting "an error occurred"

Boil it down:

typ == expected
AND simultaneously: typ != expected

Logically impossible.
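You can make the impossibility concrete with a brute-force check over the two booleans involved. This is a plain-Python stand-in for the SMT query, not the engine's actual code; `types_match` models `(= typ expected)`, and the names are illustrative:

```python
# Brute-force the three assertions from the SMT query over booleans.
# types_match stands in for (= typ expected).
from itertools import product

def satisfiable():
    for types_match, throws in product([True, False], repeat=2):
        c1 = throws == (not types_match)  # error occurs iff types differ
        c2 = types_match                  # ...but types are assumed equal
        c3 = throws                       # ...and an error is asserted
        if c1 and c2 and c3:
            return True
    return False

print(satisfiable())  # False: no assignment satisfies all three
```

Four candidate assignments, and every one violates at least one assertion. That's exactly what "Unsat" means.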


What the Solver Did

Z3 (or any SMT solver) takes one look and concludes:

Unsat (Unsatisfiable)

In formal verification, this usually means:

"No execution path exists where the error occurs."

So the engine outputs:

✅ Verified


Where It Went Wrong

Here's the thing.

The solver didn't prove the code safe.

It proved the question itself was invalid.


Vacuous Truth

This is a classic trap in formal verification.

The definition is simple:

If the premise is impossible, the statement is always true.

Example:

"If the sun rises in the west, I am God."

Logically true.
Because the premise can never hold.
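Material implication makes the trap visible in code: with a false premise, the implication is true no matter what the conclusion says. A minimal sketch:

```python
# Material implication: "if P then Q" is equivalent to (not P) or Q.
def implies(p, q):
    return (not p) or q

sun_rises_in_west = False  # the premise can never hold

# With a false premise, both conclusions come out "true" -- vacuously.
print(implies(sun_rises_in_west, True))   # True
print(implies(sun_rises_in_west, False))  # True
```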


What Actually Happened

The engine effectively asked:

"In a state where typ == expected, can an error occur?
Given errors only happen when typ != expected?"

The solver's answer is clear:

"No such state exists."

Then the engine interprets:

"No error possible → safe"

That jump is the problem.


Verified ≠ Correct

Key point:

"Verified" doesn't mean correct.

Sometimes it means:

Your model is broken.

This case was exactly that:

  • Actual logic wasn't captured
  • Contradictory constraints were generated
  • Solver just short-circuited

It didn't verify the code.
The logic just crashed.


Why This Matters Now

We're in this flow:

  • AI writes code
  • AI writes tests
  • AI explains "correctness"

Problem:

None of that guarantees truth

Because:

AI optimizes for plausibility, not consistency

LLMs don't "resolve" contradictions.
They continue them.


The Real Danger: False Confidence

This is worse than a failing test.

Test fails → problem visible
Verified (fake) → problem hidden

In security or finance logic? Catastrophic.


What's Needed

Formal verification systems need at minimum:

1. Sanity Check (Premise Validation)

Before proving anything:

Are these assumptions even possible together?

2. Vacuity Detection

Catch cases that pass because "nothing could happen."

3. Mutation Testing

Break the code intentionally:

If it still verifies, your verification is broken.
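The first two checks can be sketched without a real solver. Here brute force over the example's two booleans stands in for an SMT call; the constraint names are hypothetical, not Axiom's API:

```python
# Sketch: sanity check plus vacuity detection, with brute force
# in place of an SMT solver. Variables: types_match (m), throws (t).
from itertools import product

def sat(constraints):
    return any(all(c(m, t) for c in constraints)
               for m, t in product([True, False], repeat=2))

error_iff_mismatch = lambda m, t: t == (not m)   # the error definition
types_assumed_equal = lambda m, t: m             # the engine's premise
assumptions = [error_iff_mismatch, types_assumed_equal]

# 1. Sanity check: the assumptions are mutually satisfiable.
print(sat(assumptions))                          # True

# 2. Vacuity detection: can the error's trigger (typ != expected)
#    ever fire under these assumptions? If not, "Verified" is vacuous.
print(sat(assumptions + [lambda m, t: not m]))   # False -> vacuous pass
```

The first check passes here; the second is the tell. The state that triggers the error is unreachable by construction, so any "no error occurs" verdict is vacuous.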


Axiom

That's why I'm building Axiom.

Not to replace AI.

To sit on top as a "verification layer."

Role is simple:

  • This holds
  • This doesn't
  • Or: this question is invalid from the start


Final Thought

If your system says "Verified,"

Ask this first:

Is the question itself valid?

If you can't answer that,

You don't have verification.

You have a well-crafted illusion.
