I built a formal verifier for EVM contracts with no SMT solver

#ai #blockchain #webdev #computerscience

Most smart-contract bugs aren't exotic. They're an > that should have been a >=, an
invariant that quietly stops holding after one mutator, an off-by-one in accounting. Tests
catch these only if you happen to test the input that breaks them. I wanted something that
either proves a property holds for every input, or hands me a concrete
counterexample — and I wanted to build the proving core myself, with no Z3, no JavaSMT,
no external solver.

This is how DhrLang's contract prove works, and where I deliberately drew the line.

The contract you write

You annotate a function with what should always be true, and run the prover:

@contract
class Adder {
    @checked
    @ensures(result >= a)
    num add(num a, num b) {
        return a + b;
    }
}

dhrlang contract prove --bound=8 adder.dhr

Every obligation comes back as exactly one of three verdicts:

PROVED — true for all inputs (here it is: num parameters are non-negative 256-bit values, so a + b >= a).
REFUTED — false, with a reproducing counterexample.
UNKNOWN — the prover can't decide, and says so instead of guessing.

That third verdict is the whole ethic of the thing: it is sound. It never says PROVED
without a real proof, and never says REFUTED without a confirmed concrete failure.

The spec language

Three annotations, plus an arithmetic-mode marker:

@requires(...) — a precondition the caller must satisfy.
@ensures(...) — a postcondition over parameters and result.
@invariant(...) — a contract-wide property every mutator must preserve.

A guarded subtraction proves exactly:

@contract
class Sub {
    @requires(b <= a)
    @checked
    @ensures(result == a - b)
    num sub(num a, num b) { return a - b; }
}

And a contract invariant, checked against a state-mutating method (kaam is DhrLang's
keyword for a void method):

@invariant(total == a + b)
@contract
class Sum {
    @storage num a;
    @storage num b;
    @storage num total;
    @checked
    kaam set(num x, num y) {
        a = x; b = y; total = x + y;
    }
}

The prover symbolically runs set from the contract's genesis state and proves total == a + b still holds afterward — for all x, y.

How it works, part 1: symbolic execution

Each function is analyzed from the canonical post-construction state (storage scalars 0).
Parameters become fresh non-negative 256-bit symbols. Then the body is interpreted
symbolically:

straight-line code, local declarations/assignments, and scalar storage writes update a symbolic environment;
if / else forks the path, adding the branch condition (or its negation) as an assumption;
require / assert / revert either assume a condition onward or prune the path entirely.

At each non-reverting terminal path, the prover forms one Hoare-style obligation:

(requires  ∧  path-conditions  ∧  every atom ≥ 0)   ⇒   goal

How it works, part 2: keeping it linear (the fun part)

Terms are represented as an immutable linear form, constant + Σ coeffᵢ · atomᵢ. The
decision procedure only needs operations that keep a form linear: add, subtract, negate,
and multiply/divide by a constant.

So what about genuinely non-linear things — a * b, division, remainder, shifts, a mapping
read? Those get folded into a single opaque atom: an interned identifier standing for
"some non-negative 256-bit value I won't reason about further." The trick that makes this
pay off is interning: two structurally identical opaque sub-terms get the same key.
That means a goal like result == a * b is provable by pure linear reasoning — a * b
appears on both sides as the same opaque atom and cancels — even though the prover never
multiplies two unknowns. It reasons about the shape, not the value.

How it works, part 3: deciding the obligation

To decide hypotheses ⇒ goal, I check whether hypotheses ∧ ¬goal is unsatisfiable:

Negate the goal. A strict inequality is tightened using integrality — G > 0 becomes G ≥ 1 (i.e. G - 1 ≥ 0) — and an equality goal G == 0 splits into two independent refutations, G ≥ 1 and G ≤ -1, both of which must be impossible.
Build the system. Every hypothesis, the negated goal, and one atom ≥ 0 row per atom (the uint256 domain constraint) become rows of the form Σ coeffᵢ·atomᵢ + constant ≥ 0.
Eliminate. Project atoms out one at a time with Fourier–Motzkin — pairing each positive-coefficient row with each negative one (scaling by positive integers only, so no division). If the system ever collapses to a bare constant < 0, it's contradictory: the negation is impossible, so the spec is PROVED.

Why this is sound: the elimination decides satisfiability over the rationals, and a
system with no rational solution has no integer solution either, so rational-UNSAT ⇒
integer-UNSAT. The prover therefore can't wrongly declare PROVED — at worst it fails to
prove a true fact, which just becomes UNKNOWN.

Why checked arithmetic is load-bearing

The linear theory models +, -, * as mathematical integers. That's only sound if
overflow can't silently wrap. Under @checked, an overflowing operation reverts, so on
every non-reverting path the values equal their exact mathematical results and the proof
transfers to the EVM. Under @unchecked, proving is switched off and the obligation is
UNKNOWN:

@unchecked
@ensures(result == a + b)
num addU(num a, num b) { return a + b; }   // → UNKNOWN, by design

This is exactly why DhrLang v4.0.0 made checked arithmetic the default (matching
Solidity 0.8+): the safe semantics are also the provable ones.

When it can't prove: refutation with zero false positives

Change the spec to something false:

@checked
@ensures(result > a)     // false when b == 0
num add(num a, num b) { return a + b; }

The prover does a bounded concrete search and finds a=0, b=0. Crucially, it does not
report that on linear reasoning alone — it hands the candidate to DhrLang's existing
concrete spec-fuzzer (the L3 engine) and only reports REFUTED once that engine actually
executes the function and reproduces the violation. The fuzzer is the ground-truth oracle,
so there are zero false positives. A broken invariant (total = x + y + 1) is caught
the same way and reported as invariant violated.

What it deliberately does NOT do

This is the part I'd put first in any interview, because the honesty is the point:

Loops aren't modeled — a while makes the postcondition UNKNOWN.
Mapping aliasing, external calls, and nonlinear combinations the linear theory can't see all collapse to UNKNOWN.
It is experimental (L2b), not production, and it is not a Certora/Halmos competitor. Those are mature, full SMT-backed tools that verify real Solidity. This is a from-scratch core I built to understand symbolic execution and decision procedures from the metal up.

What it is: a small, sound verifier with no external solver, that turns a class of
real contract properties into a yes / counterexample / honest-shrug answer.

What I took away

Building this taught me the EVM execution model, design-by-contract, symbolic execution,
and just enough proof theory to know why the rationals make Fourier–Motzkin sound here. The
hardest and most satisfying decision was the discipline of UNKNOWN — a verifier that lies is
worse than no verifier, and most of the engineering went into not overclaiming.

Code: github.com/dhruv-15-03/DhrLang (src/main/java/dhrlang/proving). More of my work:
dhruvrastogi.me. I'm open to blockchain-infra / backend/tooling roles — happy to walk
through any of this.