Nithin Bharadwaj
Rust Formal Verification: Building Mathematical Proofs for Memory-Safe, Bug-Free Code Beyond Testing

As a best-selling author, I invite you to explore my books on Amazon. Don't forget to follow me on Medium and show your support. Thank you! Your support means the world!

Let me tell you about a different way to think about the code we write. Most of us build software like we're crossing a river by jumping from stone to stone. We test each step, make sure it holds our weight, and move forward. It works, but we never know for certain if we missed a slippery stone that will fail us later. Formal verification is like building a bridge. You use mathematics to prove, before anyone steps on it, that every single part will hold under every possible condition. In Rust, a language built on the promise of safety, this isn't just an academic idea—it's a powerful extension of the compiler's own guarantees.

Think of it this way. Your unit tests are like a checklist. "Did it work for -1? Check. For 0? Check. For 42,873? Check." But you can't list every number. Fuzzing is like a friend who randomly shakes your program, trying to make it drop something. It's great at finding problems, but it can't tell you when it's finished or if everything is finally secure. Formal methods ask a more profound question: "Can I write down the rules of what correct means, and then prove that these rules hold true for every possible input, not just the ones I thought to try?"
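To make the contrast concrete, here is a small, runnable illustration in plain Rust (no verifier involved): the property "abs is defined for every input" survives any checklist of examples you are likely to write, yet fails for exactly one value.

```rust
// The "checklist" approach passes for every example we thought to try,
// but the property "abs(x) never overflows" has exactly one counterexample:
// i8::MIN, because +128 does not fit in an i8.
fn main() {
    // Example-based checks: all pass.
    assert_eq!((-1i8).abs(), 1);
    assert_eq!(0i8.abs(), 0);
    assert_eq!(42i8.abs(), 42);

    // Checking every possible input finds the one failing case.
    let counterexamples: Vec<i8> = (i8::MIN..=i8::MAX)
        .filter(|x| x.checked_abs().is_none())
        .collect();
    assert_eq!(counterexamples, vec![i8::MIN]);
}
```

For a 8-bit domain we can brute-force the property; formal methods give the same "for all inputs" conclusion when brute force is impossible.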

This moves us from checking examples to proving properties. The difference is everything. In safety-critical systems—the code in an airplane's flight computer, the logic controlling a radiation therapy machine, or the cryptographic library securing a bank—"probably safe" isn't good enough. We need certainty. Rust gives us a head start with its type system and ownership rules, which eliminate whole classes of bugs at compile time. Formal verification tools build on that foundation, allowing us to prove things the compiler can't see on its own.

Let me show you what this looks like in practice. We'll start with a simple function. In normal Rust, we might write a safe addition function like this:

fn safe_add(x: u32, y: u32) -> Option<u32> {
    x.checked_add(y)
}

This is good. It uses checked_add to avoid overflow, returning None if the sum is too large. But what if our specification says this function should only be called when we know for sure the addition won't overflow? And what if we want to guarantee that the result is indeed the mathematical sum? We can document this, but documentation can become outdated. With a tool like Creusot, we can encode this requirement directly into the code, and the tool will try to prove it.

use creusot_contracts::*;

#[requires(x@ + y@ <= u32::MAX@)]
#[ensures(result@ == x@ + y@)]
fn verified_add(x: u32, y: u32) -> u32 {
    x + y
}

Here, #[requires] is a precondition. It states a fact that must be true when the function is called: the sum of the logical values x@ and y@ must not exceed the maximum a u32 can hold. The #[ensures] clause is a postcondition. It guarantees what will be true after the function runs: the result will be equal to that same logical sum. The @ symbol is Creusot's way of talking about the mathematical value of the variable, not the bits in memory. The tool takes this annotated code, generates a set of logical proof obligations, and uses a backend theorem prover to check them. If the prover succeeds, we have a mathematical proof that for any inputs satisfying the precondition, the function will satisfy the postcondition and will not panic from overflow.

This is a paradigm shift. The annotation isn't just a comment; it's part of the code's contract, and it's mechanically verified. It becomes executable, verifiable documentation.
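One way to internalize what the contract says is to imagine it as runtime checks. The sketch below (plain Rust, no verifier) uses debug_assert! to play the role of #[requires] and #[ensures]; the crucial difference is that Creusot discharges these checks statically, for every input satisfying the precondition, rather than only for the inputs a test happens to pass.

```rust
// A runtime sketch of the same contract. The widening casts to u64 let us
// talk about the true mathematical sum, the way `@` does in the spec.
fn verified_add(x: u32, y: u32) -> u32 {
    // precondition: the mathematical sum fits in a u32
    debug_assert!(x as u64 + y as u64 <= u32::MAX as u64);
    let result = x + y;
    // postcondition: the result equals the mathematical sum
    debug_assert!(result as u64 == x as u64 + y as u64);
    result
}

fn main() {
    assert_eq!(verified_add(40, 2), 42);
    assert_eq!(verified_add(0, u32::MAX), u32::MAX);
}
```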

Moving beyond arithmetic, we can prove properties about data structures. Let's consider a function that finds the maximum value in a slice. We want to prove two things: first, that if the slice isn't empty, the returned value is actually present in the slice. Second, that it is greater than or equal to every other element.

use creusot_contracts::*;

#[ensures(result == None ==> slice@.len() == 0)]
#[ensures(forall<m: u32> result == Some(m) ==>
    (exists<j: Int> 0 <= j && j < slice@.len() && slice@[j] == m) &&
    (forall<i: Int> 0 <= i && i < slice@.len() ==> slice@[i]@ <= m@))]
fn provable_max(slice: &[u32]) -> Option<u32> {
    slice.iter().max().copied()
}

The postconditions here are more complex. The first one says: if the result is None, then the slice must be empty. The second uses a logical quantifier (forall): if the result is Some(max), then max must be in the slice, and for every index i within bounds, the element at that index is logically less than or equal to max. Writing this specification forces you to think precisely about what "maximum" means. The verification process might catch a subtle bug, like an off-by-one error in a hand-written loop, that a dozen unit tests could miss.
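That off-by-one remark is worth making concrete. Here is a hand-written loop version of the same function in plain, runnable Rust; the comments state the informal invariant that a verifier would ask you to write down explicitly.

```rust
// A hand-written loop computing the maximum. The comments carry the
// invariant a verifier would demand: `best` occurs somewhere in the
// slice, and `best` >= every element scanned so far.
fn max_by_loop(slice: &[u32]) -> Option<u32> {
    let mut iter = slice.iter();
    let mut best = *iter.next()?; // returns None for the empty slice
    for &x in iter {
        // invariant: best is an element of the prefix already scanned,
        // and best >= every element of that prefix
        if x > best {
            best = x;
        }
    }
    Some(best)
}

fn main() {
    assert_eq!(max_by_loop(&[]), None);
    assert_eq!(max_by_loop(&[3, 1, 4, 1, 5]), Some(5));
}
```

A classic off-by-one, such as iterating over `&slice[1..]` after forgetting to seed `best` from element 0, would violate exactly the "present in the slice" clause of the specification.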

You might be thinking this looks heavy. For everyday code, it can be. The sweet spot for these tools is in the core, complex, or critical parts of your system—the pieces where a bug would be catastrophic, or the logic is so intricate you lie awake thinking about it. I don't verify my simple web server route handler this way. But I would absolutely use it on the custom allocator, the state machine for a network protocol, or the core algorithm at the heart of my application.

The ecosystem has different tools for different needs. Creusot is great for deep, deductive verification. But there's also Prusti, which is more accessible and integrates like a linter. Prusti often requires less annotation because it can infer many simple properties. Here's a loop invariant verified by Prusti, proving that a function correctly computes the sum of a vector:

use prusti_contracts::*;

// Prusti specifications can't call arbitrary iterator methods, so a
// #[pure] helper gives the spec a mathematical definition of
// "sum of the first n elements" that the verifier can reason about.
#[pure]
#[requires(n <= vec.len())]
fn sum_up_to(vec: &Vec<i32>, n: usize) -> i32 {
    if n == 0 { 0 } else { sum_up_to(vec, n - 1) + vec[n - 1] }
}

#[ensures(result == sum_up_to(vec, vec.len()))]
fn sum_vec(vec: &Vec<i32>) -> i32 {
    let mut sum = 0;
    let mut index = 0;
    while index < vec.len() {
        body_invariant!(index < vec.len());
        body_invariant!(sum == sum_up_to(vec, index));
        sum += vec[index];
        index += 1;
    }
    sum
}

The body_invariant! macro is key. An invariant is a fact that holds at the start of every loop iteration. Here, we state two: first, index is always within bounds, which makes the indexing memory safe. Second, the sum variable always holds the sum of the elements processed so far. Prusti uses these invariants to prove the final #[ensures] clause. Writing these invariants is the hardest part, but it's also where you gain the deepest understanding of your own algorithm.
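The invariant argument has a direct runtime analogue. In this plain-Rust sketch (my own variant, not Prusti output), the invariant is asserted on every iteration, and at loop exit, when index equals vec.len(), it turns into the postcondition:

```rust
// Asserting the invariant each iteration shows the shape of the proof:
// invariant + loop exit condition together imply the postcondition.
fn sum_vec_checked(vec: &[i32]) -> i32 {
    let mut sum: i32 = 0;
    let mut index = 0;
    while index < vec.len() {
        // invariant: sum equals the sum of vec[..index]
        debug_assert_eq!(sum, vec[..index].iter().sum::<i32>());
        sum += vec[index];
        index += 1;
    }
    // here index == vec.len(), so the invariant *is* the postcondition
    debug_assert_eq!(sum, vec.iter().sum::<i32>());
    sum
}

fn main() {
    assert_eq!(sum_vec_checked(&[1, 2, 3, 4]), 10);
    assert_eq!(sum_vec_checked(&[]), 0);
}
```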

Then there's Kani, from Amazon Web Services. Kani is a bit-precise model checker built on the CBMC engine. Instead of discharging proof obligations to a theorem prover, it symbolically explores every execution path your code can take over a bounded input space (such as all possible values of a u8). It's exceptionally good at finding subtle bugs in unsafe code, panics you forgot about, or unwraps on None values.

Imagine a function that is only valid for a certain range of inputs, guarded by a check:

fn process(value: u8) -> u32 {
    if value > 100 {
        panic!("Input out of valid range");
    }
    // ... complex computation ...
    (value as u32) * 10
}

A unit test might not hit the panic. Fuzzing might, eventually. Kani can settle the question definitively and hand you a concrete failing input (such as 101u8):

#[kani::proof]
fn verify_no_panic() {
    let value: u8 = kani::any();
    process(value);
}

The #[kani::proof] annotation tells Kani to check all possible u8 values for value. Rather than running the function 256 separate times, it symbolically covers every input in a single analysis and reports whether the panic! is ever reachable. For this example, it finds the failing input immediately.
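In a real harness you would then either fix the function or tell Kani the panic is intentional by constraining inputs with kani::assume(value <= 100). Since running Kani needs its own toolchain, here is a plain-Rust sketch that mimics the exhaustive check for this tiny input space, using a hypothetical non-panicking variant of process:

```rust
// `process_total` is a hypothetical variant that rejects out-of-range
// inputs instead of panicking, making it total over all u8 values.
fn process_total(value: u8) -> Option<u32> {
    if value > 100 {
        return None; // reject, don't panic
    }
    Some(value as u32 * 10)
}

fn main() {
    // u8 has only 256 values, so we can literally check them all:
    // no input can panic, and in-range inputs map to value * 10.
    for v in 0..=u8::MAX {
        match process_total(v) {
            Some(out) => {
                assert!(v <= 100);
                assert_eq!(out, v as u32 * 10);
            }
            None => assert!(v > 100),
        }
    }
}
```

Kani reaches the same "checked every input" conclusion symbolically, which is what lets it scale to input spaces far too large to enumerate.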

Working with these tools changes you as a programmer. It feels less like debugging and more like constructing an argument. You start by stating your assumptions clearly (the preconditions). You then define your goal with precision (the postconditions). The tool becomes a relentless critic, pointing out every gap in your logic, every missing edge case. When it finally accepts your proof, the feeling of confidence is profound. That loop is correct. That data structure invariant will hold. This isn't a hope based on test coverage; it's a conclusion based on logic.

The initial learning curve is real. You're learning a new way to think and often a bit of new syntax for specifications. The feedback loop is slower than cargo check. You'll spend time guiding the prover, breaking down a complex proof into smaller, provable steps. But the payoff is in the reduction of mental burden. For that critical kernel of code, you can move on, knowing it's settled.

This methodology is being used in the real world today. It's verifying blockchain virtual machines to prevent billion-dollar exploits. It's checking the state machines in secure communication protocols. It's in aviation and embedded systems. These tools bring the rigor of high-assurance engineering to the expressive, modern world of Rust.

In the end, it comes back to the bridge. Testing tells us the bridge held for the 50 trucks we drove across it yesterday. Formal verification gives us the engineering schematics and stress calculations that prove it will hold for every truck that ever could cross it, in every storm that might come. In Rust, we're not just building software; we're building infrastructure. For the most important parts of that infrastructure, proving correctness beyond testing isn't a luxury—it's the final, definitive step in delivering on Rust's fundamental promise of safety and reliability.

📘 Check out my latest ebook for free on my channel!

Be sure to like, share, comment, and subscribe to the channel!


101 Books

101 Books is an AI-driven publishing company co-founded by author Aarav Joshi. By leveraging advanced AI technology, we keep our publishing costs incredibly low—some books are priced as low as $4—making quality knowledge accessible to everyone.

Check out our book Golang Clean Code available on Amazon.

Stay tuned for updates and exciting news. When shopping for books, search for Aarav Joshi to find more of our titles. Use the provided link to enjoy special discounts!

Our Creations

Be sure to check out our creations:

Investor Central | Investor Central Spanish | Investor Central German | Smart Living | Epochs & Echoes | Puzzling Mysteries | Hindutva | Elite Dev | Java Elite Dev | Golang Elite Dev | Python Elite Dev | JS Elite Dev | JS Schools


We are on Medium

Tech Koala Insights | Epochs & Echoes World | Investor Central Medium | Puzzling Mysteries Medium | Science & Epochs Medium | Modern Hindutva
