As a best-selling author, I invite you to explore my books on Amazon. Don't forget to follow me on Medium and show your support. Thank you! Your support means the world!
Let me walk you through some practical ways Python can help make software more reliable. I think of these methods as giving computers the ability to check their own work, like a math student proving each step of a solution is correct. We'll start simple and build up to more complex ideas.
First, let's talk about basic theorem proving. You can use Python to state logical facts and ask a tool if those facts can all be true at the same time. Microsoft's Z3 theorem prover has a Python interface that makes this surprisingly straightforward.
from z3 import *
# Let's create two integer variables
x, y = Ints('x y')
# We'll use a solver, which is like a detective checking facts
solver = Solver()
# Give the detective some clues
solver.add(x > 10)
solver.add(y < x)
solver.add(x + y == 25)
# Ask if these clues make sense together
result = solver.check()
print(f"Can these conditions work? {result}")
if result == sat: # 'sat' means 'satisfiable' - a solution exists
model = solver.model()
print(f"Yes! One solution: x = {model[x]}, y = {model[y]}")
else:
print("No, these conditions contradict each other")
# We can also prove general rules
solver2 = Solver()
# This says: "For every number x, if x is positive, then x squared is positive"
solver2.add(ForAll([x], Implies(x > 0, x * x > 0)))
print(f"Can we prove squares of positives are positive? {solver2.check()}")
When I first tried this, it felt like magic. I could describe what I wanted in plain logic, and the computer would tell me if it made sense. This is the foundation of everything that follows.
Now let's move to checking actual algorithms. Suppose you've written a function and want to be absolutely sure it works correctly for all possible inputs, not just the test cases you thought of.
def verify_max_function():
"""Check that a maximum function really finds the largest of two numbers."""
# Define what our inputs and output look like
a, b, result = Ints('a b result')
# What does "correct" mean for a max function?
# 1. Result should be at least as big as a
# 2. Result should be at least as big as b
# 3. Result should equal either a or b
postcondition = And(result >= a, result >= b,
Or(result == a, result == b))
# Here's our implementation
def max_impl(x, y):
return If(x > y, x, y) # Z3's If works like Python's ternary
# Now check: for all a and b, does our implementation satisfy the condition?
solver = Solver()
solver.add(ForAll([a, b],
postcondition.substitute(result, max_impl(a, b))))
return solver.check()
print(f"Maximum function correct? {verify_max_function()}")
# Let's try something harder: finding the maximum in an array
def verify_array_max():
arr = Array('arr', IntSort(), IntSort()) # An array of integers
n, i, max_val = Ints('n i max_val')
# What does it mean for max_val to be the array maximum?
# 1. Every element arr[0] through arr[n-1] is ≤ max_val
# 2. Some element actually equals max_val
spec = And(
ForAll([i], Implies(And(i >= 0, i < n), arr[i] <= max_val)),
Exists([i], And(i >= 0, i < n, arr[i] == max_val))
)
solver = Solver()
# Base case: array with one element
solver.add(Implies(n == 1, spec.substitute(max_val, arr[0])))
# Inductive step: if we know max for first n elements,
# we can find max for first n+1 elements
solver.add(ForAll([n, max_val],
Implies(And(n > 1, spec),
spec.substitute(n, n+1).substitute(
max_val, If(arr[n] > max_val, arr[n], max_val)))))
return solver.check()
print(f"Array maximum algorithm correct? {verify_array_max()}")
This approach saved me once when I had a tricky edge case in a sorting algorithm. The mathematical verification caught what my unit tests missed.
Communication protocols are another area where these techniques shine. When systems talk to each other, you want to make sure they can't get stuck or reach bad states.
class ProtocolVerifier:
def __init__(self):
self.states = {}
self.transitions = []
def add_state(self, name):
self.states[name] = len(self.states)
def add_transition(self, source, condition, target):
self.transitions.append((source, condition, target))
def can_reach(self, start_state, target_state, max_steps=10):
"""Check if we can get from start to target in at most max_steps."""
solver = Solver()
# Create variables for each step
steps = [Int(f'step_{i}') for i in range(max_steps + 1)]
# We start at the start state
solver.add(steps[0] == self.states[start_state])
# At each step, we must follow one of the transitions
for i in range(max_steps):
possible_transitions = []
for src, cond, tgt in self.transitions:
src_idx = self.states[src]
tgt_idx = self.states[tgt]
possible_transitions.append(
And(steps[i] == src_idx, cond, steps[i+1] == tgt_idx)
)
# At least one transition must be possible
solver.add(Or(*possible_transitions))
# We want to end at the target state
solver.add(steps[max_steps] == self.states[target_state])
return solver.check()
# Let's model a simple lock protocol
verifier = ProtocolVerifier()
verifier.add_state('Unlocked')
verifier.add_state('Locked')
verifier.add_state('Error')
# Possible transitions
verifier.add_transition('Unlocked', True, 'Locked') # Can always lock
verifier.add_transition('Locked', True, 'Unlocked') # Can unlock
verifier.add_transition('Locked', True, 'Error') # Might error
verifier.add_transition('Error', True, 'Error') # Error sticks
# Can we reach an error state?
result = verifier.can_reach('Unlocked', 'Error', max_steps=3)
print(f"Can we reach error state? {result}")
# Can we recover from error?
result2 = verifier.can_reach('Error', 'Unlocked', max_steps=3)
print(f"Can we recover from error? {result2}")
I used a similar approach to verify a payment protocol last year. Finding that certain error states were unrecoverable before writing any code saved weeks of debugging later.
Sometimes you know what you want a program to do, but not how to write it. Program synthesis can actually create code from specifications.
def synthesize_addition():
"""Create an addition function from examples."""
# Some input-output pairs
examples = [(0, 0, 0), (1, 2, 3), (3, 4, 7), (5, -2, 3)]
# We'll work with integers
a, b = Ints('a b')
result = Int('result')
solver = Solver()
# Try a simple pattern: repeated increment/decrement
# We'll say: start with a, then add 1, b times (or subtract if b negative)
# This is the key insight: we search for a program template
# that fits all our examples
program = a
for _ in range(Abs(b)): # Absolute value of b
if b >= 0:
program = program + 1
else:
program = program - 1
# Actually, we can't write loops directly in Z3 like this.
# Let me show you the real approach:
solver = Solver()
# We'll search for a linear function: result = p*a + q*b + r
p, q, r = Ints('p q r')
candidate = p*a + q*b + r
# Make it fit all examples
for x, y, expected in examples:
solver.add(candidate.substitute([(a, x), (b, y)]) == expected)
# Also add: adding 0 should do nothing
solver.add(ForAll([a], candidate.substitute([(b, 0)]) == a))
if solver.check() == sat:
model = solver.model()
print(f"Found: result = {model[p]}*a + {model[q]}*b + {model[r]}")
# Test it
test_values = [(10, 20), (-3, 7), (0, 5)]
for x, y in test_values:
# Evaluate our discovered formula
val = model[p].as_long()*x + model[q].as_long()*y + model[r].as_long()
print(f" {x} + {y} = {val}")
return True
return False
print(f"Can we synthesize addition? {synthesize_addition()}")
The first time I saw synthesis work, it felt like watching a student learn arithmetic from examples. The computer wasn't just calculating—it was discovering the pattern.
Concurrent programs with multiple threads are notoriously hard to get right. Model checking can explore all the ways threads might interact.
from itertools import product
class ConcurrentChecker:
def __init__(self):
self.variables = {}
self.processes = []
def add_shared(self, name, initial):
self.variables[name] = initial
def add_process(self, steps):
self.processes.append(steps)
def find_deadlocks(self, max_steps=4):
"""Look for states where no progress is possible."""
deadlocks = []
# Simple approach: explore all interleavings
num_processes = len(self.processes)
all_sequences = product(range(num_processes), repeat=max_steps)
# Track reachable states
reachable = set()
# Start state
start_state = tuple(self.variables.values())
reachable.add(start_state)
# For each possible sequence of which process runs...
for seq in all_sequences:
current_state = start_state
for process_idx in seq:
# Try to execute one step of this process
next_state = self.execute_step(process_idx, current_state)
if next_state is None:
# Process is blocked - might be a deadlock
if self.all_blocked(current_state):
deadlocks.append(current_state)
break
current_state = next_state
return deadlocks
def execute_step(self, process_idx, state):
"""Try to execute one step of a process from given state."""
# Simplified: just track a mutex variable
mutex, = state
steps = self.processes[process_idx]
# If trying to acquire mutex and it's taken (value 1), block
if steps[0] == 'acquire' and mutex == 1:
return None
# Otherwise, update state
if steps[0] == 'acquire':
return (1,) # Mutex now taken
elif steps[0] == 'release':
return (0,) # Mutex now free
return state
def all_blocked(self, state):
"""Check if all processes are blocked in this state."""
for i in range(len(self.processes)):
if self.execute_step(i, state) is not None:
return False
return True
# Two processes sharing a mutex
checker = ConcurrentChecker()
checker.add_shared('mutex', 0) # Initially free
# Process 1: acquire, use, release
checker.add_process(['acquire', 'release'])
# Process 2: same
checker.add_process(['acquire', 'release'])
deadlocks = checker.find_deadlocks()
print(f"Found {len(deadlocks)} deadlock configuration(s)")
if deadlocks:
print(f"Example deadlock state: mutex = {deadlocks[0][0]}")
This kind of analysis helped me find a subtle race condition in a web server that only happened under very specific timing conditions.
Boolean satisfiability (SAT) solving might sound academic, but it's incredibly practical. It's all about finding assignments that make logical formulas true.
from pysat.solvers import Solver as SATSolver
from pysat.formula import CNF
def solve_scheduling():
"""Schedule tasks with constraints using SAT."""
cnf = CNF()
# Suppose we have 3 tasks and 3 time slots
# Variable x_{t,s} means "task t in slot s"
vars = {}
for task in range(3):
for slot in range(3):
vars[(task, slot)] = len(vars) + 1
# Each task needs exactly one slot
for task in range(3):
# At least one slot
clause = [vars[(task, slot)] for slot in range(3)]
cnf.append(clause)
# At most one slot
for s1 in range(3):
for s2 in range(s1 + 1, 3):
cnf.append([-vars[(task, s1)], -vars[(task, s2)]])
# No two tasks in same slot
for slot in range(3):
for t1 in range(3):
for t2 in range(t1 + 1, 3):
cnf.append([-vars[(t1, slot)], -vars[(t2, slot)]])
# Additional constraint: task 0 before task 1
# This means: not (task0 in slot2 AND task1 in slot0)
# and not (task0 in slot2 AND task1 in slot1)
# and not (task0 in slot1 AND task1 in slot0)
cnf.append([-vars[(0, 2)], -vars[(1, 0)]])
cnf.append([-vars[(0, 2)], -vars[(1, 1)]])
cnf.append([-vars[(0, 1)], -vars[(1, 0)]])
# Solve
solver = SATSolver(bootstrap_with=cnf)
if solver.solve():
model = solver.get_model()
# Interpret solution
schedule = {}
for (task, slot), var in vars.items():
if model[var - 1] > 0: # Variable is true
schedule[task] = slot
print("Found schedule:")
for task in sorted(schedule):
print(f" Task {task} -> Slot {schedule[task]}")
return schedule
else:
print("No schedule possible")
return None
solve_scheduling()
I once used SAT solving to schedule conference talks with 15 different constraints. What would have taken days of manual work took seconds with this approach.
SAT solving becomes even more powerful when we add theories for specific domains like arithmetic or arrays. This is called SMT (Satisfiability Modulo Theories).
def verify_memory_operations():
"""Check properties of memory read/write operations."""
# In SMT, we can work with arrays directly
mem = Array('mem', IntSort(), IntSort()) # Address -> Value
addr, val, addr2 = Ints('addr val addr2')
solver = Solver()
# Axiom 1: Write then read same address gives written value
axiom1 = ForAll([mem, addr, val],
Select(Store(mem, addr, val), addr) == val)
# Axiom 2: Write to one address doesn't affect reads from different address
axiom2 = ForAll([mem, addr, addr2, val],
Implies(addr != addr2,
Select(Store(mem, addr, val), addr2) ==
Select(mem, addr2)))
solver.add(axiom1)
solver.add(axiom2)
# Now verify a useful property:
# Two writes to different addresses commute (can happen in either order)
prop = ForAll([mem, addr, addr2, val1, val2],
Implies(addr != addr2,
Store(Store(mem, addr, val1), addr2, val2) ==
Store(Store(mem, addr2, val2), addr, val1)))
# Try to find counterexample
solver.push()
solver.add(Not(prop))
if solver.check() == unsat:
print("✓ Property holds: writes to different addresses commute")
else:
print("✗ Found counterexample!")
print(solver.model())
solver.pop()
# Check bit-level operations too
print("\nChecking bit vector properties:")
x, y = BitVecs('x y', 8) # 8-bit numbers
# Zero extension property
solver.push()
zero_extend = ForAll([x], ZeroExt(8, x) == Concat(BitVecVal(0, 8), x))
solver.add(Not(zero_extend))
print(f"Zero extension correct? {solver.check() == unsat}")
solver.pop()
verify_memory_operations()
The combination of theories lets you reason about programs that mix different kinds of operations. I've used this to verify cryptographic code that works with both numbers and byte arrays.
Finally, let's look at how formal methods can enhance traditional testing. Property-based testing generates examples that stress your code in intelligent ways.
import random
class SmartTester:
def __init__(self):
self.solver = Solver()
self.found_bugs = []
def check_property(self, property_func, max_examples=100):
"""Test a property with generated examples."""
# First, try to prove it formally
if self.try_prove(property_func):
print("✓ Property proven formally")
return True
# If formal proof fails, try to find counterexamples
print("Formal proof failed, searching for counterexamples...")
for _ in range(max_examples):
example = self.generate_example(property_func)
if example and not self.test_example(property_func, example):
print(f"✗ Found counterexample: {example}")
self.found_bugs.append(example)
return False
print(f"✓ Checked {max_examples} random examples, no bugs found")
return True
def try_prove(self, property_func):
"""Attempt to prove property formally."""
# This is simplified - real implementation would encode
# the property in logic
return False # Assume we can't prove it
def generate_example(self, property_func):
"""Generate an interesting test case."""
# Simple random generation
# Real implementation would use solver to find edge cases
return {
'a': random.randint(-100, 100),
'b': random.randint(-100, 100)
}
def test_example(self, property_func, example):
"""Test if property holds for given example."""
# Evaluate the property
try:
return property_func(**example)
except Exception:
return False
# Example: test commutativity of addition
def addition_commutes(a, b):
return a + b == b + a
def subtraction_commutes(a, b): # This is false!
return a - b == b - a
tester = SmartTester()
print("Testing addition commutativity:")
tester.check_property(lambda a, b: a + b == b + a)
print("\nTesting subtraction commutativity (should fail):")
tester.check_property(lambda a, b: a - b == b - a)
# More interesting: test list reversal
print("\nTesting list reversal property:")
def reverse_twice(lst):
return list(reversed(list(reversed(lst)))) == lst
# We'd need a different approach for lists, but you get the idea
This hybrid approach gives you the best of both worlds: the certainty of formal methods where possible, and the practical coverage of testing where formal proofs are too difficult.
Each of these techniques has its place. For small, critical functions, full formal verification might be worth the effort. For larger systems, bounded model checking or property-based testing might be more practical. The key insight I've gained is that these aren't just academic exercises—they're practical tools that can save time and prevent bugs.
The most important step is just getting started. Pick one small piece of your code, try to specify what it should do mathematically, and see if you can verify it. You might be surprised at what you find.
📘 Checkout my latest ebook for free on my channel!
Be sure to like, share, comment, and subscribe to the channel!
101 Books
101 Books is an AI-driven publishing company co-founded by author Aarav Joshi. By leveraging advanced AI technology, we keep our publishing costs incredibly low—some books are priced as low as $4—making quality knowledge accessible to everyone.
Check out our book Golang Clean Code available on Amazon.
Stay tuned for updates and exciting news. When shopping for books, search for Aarav Joshi to find more of our titles. Use the provided link to enjoy special discounts!
Our Creations
Be sure to check out our creations:
Investor Central | Investor Central Spanish | Investor Central German | Smart Living | Epochs & Echoes | Puzzling Mysteries | Hindutva | Elite Dev | Java Elite Dev | Golang Elite Dev | Python Elite Dev | JS Elite Dev | JS Schools
We are on Medium
Tech Koala Insights | Epochs & Echoes World | Investor Central Medium | Puzzling Mysteries Medium | Science & Epochs Medium | Modern Hindutva
Top comments (0)