We live in an era where AI writes code.
The problem is no longer productivity.
It's trust.
Code gets generated fast.
Tests pass.
But
there's no guarantee that this code is actually correct.
1) What We're Building: Axiom
Axiom is not a mere "automatic proof engine."
It's a system that makes AI-generated code trustworthy.
To achieve this, we combine:
- SMT Solver (Z3)
- Abstraction (CEGAR)
- Summary-based Decomposition (Summary)
- Formal Verification (Lean 4)
into a single orchestration.
2) The Incident: "42% Silence"
When running ax prove, the progress would stall precisely at 42%.
It looked like a simple UI glitch.
But three possibilities existed simultaneously:
- Actual deadlock
- Solver endlessly computing
- Some tasks finished but UI not showing them
In other words,
The system was in a state where you couldn't tell if it was dead or thinking.
3) Kimi's Diagnosis: "It's Actually Stuck"
Investigation revealed an actual bug.
Cause 1 — Sequential Result Waiting
let result = ^task # blocks in spawn order
→ If one hangs, everything stops
Cause 2 — Blocking I/O
readLine() # can wait forever on Z3 pipe
→ Deadlock possible regardless of solver state
4) First Fix: Make the System Non-Blocking
We fixed two things.
✔ Out-of-order processing
if isReady(task):
collect(task)
→ Immediately process completed tasks
✔ Non-blocking I/O
readData() # chunk-based processing
→ Detect solver termination/abnormal state immediately
This removed the "true freeze."
5) But a Bigger Problem Remained
After fixing the bug, the same question lingered.
"What if the solver never finds an answer?"
6) The Technical Debate: "Bypass or Siege?"
Two perspectives collided here.
🏛️ Gemini — Engineering Siege Strategy
"We're not finding answers, we're collecting evidence"
- Simplify problems with CEGAR
- Break them down with Summary
- Finalize with Lean 4
🏛️ Kimi — Limit-Awareness Design
"All these tools fail"
- SMT is undecidable
- CEGAR can also stall in loops
- Lean automation is ultimately a search problem
Conclusion:
There is no perfect solution.
7) Our Conclusion
The key insight from this debate:
❌ "A system that always proves" is impossible
✅ "A system that controls failure" is possible
8) Axiom's Design Principles
Axiom is not a "proof engine" but:
"A system that structures failure"
🏛️ L1/L2 — Stop the failure
- timeout
- abstraction (CEGAR)
→ Block infinite computation
🏛️ L3 — Break down the failure
- decomposition
- summary
→ Whole failure → Partial success
🏛️ L4 — Escalate the failure
- Lean 4
- Human invariant
→ Automation → Collaboration
9) Another Problem Surfaced
The bug was gone, but a new problem appeared.
It still "looked stuck."
This isn't just a UX problem.
Core Issue
In a verification system, "invisible progress" equals trust collapse.
10) Solution: Heartbeat Patch
We changed our approach.
Instead of "exact progress percentage,"
show "evidence of being alive"
Before
[████████░░] 42%
After
[████████░░] 42%
[VERIFIED] expr.nim::add
[VERIFIED] results.nim::isErr
[TIMEOUT] main.nim::getStr
The meaning of this change:
"It's not stuck — it's solving a hard problem."
11) Additional Improvement: Evidence-Based Hints
Z3 says nothing when it times out.
So we worked around it:
- Structural complexity analysis (Topology)
- Lean unsolved goal extraction
Result:
We can't directly say "why it failed,"
but we can show "where the problem is."
12) Lessons Learned
The incident clarified two things.
✔ Nim threadpool limitations
- FlowVar cancellation impossible
- Polling-based design required
✔ Z3's black-box nature
- timeout = unknown
- Root cause tracking impossible
→ Contextual diagnosis required
13) Conclusion
The "42% incident" wasn't just a simple bug.
It was the surfacing of structural limitations that every AI code verification system must face.
We made one choice here.
Abandon perfect proof
Choose controllable failure
One-sentence definition:
In a verification engine, what matters isn't the correct answer,
but how failure manifests and is controlled.
Closing
Axiom no longer freezes at 42%.
And more importantly:
Even when it stops,
it tells you why it stopped and what to do next.
One-line summary:
Axiom isn't an engine for proving code,
it's a system for generating trust.
Top comments (0)