Verbose is a small experimental language I'm building. Its compiler proves properties about your code — like termination — and emits tiny, readable x86-64 machine code: no runtime, no GC, no libc. This post stands on its own (you don't need the rest of the series). What it's about: a compiler written in Verbose itself reached the point where a single rule emits a standalone Linux ELF that runs and prints its answer.
Honesty first: this lives on a working branch — not merged, not tagged. It's verified (green test suite, real in-memory execution, a measured benchmark), but it isn't in a released version, and it covers a sub-language (scalar arithmetic), not all of Verbose. More on that at the end.
Two halves of a compiler
A compiler has a half that reads (text → tokens → tree) and a half that writes (tree → machine code). The reading half is the usual story. The new part is the writing half — and the thing that makes it interesting here is that the code generator is itself a Verbose rule. The Rust compiler underneath (verbosec) only runs it; the machine code is emitted by rules written in the language.
Follow factorial all the way down
@proof termination decreasing : n
rule fact (n : number [0, 20]) -> number {
if n == 0 then 1 else n * fact(n - 1)
}
Note the guardrails that survive all the way to the bytes: @proof termination decreasing : n (the compiler requires proof the recursion ends) and [0, 20] (the declared domain of n). The generator walks this tree and emits, per node:
n == 0 cmp ; sete → boolean in a register
if … then … else … jz over_then ; <then> ; jmp end ; over_then: <else> ; end:
n * fact(n - 1) <compute fact(n-1)> ; imul
fact(n - 1) call <offset> → it calls itself
The jumps carry computed offsets: you only know a jump's distance once both branches are emitted, so the generator runs twice — measure, then write — the way you number a book's pages only after laying it out.
The milestone is that call/ret. When fact calls itself, it's a stack of plates you build up, then unwind while multiplying:
build up unwind (ret)
fact(3) n=3 3 × 2 = 6 ◄── final
└ fact(2) n=2 2 × 1 = 2
└ fact(1) n=1 1 × 1 = 1
└ fact(0) n=0 ─► 1 (base case)
This is the first time a rule written in Verbose emits machine code that references itself.
From a byte blob to a real executable
Bytes in a buffer don't run on their own — you have to wrap them so Linux will launch them. That wrapper is the ELF format, and the rule now generates the whole thing, like a ready-to-ship parcel:
ELF64 header ← "this is a Linux executable"
LOAD program header ← "map this into memory, r+x"
_start ← call main ; rax → itoa ; sys_write ; sys_exit
code for main, fact ← the computation itself
No sections, no symbols, no linked library. A few hundred bytes. The last brick added itoa (int → decimal ASCII) in machine code — without it the binary computed 120 but couldn't show it. Now:
$ ./verbosec examples/factorial.verbose --run lower factorial 5 > a.out
$ chmod +x a.out
$ ./a.out
120
Standalone — no verbosec needed to run the result.
The honest benchmark
The long-standing question — "what does a binary built this way actually buy versus C/Rust/Go?" — finally has a measured answer (Ubuntu 24.04, gcc 13.3, rustc 1.90, go 1.25.4):
trivial (print a constant, exit)
size wall(ms) RSS(kB) syscalls
verbose 512 0.19 352 2
gcc 706 584 0.24 664 17
rustc 3 871 984 0.57 2 112 62
go 2 254 335 1.28 2 132 228
fib(40) = 102334155 (deep recursion — raw compute)
size wall(ms) RSS(kB) syscalls
verbose 635 ~713 352 2
gcc 706 584 ~135 664 17
rustc 3 872 832 ~224 2 216 62
go 2 254 624 ~380 2 232 406
Read it both ways. Wins, by construction: ~1100–1380× smaller than gcc -static, ~6000× vs rustc; two syscalls vs 17–228; RSS down 2–6×. Because there's no runtime — nothing to initialize, no global allocator, no dynamic library. These are audit-defensible: there's almost nothing to audit. Loss, by design: on fib(40), Verbose is ~5× gcc's wall time. The emitted blob has seen no optimizer — no register allocation, no inlining, no constant folding past the AST. That was a decision (a direct, line-by-line auditable emitter, not an LLVM backend), and the compute cost is the price of that legibility — published, not hidden.
What isn't there yet
- It's a sub-language: scalar arithmetic (
+ − × ÷ %, comparisons,and/or,if/else,let, recursive calls). No strings, no I/O, no services, no TLS. - It's branch-only, not merged, not tagged.
- No optimization — deliberately deferred to keep every byte auditable.
The point isn't "Verbose beats gcc." It's "here's exactly what Verbose buys, and what it costs, measured." The read half and the write half of a self-hosting compiler just met one notch closer.
Full walk-through (in French) on arcker.org.
Top comments (0)