DEV Community

Nicolas Mamán
Why My Programming Language Compiles to Readable C (Not LLVM IR)

I've been solo-building a programming language called Aether. It has actors, pattern matching, type inference - the usual wish list for a modern systems language. But the decision that defined the whole project was this: compile to C, not LLVM IR.

Here's why, and what I gained and lost.

The generated code is debuggable

When Aether compiles your program, it produces a .c file you can open and read. Function names map to your source. Structs map to your types. If something crashes, gdb points at code that makes sense.
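To make that mapping concrete, here is a rough sketch of the kind of C a compiler like this might emit. The names `Point` and `manhattan` are hypothetical, standing in for a user-defined type and function; Aether's actual output may be laid out differently.

```c
/* Hypothetical generated output: the struct keeps the name of the
 * source-level type, and its fields keep their source names. */
typedef struct Point {
    int x;
    int y;
} Point;

/* Small helper so the sketch does not depend on any header. */
static int abs_int(int v) {
    return v < 0 ? -v : v;
}

/* The function keeps its source-level name, so a gdb backtrace
 * shows "manhattan" rather than a mangled or numbered symbol. */
int manhattan(Point p) {
    return abs_int(p.x) + abs_int(p.y);
}
```

With output shaped like this, setting a breakpoint on `manhattan` in gdb and printing `p.x` works without any extra debugger support.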

With LLVM IR, you get a wall of SSA instructions. For a solo developer building a language from scratch, readable output was non-negotiable - I needed to see what my compiler was actually doing.

C interop comes free

Since the output is just C, calling into existing C libraries is trivial. Declare the function with the extern keyword and call it like any other function. No FFI bindings, no marshaling. The generated code links against your .h files directly.

This turned out to be one of the biggest practical advantages. The entire C ecosystem is immediately available.

The tradeoff: no LLVM optimization passes

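This is the real cost. Aether has no optimization pipeline of its own to tune: there is no auto-vectorization or loop unrolling beyond what gcc or clang do on their own when compiling the generated C, and no polyhedral analysis.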

For Aether's use case - concurrent actor systems where the bottleneck is usually message passing, not arithmetic - this is an acceptable tradeoff. For a language targeting HPC or numerical computing, it would be a dealbreaker.

What the compiler does optimize

Even without LLVM, the Aether compiler handles constant folding, dead code elimination, and tail recursion optimization. It also does something I haven't seen in other compilers: arithmetic series loop collapse - replacing counter-summing loops with O(1) closed-form expressions using the triangular number formula.

Not groundbreaking, but a neat trick that fell out of the constant folding pass.
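As an illustration of the loop collapse, the two functions below compute the same sum. This is a sketch of the transformation, not Aether's actual output; the function names are made up for the example.

```c
#include <stdint.h>

/* Before: a counter-summing loop, O(n) additions. */
uint64_t sum_loop(uint64_t n) {
    uint64_t total = 0;
    for (uint64_t i = 1; i <= n; i++) {
        total += i;
    }
    return total;
}

/* After: the triangular number formula n * (n + 1) / 2, O(1).
 * Exactly one of n and n + 1 is even, so the even factor is halved
 * before multiplying. That keeps the intermediate value no larger
 * than the final result. */
uint64_t sum_closed_form(uint64_t n) {
    if (n % 2 == 0) {
        return (n / 2) * (n + 1);
    }
    return n * ((n + 1) / 2);
}
```

Both functions return 5050 for `n = 100`. The rewrite is only valid when the loop body does nothing except accumulate the counter, which is why it sits naturally next to constant folding.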

The actor runtime

The real performance work is in the runtime: lock-free SPSC ring buffers for message passing, thread-local payload pools (256 pre-allocated buffers per thread), adaptive batch processing (64-1024 messages per cycle), and NUMA-aware actor placement.
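For readers who haven't met the pattern, a lock-free single-producer/single-consumer ring buffer can be sketched in C11 roughly as follows. This is a minimal illustration under my own assumptions, not Aether's runtime code: the `spsc_*` names, the capacity of 8, and the `void *` message type are all chosen for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_CAPACITY 8 /* assumed size; must be a power of two */

typedef struct {
    void *slots[RING_CAPACITY];
    _Atomic size_t head; /* next slot to read; written only by the consumer */
    _Atomic size_t tail; /* next slot to write; written only by the producer */
} spsc_ring;

void spsc_init(spsc_ring *r) {
    /* The index masking below relies on a power-of-two capacity. */
    assert((RING_CAPACITY & (RING_CAPACITY - 1)) == 0);
    atomic_init(&r->head, 0);
    atomic_init(&r->tail, 0);
}

/* Producer side. Returns false when the ring is full. */
bool spsc_push(spsc_ring *r, void *msg) {
    size_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    size_t head = atomic_load_explicit(&r->head, memory_order_acquire);
    if (tail - head == RING_CAPACITY) {
        return false;
    }
    r->slots[tail & (RING_CAPACITY - 1)] = msg;
    /* Release: the slot write becomes visible before the new tail does. */
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return true;
}

/* Consumer side. Returns false when the ring is empty. */
bool spsc_pop(spsc_ring *r, void **out) {
    size_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    size_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);
    if (head == tail) {
        return false;
    }
    *out = r->slots[head & (RING_CAPACITY - 1)];
    /* Release: the slot read completes before the slot is handed back. */
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return true;
}
```

Because each index has exactly one writer, neither side needs a lock or a compare-and-swap; a pair of acquire/release operations per message is enough. A production version would typically also pad `head` and `tail` onto separate cache lines to avoid false sharing.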

This is where Aether lives or dies - and compiling to C actually helps here, because the runtime itself is C and there's zero boundary between language runtime and generated code.
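The thread-local payload pools mentioned above can be pictured along these lines. Only the figure of 256 buffers per thread comes from this post; the 64-byte buffer size, the `payload_*` names, and the free-list layout are assumptions made for the sketch, and it assumes a buffer is released on the same thread that acquired it.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_BUFFERS 256 /* per-thread buffer count from the post */
#define PAYLOAD_BYTES 64 /* assumed buffer size */

typedef struct {
    unsigned char storage[POOL_BUFFERS][PAYLOAD_BYTES];
    void *free_list[POOL_BUFFERS]; /* stack of currently unused buffers */
    size_t free_count;
    int initialized;
} payload_pool;

/* One pool per thread, so acquire and release need no locking. */
static _Thread_local payload_pool tls_pool;

static payload_pool *pool_get(void) {
    if (!tls_pool.initialized) {
        for (size_t i = 0; i < POOL_BUFFERS; i++) {
            tls_pool.free_list[i] = tls_pool.storage[i];
        }
        tls_pool.free_count = POOL_BUFFERS;
        tls_pool.initialized = 1;
    }
    return &tls_pool;
}

/* Returns a pre-allocated buffer, or NULL when the pool is exhausted
 * (a real runtime would presumably fall back to the heap here). */
void *payload_acquire(void) {
    payload_pool *p = pool_get();
    if (p->free_count == 0) {
        return NULL;
    }
    return p->free_list[--p->free_count];
}

/* Returns a buffer to the calling thread's pool. */
void payload_release(void *buf) {
    payload_pool *p = pool_get();
    assert(p->free_count < POOL_BUFFERS);
    p->free_list[p->free_count++] = buf;
}
```

The point of the design is that the hot path of sending a message never touches `malloc`: acquiring a buffer is a decrement and an array read.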

Current state

Aether is at v0.31: 347 commits, 261 tests, a CLI with init/compile/test/run/build commands, a REPL, an LSP server, and VS Code support. It cross-compiles to WebAssembly and embedded ARM, and it has been benchmarked against 11 languages, including C, Go, Rust, Erlang, and Pony.

Still missing closures, a proper package registry, and a source formatter. Building in the open.

If any of this sounds interesting: https://github.com/nicolasmd87/aether
