Compilers used to feel like magic to me. Not the fun kind of magic, but the intimidating kind - like something only computer science wizards could understand. Maybe it was the dragon on the cover of Aho's famous compiler textbook, or maybe it was that I associated compilers with the likes of Ken Thompson and Bjarne Stroustrup.
Today, many amazing tutorials guide you all the way from opening your first text editor to outputting machine code, but I feel like many of them either leave you with a trivial implementation that's hard to extend into a truly interesting language, or are too complex to be your first experience.
Why This Tutorial?
This tutorial is inspired by the excellent LLVM Kaleidoscope tutorial for C++, but with some key differences:
- Python instead of C++: More accessible syntax, faster iteration
- x86-64 assembly output: LLVM isn't the first step - learning to work with registers, calling conventions, and assembly gives you foundational knowledge that makes tools like LLVM much more meaningful
- And most importantly, a code structure that is easy to extend and won't fall apart once you try to take it to the next step
What We're Building
We'll be creating a compiler for YFC (Your First Compiler), a simple functional language with functions, conditionals, loops, and arithmetic. Here's what a factorial function looks like in YFC:
# Computes the factorial of n using a simple recursive definition:
func factorial(n) :
    if n < 2
        1
    else
        n * factorial(n - 1)
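If the syntax looks unfamiliar, here's roughly how I'd read that function in Python, the language we'll be writing the compiler in. This is only a sketch for comparison, not part of the compiler itself, and it assumes that a YFC if/else is an expression whose value becomes the function's result:

```python
def factorial(n: int) -> int:
    """Python rendering of the YFC factorial example, for comparison only."""
    # Assumption: in YFC the if/else is an expression, and the value of the
    # function body is what the function returns, so no explicit `return`
    # appears in the YFC version.
    return 1 if n < 2 else n * factorial(n - 1)


if __name__ == "__main__":
    print(factorial(5))
```

Notice that the YFC version has no return statement: under the assumption above, the branch that runs simply is the result.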
It's minimal but powerful enough to write interesting programs - and more importantly, simple enough that we can focus on understanding how compilers work rather than getting lost in language complexity.
You don't have to finish this entire tutorial to get value from it. You can stop at any stage and walk away with a cool little project to tinker with. The farther you go, the deeper we'll dive into advanced topics like compiler optimizations, register allocation algorithms, and type systems - but don't worry about any of that yet.
Code Repository
All accompanying code for this tutorial series is available on GitHub. The repository includes complete source code for each step, example programs, and additional resources to help you follow along.