DEV Community

mitali

I Thought Compilers Were Scary. So I Built Sauce.

I’ve been writing code for years. I type cargo run or npm start, hit enter, and meaningful things happen. But if you asked me what actually happens between hitting enter and seeing "Hello World," I’d have mumbled something about "machine code" and changed the subject.

That bothered me. I rely on these tools every day, but I didn't understand them.

So I started building Sauce, my own programming language in Rust. Not because the world needs another language, but because I needed to stop treating my compiler like a black box.

Turns out, a language isn't magic. It's just a pipeline.

Why We Think It's Hard

We usually see a compiler as this big, scary brain that judges our code. You feed it text, and it either gives you a working program or yells at you with an error.

I spent years thinking you needed to be a math genius to build one. I was wrong. You just need to break it down into small steps.

What Sauce Actually Is

Strip away the hype, and a programming language is just a way to move data around.

Sauce is a statically typed language that feels like a simple script. I wanted something that was clear and honest. The core ideas are simple:

  • Pipelines (|>) are the default: Data flows explicitly from one step to the next, like a factory line.
  • Effects are explicit (toss): No hidden surprises or secret jumps in your code.

But to get there, I had to build the engine. And thanks to Rust, I learned that the engine is actually pretty cool.

The Architecture: It’s Just an Assembly Line

I used to think compilation was one giant, messy function. In reality, it’s a disciplined process. I’m following a strict architecture for Sauce that separates "understanding the code" (Frontend) from "running the code" (Backend).

Here is exactly how Sauce works under the hood:

[Diagram: Sauce architecture]

The diagram above isn't just a sketch; it's the map I'm building against. It breaks down into two main phases.

Phase 1: Frontend (The Brain)

This phase is all about understanding what you wrote. It doesn't run anything yet; it just reads and checks.

  1. Lexer (Logos): The Chopping Block
     • The Job: Computers don't read words; they read characters. The Lexer's job is to group those characters into meaningful chunks called "Tokens."
     • In Plain English: Imagine reading a sentence without spaces: thequickbrownfox. It's hard. The Lexer adds the spaces and labels every word. It turns grab x = 10 into a list: [Keyword(grab), Ident(x), Symbol(=), Int(10)].
     • The Tool: I used a Rust tool called Logos. It’s incredibly fast, but I learned a hard lesson: computers are dumb. If you don't explicitly tell them that "grab" is a special keyword, they might think it's just a variable name like "green." You have to set strict rules.

  2. Parser (Chumsky): The Grammar Police
     • The Job: Now that we have a list of words (tokens), we need to check if they make a valid sentence. The Parser organizes these flat lists into a structured tree called the AST (Abstract Syntax Tree).
     • In Plain English: A list of words like [10, =, x, grab] contains valid words but makes no sense. The Parser ensures the order is correct (grab x = 10) and builds a hierarchy: "This is a Variable Assignment. The name is 'x'. The value is '10'."
     • The Tool: I used Chumsky, which lets you build logic like LEGOs. You write a tiny function to read a number, another for a variable, and glue them together.
     • The "Aha!" Moment: I learned how much structure matters. Instead of trying to parse everything in one giant loop, breaking the grammar into small, composable pieces made the language way easier to extend and reason about. It’s not magic; it’s just organizing data.

  3. Type Checking: The Logic Check
     • The Job: Just because a sentence is grammatically correct doesn't mean it makes sense. "The sandwich ate the Tuesday" is a valid sentence, but it's nonsense. The Type Checker catches these logical errors.
     • In Plain English: If you write grab x = "hello" + 5, the Parser says "Looks like a valid math operation!" But the Type Checker steps in and says, "Wait. You can't add a Word to a Number. That's illegal." Sauce currently has a small, explicit system that catches these basic mistakes before you ever try to run the code.
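
To make the three frontend stages concrete, here is a minimal hand-rolled sketch in plain Rust. It is not Sauce's actual code: Sauce uses Logos and Chumsky, while this version uses only the standard library, and every name in it (Token, Expr, Type, run_end, lex, parse, type_of) is hypothetical. It also assumes that + is only defined for two integers, which may not match Sauce's real rules.

```rust
// Hand-rolled sketch of the three frontend stages for `grab <name> = <expr>`.
// Sauce itself uses Logos and Chumsky; every name here is hypothetical.
#[derive(Debug, PartialEq)]
enum Token { Grab, Ident(String), Equals, Plus, Int(i64), Str(String) }

#[derive(Debug, PartialEq)]
enum Expr { Int(i64), Str(String), Add(Box<Expr>, Box<Expr>) }

#[derive(Debug, Clone, Copy, PartialEq)]
enum Type { Int, Str }

// Index one past the run of characters starting at `i` that satisfy `keep`.
fn run_end(chars: &[char], mut i: usize, keep: impl Fn(char) -> bool) -> usize {
    while i < chars.len() && keep(chars[i]) {
        i += 1;
    }
    i
}

// Stage 1 (lexer): group characters into labeled tokens.
fn lex(src: &str) -> Result<Vec<Token>, String> {
    let chars: Vec<char> = src.chars().collect();
    let (mut tokens, mut i) = (Vec::new(), 0);
    while i < chars.len() {
        let c = chars[i];
        if c.is_whitespace() {
            i += 1;
        } else if c == '=' || c == '+' {
            tokens.push(if c == '=' { Token::Equals } else { Token::Plus });
            i += 1;
        } else if c == '"' {
            let end = run_end(&chars, i + 1, |ch| ch != '"');
            if end == chars.len() {
                return Err("unterminated string".to_string());
            }
            tokens.push(Token::Str(chars[i + 1..end].iter().collect()));
            i = end + 1; // skip the closing quote
        } else if c.is_ascii_digit() {
            let end = run_end(&chars, i, |ch| ch.is_ascii_digit());
            let text: String = chars[i..end].iter().collect();
            let n = text.parse::<i64>().map_err(|_| "integer too large".to_string())?;
            tokens.push(Token::Int(n));
            i = end;
        } else if c.is_alphabetic() {
            let end = run_end(&chars, i, |ch| ch.is_alphanumeric());
            let word: String = chars[i..end].iter().collect();
            // Keywords need an explicit rule, or `grab` is just another name.
            tokens.push(if word == "grab" { Token::Grab } else { Token::Ident(word) });
            i = end;
        } else {
            return Err(format!("unexpected character {c:?}"));
        }
    }
    Ok(tokens)
}

// Stage 2 (parser): check token order and build a tree.
fn parse(tokens: &[Token]) -> Result<(String, Expr), String> {
    let (name, rest) = match tokens {
        [Token::Grab, Token::Ident(name), Token::Equals, rest @ ..] => (name.clone(), rest),
        _ => return Err("expected `grab <name> = <expr>`".to_string()),
    };
    let mut value = None;
    for part in rest.split(|t| *t == Token::Plus) {
        let atom = match part {
            [Token::Int(n)] => Expr::Int(*n),
            [Token::Str(s)] => Expr::Str(s.clone()),
            _ => return Err("expected one literal on each side of `+`".to_string()),
        };
        value = Some(match value {
            Some(lhs) => Expr::Add(Box::new(lhs), Box::new(atom)),
            None => atom,
        });
    }
    // `split` always yields at least one part, so this is a defensive fallback.
    value.map(|v| (name, v)).ok_or_else(|| "missing expression".to_string())
}

// Stage 3 (type checker): reject trees that parse fine but make no sense.
// Assumption for this sketch: `+` is only defined for two integers.
fn type_of(expr: &Expr) -> Result<Type, String> {
    match expr {
        Expr::Int(_) => Ok(Type::Int),
        Expr::Str(_) => Ok(Type::Str),
        Expr::Add(lhs, rhs) => match (type_of(lhs)?, type_of(rhs)?) {
            (Type::Int, Type::Int) => Ok(Type::Int),
            (l, r) => Err(format!("cannot add {l:?} and {r:?}")),
        },
    }
}

fn main() {
    let (name, value) = parse(&lex("grab x = 10").unwrap()).unwrap();
    println!("{name} = {value:?} has type {:?}", type_of(&value));
    let (_, bad) = parse(&lex("grab x = \"hello\" + 5").unwrap()).unwrap();
    println!("type check result: {:?}", type_of(&bad));
}
```

Each stage only consumes the previous stage's output: the lexer never sees the tree, and the type checker never sees raw text. Here grab x = "hello" + 5 gets through lex and parse and is only rejected by type_of.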

Phase 2: Backend (The Muscle)

Once the Frontend gives the "thumbs up," we move to the Backend. This phase is about making the code actually run.

  1. Codegen (Inkwell/LLVM): The Translator
     • The Job: This is where we leave the high-level world of "Variables" and "Pipelines" and enter the low-level world of CPU instructions. We translate our AST into LLVM IR (Intermediate Representation).
     • In Plain English: Sauce is like a high-level manager giving orders ("Calculate this pipeline"). The CPU is the worker who only understands basic tasks ("Move number to register A," "Add register A and B"). LLVM is the translator that turns the manager's orders into the worker's checklist.
     • Why LLVM? It's the same industrial-grade machinery behind Rust, Swift, and Clang (the C/C++ compiler). By using it, Sauce gets decades of optimization work for free. Once you figure out how to tell LLVM to "print a number," the rest stops feeling so scary.

  2. Native Binary: The Final Product
     • The Job: The final step is bundling all those CPU instructions into a standalone file (like an .exe on Windows or an executable on Linux).
     • In Plain English: This is what lets you send your program to a friend. They don't need to install Sauce, Rust, or anything else. They just double-click the file, and it runs. (Currently, this works for simple, effect-free programs.)
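
To show what "translate the AST into LLVM IR" means in practice, here is a rough sketch that lowers a tiny expression tree to textual LLVM IR by building strings. This is an illustration only: Sauce's real backend goes through Inkwell, and the names Expr, Codegen, lower, and emit_function are made up for this example. It handles only integer literals and addition.

```rust
// Sketch: lower a tiny expression tree to textual LLVM IR.
// This builds strings by hand; it does not use Inkwell or call into LLVM.
enum Expr {
    Int(i64),
    Add(Box<Expr>, Box<Expr>),
}

struct Codegen {
    lines: Vec<String>, // instructions emitted so far
    next_tmp: usize,    // counter used to name temporaries %t0, %t1, ...
}

impl Codegen {
    // Emit instructions for `expr` and return the operand that holds its value.
    fn lower(&mut self, expr: &Expr) -> String {
        match expr {
            // Integer literals can be used directly as operands.
            Expr::Int(n) => n.to_string(),
            Expr::Add(lhs, rhs) => {
                let l = self.lower(lhs);
                let r = self.lower(rhs);
                let tmp = format!("%t{}", self.next_tmp);
                self.next_tmp += 1;
                self.lines.push(format!("  {tmp} = add i64 {l}, {r}"));
                tmp
            }
        }
    }
}

// Wrap the lowered expression in a function that returns its value.
fn emit_function(name: &str, body: &Expr) -> String {
    let mut cg = Codegen { lines: Vec::new(), next_tmp: 0 };
    let result = cg.lower(body);
    let mut ir = format!("define i64 @{name}() {{\nentry:\n");
    for line in &cg.lines {
        ir.push_str(line);
        ir.push('\n');
    }
    ir.push_str(&format!("  ret i64 {result}\n}}\n"));
    ir
}

fn main() {
    // Represents (10 + 5) + 2.
    let expr = Expr::Add(
        Box::new(Expr::Add(Box::new(Expr::Int(10)), Box::new(Expr::Int(5)))),
        Box::new(Expr::Int(2)),
    );
    print!("{}", emit_function("answer", &expr));
}
```

For (10 + 5) + 2 this emits two add instructions followed by a ret, which is the "worker's checklist" from the analogy above. A real backend would hand this IR to LLVM for optimization and machine-code generation instead of printing it.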

What Works Right Now (v0.1.0)

Sauce isn't just an idea anymore. The core compiler pipeline is alive.

  • Pipelines: You can write grab x = 10 |> _ and it understands it perfectly. The data flows left-to-right, just like reading a sentence.
  • Real Output: I can feed it real .sauce source code, and it parses it into a type-checked syntax tree.
  • Explicit Effects: You can use toss to signal a side effect. This currently works in the interpreter, while the LLVM backend intentionally rejects effects for now.

The Road Ahead

I have a clear plan for where this is going. Since the core architecture is stable, the next updates are about making it usable.

  • v0.1.x (UX): Right now, if you make a mistake, the error messages are a bit cryptic. I'm adding a Rust crate called Ariadne to give pretty, helpful error reports (like Rust does).
  • v0.2.0 (Effects): This is the big one. I'll be finalizing how "Effects" work: defining rules for when you can resume a program after an error and when you have to abort.
  • v0.3.0 (Runtime): Merging the Interpreter and LLVM worlds so they behave exactly the same, plus adding a standard library so you can do more than just print numbers.

Why You Should Try This

I avoided building a language for years because I thought I wasn't smart enough.

But building Sauce taught me that there's no magic. It's just data structures. A Lexer is just regex. A Parser is just a tree builder. An Interpreter is just a function that walks through that tree.

If you want to actually understand how your code runs, don't just read a book. Build a tiny, broken compiler. Create a Lexer. Define a simple tree. Parse 1 + 1.

You'll learn more in a weekend of fighting syntax errors than you will in a year of just using cargo run.
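
If you want a starting point for that weekend, here is about the smallest version I can think of: a tree, a parser for whitespace-separated sums like 1 + 1, and an interpreter that walks the tree. It's a sketch with hypothetical names (Expr, parse, parse_num, eval), not code from Sauce.

```rust
// Sketch: a tree, a tiny parser that builds it, and a function that walks it.
#[derive(Debug, PartialEq)]
enum Expr {
    Num(i64),
    Add(Box<Expr>, Box<Expr>),
}

// Parse whitespace-separated sums such as "1 + 1" into a left-leaning tree.
fn parse(src: &str) -> Result<Expr, String> {
    let mut words = src.split_whitespace();
    let mut tree = Expr::Num(parse_num(words.next())?);
    while let Some(op) = words.next() {
        if op != "+" {
            return Err(format!("expected `+`, found `{op}`"));
        }
        let rhs = Expr::Num(parse_num(words.next())?);
        tree = Expr::Add(Box::new(tree), Box::new(rhs));
    }
    Ok(tree)
}

// Turn one word into a number, or explain why that was not possible.
fn parse_num(word: Option<&str>) -> Result<i64, String> {
    let word = word.ok_or("expected a number, found end of input")?;
    word.parse().map_err(|_| format!("`{word}` is not a number"))
}

// The interpreter: walk the tree and compute a value.
fn eval(expr: &Expr) -> i64 {
    match expr {
        Expr::Num(n) => *n,
        Expr::Add(lhs, rhs) => eval(lhs) + eval(rhs),
    }
}

fn main() {
    let tree = parse("1 + 1").expect("valid input");
    println!("{tree:?} evaluates to {}", eval(&tree));
}
```

From here, adding multiplication, parentheses, or variables each forces one small change to the tree, the parser, and eval, which is a good way to feel how the stages depend on each other.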

Check out Sauce on GitHub. It's small, it's honest, and we are officially cooking.
