Alexandru Biscoveanu

Episode 1: Reimagining Assembly for the 21st Century

It started with a file called grammar.txt. I opened VS Code, typed the first rule, and thought: okay… now what?

That moment felt familiar. The cursor blinking, the code editor open, the idea too big for the page. I wasn’t just trying to write a program, I was trying to design a language. And not just any language. One that would reimagine how we write something many have long left untouched: assembly.

Let me back up.

Why Assembly?

I've always had a deep love for low-level programming, the kind where you feel the metal: every memory address, every instruction, every register. There’s a thrill in commanding the machine directly, watching each clock tick count. That’s the charm of working close to the hardware. That’s what drew me to assembly.

The Spark: LLVM IR

This all started during a research project in my QA class at university. I was working with LLVM IR, the intermediate representation used by the LLVM compiler infrastructure. For most students, it was just another layer of the toolchain. For me, it felt like a secret door, like someone left a raw sketch of how modern code becomes machine logic, and nobody bothered to make it usable.

And that’s what started a thought I couldn’t shake:

What if we didn’t treat IR as just a step in the compilation process? What if it were the programming language, a real alternative for those who love assembly but want their code to be portable?

Not in the way people write inline assembly for a quick optimization. I mean: what if we made a real language, with structure, modularity, and sanity, that keeps the low-level precision of IR, but is something you can actually live in?

Designing a Language from Scratch

In my last post, Buffing A 50 Year Old Programming Language, I explored Lex and Yacc; I've also taken a compiler course at university where we used Flex, Bison, and LLVM.

Personally, I am not a fan of Flex and Bison. The structure of Lex and Yacc files is hard to read and feels cumbersome to work with. Because of that, I decided to build my own lexer and parser instead, giving me more control and flexibility.

The feedback from my last post encouraged me to think more creatively about the language design. As I mentioned earlier, this whole idea was sparked by my fascination with LLVM IR and the possibilities it opened up.

The Hardest Part Was the Structure

The first thing you realize when designing a language based on IR is that IR isn’t meant for humans. It’s flat, it’s verbose, and while it’s consistent, it lacks narrative. It’s like trying to write a novel using only CSV files. You can’t express relationships, intent, or abstraction easily. There’s no scaffolding to build upon.

In traditional assembly, the challenge isn’t just writing operations, it’s organizing them. Structuring a readable flow from instruction to instruction feels like carving logic into stone tablets. You can’t see the shape of your code at a glance. I wanted to change that.

So I started borrowing structure from high-level languages. Not the syntax, necessarily, but the ideas: functions with type signatures, modules with imports, structured types, even lambdas and macros. The goal wasn’t to simplify IR, it was to make it expressive without sacrificing control.

The Language (So Far)

The grammar grew. I started defining literals, types, structs, and eventually entire modules. I borrowed LLVM’s naming conventions, inspired by how % and @ make local and global identifiers visually distinct. The syntax evolved to be declarative but deterministic, prioritizing clarity and predictability.

Instructions and Types

Instructions follow this form:

```
operation identifier : type = operands
```

Example:

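Something like this (the `add` and `mul` operations and the value names here are illustrative; the exact instruction set is still in flux):

```
add %sum : i32 = %a, %b            ; one instruction, one output (SSA)
mul %product : i32 = %sum, %sum    ; types are always spelled out, never inferred
```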

Every instruction uses Static Single Assignment (SSA) form: one instruction, one output. This avoids ambiguity and makes data flow analysis straightforward. Types are explicitly defined, no inference, so you always know what kind of value you’re working with. Even non-standard widths like i17 are supported, reflecting LLVM’s flexibility. In general, types follow the iN format.

Functions

Personally, I'm a fan of how more and more languages write functions like this:

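That is, the trailing-return-type shape you see in languages like Rust (shown here for comparison only, not as this language's syntax):

```rust
fn add(a: i32, b: i32) -> i32 {
    return a + b;
}
```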

But since I'm drawing my inspiration from LLVM, I define them like so:

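It looks something like this (keyword spellings such as `define` and `ret` are still settling):

```
define @add(%a : i32, %b : i32) : i32 {
entry:
    add %sum : i32 = %a, %b
    ret %sum
}
```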

Functions are global values, marked with @. Parameters and return types are explicitly annotated. Each block (like entry) scopes instructions, making control flow clear and manageable.

(Notice how the function layout mirrors instructions. Keeps things coherent and easy to follow.)

Higher‑Order Functions and Lambdas

Functions are treated as first-class values in the language. This means they can be passed around, returned, and stored, just like any other value. This opens up possibilities for expressive and abstract programming patterns typically reserved for high-level functional languages.

Lambdas are anonymous functions you can define inline, enabling concise, localized behavior without the overhead of a full function declaration. Combined with higher-order functions (functions that take other functions as parameters or return them), you get a powerful composition model. This lets you build logic in small, testable, and reusable blocks.

For example, @higher_order accepts a function as input and invokes it. The @example function defines a lambda that squares its input and passes it to the higher-order function:

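Here's a working sketch (the `(i32) -> i32` function type, `lambda`, and `call` forms are still provisional):

```
define @higher_order(%f : (i32) -> i32, %x : i32) : i32 {
entry:
    call %result : i32 = %f, %x       ; invoke the passed-in function
    ret %result
}

define @example(%n : i32) : i32 {
entry:
    lambda %square : (i32) -> i32 = (%v : i32) {
        mul %sq : i32 = %v, %v        ; square the input
        ret %sq
    }
    call %out : i32 = @higher_order, %square, %n
    ret %out
}
```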

By enabling this functional flexibility, even a systems-level language can support modern design patterns like callbacks, deferred execution, and runtime code assembly, all while staying type-safe and deterministic.

Generics and Macros

Low-level code is often riddled with repetition. Swapping two values of type i32, i64, or even a pointer type requires separate boilerplate for each, unless you have a mechanism for reuse. That’s where generics and macros come in.

Generics provide compile-time type abstraction. You can write one macro for any type T, and the compiler will generate the appropriate concrete version when used. This avoids copy-pasting instruction sequences with tiny tweaks.

Macros take that idea further: they’re like inline templates that can represent not just expressions, but control flow and instruction blocks. Unlike runtime polymorphism, these constructs are fully expanded at compile time, making them efficient and predictable, perfect for systems programming.

Here’s a simple generic macro that swaps two values:

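A working sketch (the `macro`, `<T>`, `ptr<T>`, and `load`/`store` forms are provisional):

```
macro @swap<T>(%a : ptr<T>, %b : ptr<T>) {
entry:
    load %tmp_a : T = %a              ; read both values
    load %tmp_b : T = %b
    store %w1 : void = %a, %tmp_b     ; write them back, crossed over
    store %w2 : void = %b, %tmp_a
}
```

Instantiating @swap with i32, i64, or a pointer type expands to the same four instructions with concrete types, all at compile time.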

These tools help strike a balance: the expressiveness and flexibility of abstraction, without giving up the deterministic output that low-level work demands.

Protocol-Oriented Programming

In a high-level language, interfaces and traits provide abstraction without inheritance. In low-level languages, we often lose that structure and fall back to manual boilerplate. Protocols in this language bring some of that higher-order organization down to the metal.

A protocol defines a contract, a set of function signatures that a type must implement. In this language, all protocol conformance is resolved at compile time. There is no runtime dispatch, no vtables, and no hidden indirection. The result is code that is just as efficient as if it had been written manually, while still offering the clarity and reuse of higher-level abstractions. Protocols enable powerful organization and type safety without giving up control or performance.

This lets you build reusable libraries and algorithms that work on any type conforming to a given protocol, like an iterable container or a serializable value, without sacrificing performance or control.

Here’s a protocol for iterables:

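A sketch (the `protocol` and `Self` spellings are provisional):

```
protocol @Iterable<T> {
    @has_next : (ptr<Self>) -> i1     ; is there another element?
    @next     : (ptr<Self>) -> T      ; yield it and advance
}
```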

And an implementation:

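Here, a simple counting range (the `struct`, `impl`, and field-access forms are placeholders while the syntax settles):

```
struct @Range {
    %current : i32
    %end     : i32
}

impl @Iterable<i32> for @Range {
    define @has_next(%self : ptr<@Range>) : i1 {
    entry:
        field %cur : i32 = %self, 0          ; read %current
        field %end : i32 = %self, 1          ; read %end
        lt %more : i1 = %cur, %end
        ret %more
    }

    define @next(%self : ptr<@Range>) : i32 {
    entry:
        field %cur : i32 = %self, 0
        const %one : i32 = 1
        add %step : i32 = %cur, %one
        setfield %w : void = %self, 0, %step ; advance the cursor
        ret %cur
    }
}
```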

This pattern encourages a declarative and type-safe way of building abstractions, supporting reuse without dynamic allocation, virtual calls, or unpredictable branching.

Algebraic Types and Enums

Algebraic data types allow expressing rich control flow and states:

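For example (the `enum` and `switch` forms are provisional):

```
enum @Status {
    Ok(i32),
    Err(i8)
}

define @unwrap_or(%s : @Status, %default : i32) : i32 {
entry:
    switch %s {
        Ok(%value) -> { ret %value }      ; bind the payload directly
        Err(%code) -> { ret %default }
    }
}
```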

You can destructure variants cleanly with switch, matching on semantics, not structure.

Namespaces and Modules

As the codebase grows, especially in a language targeting low-level systems, organizing functionality becomes vital. Without a coherent system of separation, large IR-based programs can quickly descend into chaos.

That’s where namespaces and modules step in. They allow you to group related functions, types, and constants into logical units, enabling better organization and preventing naming collisions.

Each file can define multiple namespaces, and code can be explicitly imported and referenced using qualified paths. This design mirrors what we see in languages like Rust, while maintaining the minimal overhead required for low-level output.

Consider this module:

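Sketched with the working syntax (the `namespace` keyword is provisional):

```
namespace math {
    define @add(%a : i32, %b : i32) : i32 {
    entry:
        add %sum : i32 = %a, %b
        ret %sum
    }
}
```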

And another with a nested namespace:

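Same shape, one level deeper (still provisional):

```
namespace math {
    namespace bits {
        define @double(%x : i32) : i32 {
        entry:
            add %twice : i32 = %x, %x
            ret %twice
        }
    }
}
```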

Then, in the main file:

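The qualified paths below borrow Rust's `::` separator; `import` and `const` are provisional too:

```
import math

define @main() : i32 {
entry:
    const %two : i32 = 2
    const %three : i32 = 3
    call %five : i32 = math::add, %two, %three
    call %ten : i32 = math::bits::double, %five
    ret %ten
}
```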

This lets you think about system boundaries and interfaces, even at the IR level, making it easier to collaborate, test, and scale your codebase across modules.

Where This Is Going

This language isn’t meant to replace established systems languages like Rust or C. Instead, it aims to bring structure, clarity, and usability to a programming layer that’s traditionally difficult to work with.

Right now, I’m focused on completing the lexer and refining the grammar. This is just the surface of the language; there’s more in the works, like GPU instruction support, to better extend the language for low-level compute workloads.

This is still early stage work. The syntax is evolving, the semantics are being tested, and the implementation is growing week by week.

And for a 77-year-old programming model, that feels like a fresh coat of paint.

I’d love to hear your thoughts, ideas, or any feedback you have.
