<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Mierenik</title>
    <description>The latest articles on DEV Community by Mierenik (@gummyniki).</description>
    <link>https://dev.to/gummyniki</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F4004230%2Fdbf91d21-474c-4482-b5a9-cc5d83cfc3c9.jpg</url>
      <title>DEV Community: Mierenik</title>
      <link>https://dev.to/gummyniki</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/gummyniki"/>
    <language>en</language>
    <item>
      <title>I built a Programming Language</title>
      <dc:creator>Mierenik</dc:creator>
      <pubDate>Fri, 26 Jun 2026 15:26:36 +0000</pubDate>
      <link>https://dev.to/gummyniki/i-built-a-programming-language-33oi</link>
      <guid>https://dev.to/gummyniki/i-built-a-programming-language-33oi</guid>
      <description>&lt;h2&gt;
  
  
  What is Techlang?
&lt;/h2&gt;

&lt;p&gt;Techlang is a compiled, statically typed programming language &lt;br&gt;
with a C-like syntax and modern features.&lt;br&gt;
It compiles directly to native binaries using LLVM, &lt;br&gt;
the same compiler infrastructure used by Rust, Swift, &lt;br&gt;
and the Clang C/C++ compiler.&lt;/p&gt;

&lt;p&gt;Here's what a basic Techlang program looks like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;!import(std.tec) as std;

struct person = {
    int age;
    string name;
}

function greet(person p) returns none {
    std.print_string(p.name);
}

function main() returns none {
    person p;
    p.age = 25;
    p.name = "Alice";
    greet(p);
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The language supports:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Basic types (int, float, double, char, string, bool)&lt;/li&gt;
&lt;li&gt;Arrays and pointers&lt;/li&gt;
&lt;li&gt;Structs and enums&lt;/li&gt;
&lt;li&gt;Functions with parameters and return types&lt;/li&gt;
&lt;li&gt;If/else, while, and for loops&lt;/li&gt;
&lt;li&gt;A module system with file imports&lt;/li&gt;
&lt;li&gt;A standard library&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Why did I build this?
&lt;/h2&gt;

&lt;p&gt;A while back I had an idea: what if two programming languages &lt;br&gt;
could talk to each other natively, with zero overhead?&lt;br&gt;
Not through some painful FFI system, but as a first-class feature.&lt;/p&gt;

&lt;p&gt;I wanted to build two languages that shared the same type system &lt;br&gt;
and could call each other's functions directly.&lt;br&gt;
That idea stuck with me, and eventually I decided to just start building.&lt;/p&gt;

&lt;p&gt;The first step was making one language that actually works.&lt;br&gt;
So that's what this post is about.&lt;/p&gt;


&lt;h2&gt;
  
  
  How a compiler works
&lt;/h2&gt;

&lt;p&gt;Before getting into the details, here's a quick overview of &lt;br&gt;
what a compiler actually does.&lt;/p&gt;

&lt;p&gt;When you write source code and compile it, it goes through &lt;br&gt;
several stages:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Lexing&lt;/strong&gt; — The source code (raw text) gets broken into tokens.&lt;br&gt;
&lt;code&gt;int x = 5 + 3;&lt;/code&gt; becomes:&lt;br&gt;
&lt;code&gt;[KW_INT] [IDENTIFIER: x] [EQUALS] [INT_LITERAL: 5] [PLUS] [INT_LITERAL: 3] [SEMICOLON]&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Parsing&lt;/strong&gt; — The tokens get turned into an Abstract Syntax Tree (AST),&lt;br&gt;
a tree structure that represents the program's structure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Semantic Analysis&lt;/strong&gt; — The AST gets checked for logical errors —&lt;br&gt;
type mismatches, undefined variables, wrong number of arguments, etc.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. IR Generation&lt;/strong&gt; — The AST gets lowered to LLVM IR,&lt;br&gt;
a portable assembly-like language that LLVM can optimize and &lt;br&gt;
compile for any target it supports.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. Code Generation&lt;/strong&gt; — LLVM takes the IR and produces a native binary.&lt;/p&gt;

&lt;p&gt;I built the first four stages from scratch in C++; the last one is handled by LLVM.&lt;/p&gt;


&lt;h2&gt;
  
  
  Building the Lexer
&lt;/h2&gt;

&lt;p&gt;The lexer was the first thing I built, and honestly the most &lt;br&gt;
straightforward part.&lt;/p&gt;

&lt;p&gt;The idea is simple: walk through the source code character by &lt;br&gt;
character and group the characters into tokens. &lt;br&gt;
Tokens include keywords like &lt;code&gt;int&lt;/code&gt; and &lt;code&gt;function&lt;/code&gt;, identifiers like variable names, &lt;br&gt;
literals like &lt;code&gt;42&lt;/code&gt; and &lt;code&gt;"hello"&lt;/code&gt;, and operators like &lt;code&gt;+&lt;/code&gt; and &lt;code&gt;==&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The trickiest part was two-character operators like &lt;code&gt;==&lt;/code&gt;, &lt;code&gt;+=&lt;/code&gt;, &lt;code&gt;!=&lt;/code&gt;.&lt;br&gt;
You need to look one character ahead before deciding what token &lt;br&gt;
you're making — otherwise you'd emit &lt;code&gt;=&lt;/code&gt; and then &lt;code&gt;=&lt;/code&gt; separately &lt;br&gt;
instead of &lt;code&gt;==&lt;/code&gt;.&lt;/p&gt;
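
&lt;p&gt;Here's a minimal sketch of that lookahead. It's written in Python to keep it short, not the C++ from the actual compiler, and the token kinds and operator sets are assumptions for illustration only.&lt;/p&gt;

```python
# Hypothetical operator sets; the post names ==, += and != as two-character examples.
TWO_CHAR_OPS = {"==", "+=", "!="}
ONE_CHAR_OPS = {"=", "+", "!", ";"}


def tokenize(source):
    """Split source text into (kind, text) tuples using one character of lookahead."""
    tokens = []
    i = 0
    while i != len(source):
        ch = source[i]
        if ch.isspace():
            i += 1
        elif ch.isdigit():
            start = i
            while i != len(source) and source[i].isdigit():
                i += 1
            tokens.append(("INT_LITERAL", source[start:i]))
        elif ch.isalpha() or ch == "_":
            start = i
            while i != len(source) and (source[i].isalnum() or source[i] == "_"):
                i += 1
            tokens.append(("IDENTIFIER", source[start:i]))
        elif source[i:i + 2] in TWO_CHAR_OPS:
            # Peek one character ahead: prefer the longer operator when it matches.
            tokens.append(("OPERATOR", source[i:i + 2]))
            i += 2
        elif ch in ONE_CHAR_OPS:
            tokens.append(("OPERATOR", ch))
            i += 1
        else:
            raise ValueError(f"unexpected character {ch!r} at position {i}")
    return tokens
```

&lt;p&gt;The two-character check runs before the one-character check, so &lt;code&gt;x == 5&lt;/code&gt; yields a single &lt;code&gt;==&lt;/code&gt; token rather than two &lt;code&gt;=&lt;/code&gt; tokens.&lt;/p&gt;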


&lt;h2&gt;
  
  
  Building the Parser
&lt;/h2&gt;

&lt;p&gt;The parser takes the flat list of tokens and turns them into a tree.&lt;/p&gt;

&lt;p&gt;I used a &lt;strong&gt;recursive descent parser&lt;/strong&gt;: one function per language &lt;br&gt;
construct, each calling the others recursively.&lt;br&gt;
For example, &lt;code&gt;parseStatement()&lt;/code&gt; calls &lt;code&gt;parseIfStatement()&lt;/code&gt;, &lt;br&gt;
which calls &lt;code&gt;parseExpression()&lt;/code&gt;, which calls &lt;code&gt;parseAddSub()&lt;/code&gt;, &lt;br&gt;
which calls &lt;code&gt;parseMulDiv()&lt;/code&gt;, and so on.&lt;/p&gt;

&lt;p&gt;The most interesting part of the parser is &lt;strong&gt;operator precedence&lt;/strong&gt;.&lt;br&gt;
Making sure &lt;code&gt;2 + 3 * 4&lt;/code&gt; evaluates as &lt;code&gt;2 + (3 * 4)&lt;/code&gt; and not &lt;br&gt;
&lt;code&gt;(2 + 3) * 4&lt;/code&gt; requires splitting expression parsing into multiple &lt;br&gt;
layers, one per precedence level.&lt;br&gt;
Higher precedence operators get parsed deeper in the call chain, &lt;br&gt;
which naturally makes them bind tighter.&lt;/p&gt;
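
&lt;p&gt;The layering can be sketched in a few lines. This is a Python illustration under my own assumptions (integer operands only, nested tuples as the AST, no error handling), not the real Techlang parser; the function names just echo the ones mentioned above.&lt;/p&gt;

```python
import re


def parse_expression(text):
    """Parse + - * / and parentheses over integers into a nested-tuple AST."""
    tokens = re.findall(r"\d+|[-+*/()]", text)
    pos = 0

    def peek():
        return tokens[pos] if pos != len(tokens) else None

    def advance():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def parse_add_sub():
        # Lowest precedence level: parsed first, so it ends up nearest the root.
        node = parse_mul_div()
        while peek() in ("+", "-"):
            op = advance()
            node = (op, node, parse_mul_div())
        return node

    def parse_mul_div():
        # Higher precedence: parsed deeper in the call chain, so it binds tighter.
        node = parse_primary()
        while peek() in ("*", "/"):
            op = advance()
            node = (op, node, parse_primary())
        return node

    def parse_primary():
        token = advance()
        if token == "(":
            node = parse_add_sub()
            advance()  # consume the closing ")"
            return node
        return int(token)

    return parse_add_sub()
```

&lt;p&gt;With this sketch, &lt;code&gt;parse_expression("2 + 3 * 4")&lt;/code&gt; returns &lt;code&gt;("+", 2, ("*", 3, 4))&lt;/code&gt;: the multiplication sits deeper in the tree, so it is evaluated first.&lt;/p&gt;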


&lt;h2&gt;
  
  
  The Semantic Analyzer
&lt;/h2&gt;

&lt;p&gt;Once I had an AST, I needed to check that the program actually &lt;br&gt;
made sense, not just syntactically, but logically.&lt;/p&gt;

&lt;p&gt;The semantic analyzer walks the AST and checks things like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Is this variable declared before it's used?&lt;/li&gt;
&lt;li&gt;Does this function call have the right number of arguments?&lt;/li&gt;
&lt;li&gt;Are you trying to assign a string to an int?&lt;/li&gt;
&lt;li&gt;Is this variable marked const but being reassigned?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The core data structure is a &lt;strong&gt;symbol table&lt;/strong&gt;, a scoped stack &lt;br&gt;
of maps that tracks every declared variable and function.&lt;br&gt;
When you enter a block, you push a new scope.&lt;br&gt;
When you leave, you pop it.&lt;br&gt;
Looking up a variable searches from the innermost scope outward,&lt;br&gt;
which is how variable shadowing works.&lt;/p&gt;
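
&lt;p&gt;A scoped symbol table like this fits in a small class. The sketch below is Python rather than the compiler's C++, and the method names are hypothetical; it only tracks a type name per variable.&lt;/p&gt;

```python
class SymbolTable:
    """A stack of dictionaries; the last element is the innermost scope."""

    def __init__(self):
        self.scopes = [{}]  # start with the global scope

    def push_scope(self):
        self.scopes.append({})

    def pop_scope(self):
        self.scopes.pop()

    def declare(self, name, type_name):
        scope = self.scopes[-1]
        if name in scope:
            raise NameError(f"'{name}' is already declared in this scope")
        scope[name] = type_name

    def lookup(self, name):
        # Search from the innermost scope outward, so inner declarations shadow outer ones.
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        raise NameError(f"'{name}' is not declared")
```

&lt;p&gt;Declaring &lt;code&gt;x&lt;/code&gt; as &lt;code&gt;int&lt;/code&gt; globally and as &lt;code&gt;string&lt;/code&gt; in a pushed scope makes &lt;code&gt;lookup("x")&lt;/code&gt; return &lt;code&gt;string&lt;/code&gt; until the inner scope is popped, after which it returns &lt;code&gt;int&lt;/code&gt; again.&lt;/p&gt;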


&lt;h2&gt;
  
  
  LLVM IR Generation
&lt;/h2&gt;

&lt;p&gt;This is where things get really interesting.&lt;/p&gt;

&lt;p&gt;LLVM IR is a typed, portable assembly language. &lt;br&gt;
Instead of targeting x86 or ARM directly, I lower my AST to LLVM IR &lt;br&gt;
and let LLVM handle the rest — optimization, register allocation, &lt;br&gt;
machine code generation for any target architecture.&lt;/p&gt;

&lt;p&gt;For example, this Techlang function:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;function add(int a, int b) returns int {
    return a + b;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Becomes this LLVM IR:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight llvm"&gt;&lt;code&gt;&lt;span class="k"&gt;define&lt;/span&gt; &lt;span class="kt"&gt;i32&lt;/span&gt; &lt;span class="vg"&gt;@add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;i32&lt;/span&gt; &lt;span class="nv"&gt;%a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;i32&lt;/span&gt; &lt;span class="nv"&gt;%b&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="nl"&gt;entry:&lt;/span&gt;
    &lt;span class="nv"&gt;%a.addr&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;alloca&lt;/span&gt; &lt;span class="kt"&gt;i32&lt;/span&gt;
    &lt;span class="nv"&gt;%b.addr&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;alloca&lt;/span&gt; &lt;span class="kt"&gt;i32&lt;/span&gt;
    &lt;span class="k"&gt;store&lt;/span&gt; &lt;span class="kt"&gt;i32&lt;/span&gt; &lt;span class="nv"&gt;%a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;ptr&lt;/span&gt; &lt;span class="nv"&gt;%a.addr&lt;/span&gt;
    &lt;span class="k"&gt;store&lt;/span&gt; &lt;span class="kt"&gt;i32&lt;/span&gt; &lt;span class="nv"&gt;%b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;ptr&lt;/span&gt; &lt;span class="nv"&gt;%b.addr&lt;/span&gt;
    &lt;span class="nv"&gt;%a.val&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;load&lt;/span&gt; &lt;span class="kt"&gt;i32&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;ptr&lt;/span&gt; &lt;span class="nv"&gt;%a.addr&lt;/span&gt;
    &lt;span class="nv"&gt;%b.val&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;load&lt;/span&gt; &lt;span class="kt"&gt;i32&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;ptr&lt;/span&gt; &lt;span class="nv"&gt;%b.addr&lt;/span&gt;
    &lt;span class="nv"&gt;%addtmp&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;add&lt;/span&gt; &lt;span class="kt"&gt;i32&lt;/span&gt; &lt;span class="nv"&gt;%a.val&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;%b.val&lt;/span&gt;
    &lt;span class="k"&gt;ret&lt;/span&gt; &lt;span class="kt"&gt;i32&lt;/span&gt; &lt;span class="nv"&gt;%addtmp&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Which then compiles to a native binary that runs directly on your CPU.&lt;/p&gt;

&lt;p&gt;The trickiest part of IR generation was control flow: &lt;br&gt;
if statements, while loops, and for loops all require creating &lt;br&gt;
multiple &lt;strong&gt;basic blocks&lt;/strong&gt; and branching between them.&lt;br&gt;
A basic block is a straight-line sequence of instructions &lt;br&gt;
with no branches in the middle, and every block must end with &lt;br&gt;
exactly one terminator, such as a return or a branch to another block.&lt;/p&gt;

&lt;p&gt;One bug that took me a while to track down: &lt;br&gt;
if a function has an early return inside an if block, &lt;br&gt;
the &lt;code&gt;ret&lt;/code&gt; instruction acts as the block terminator.&lt;br&gt;
But my code was then trying to add a &lt;code&gt;br&lt;/code&gt; instruction &lt;br&gt;
to jump to the merge block — which is invalid because &lt;br&gt;
you can't have two terminators in the same block.&lt;br&gt;
The fix was to check &lt;code&gt;getTerminator()&lt;/code&gt; before adding any branch.&lt;/p&gt;
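
&lt;p&gt;To make the bug concrete, here's a toy model of it in Python. The &lt;code&gt;BasicBlock&lt;/code&gt; class and &lt;code&gt;branch_to_merge&lt;/code&gt; helper are hypothetical stand-ins that store instructions as strings; they only imitate the idea behind LLVM's &lt;code&gt;getTerminator()&lt;/code&gt; and are not the LLVM API.&lt;/p&gt;

```python
class BasicBlock:
    """Toy stand-in for an IR basic block: a label plus a list of instruction strings."""

    TERMINATORS = ("ret", "br")

    def __init__(self, label):
        self.label = label
        self.instructions = []

    def get_terminator(self):
        # Return the final instruction if it is a terminator, otherwise None.
        if self.instructions and self.instructions[-1].split()[0] in self.TERMINATORS:
            return self.instructions[-1]
        return None

    def append(self, instruction):
        if self.get_terminator() is not None:
            raise ValueError(f"block '{self.label}' already has a terminator")
        self.instructions.append(instruction)


def branch_to_merge(block, merge_label):
    """Add the fall-through branch only when the block is still open."""
    if block.get_terminator() is None:
        block.append(f"br label %{merge_label}")
```

&lt;p&gt;A block that already ends in &lt;code&gt;ret&lt;/code&gt; is left alone, while a block that is still open gets its &lt;code&gt;br&lt;/code&gt; to the merge block.&lt;/p&gt;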


&lt;h2&gt;
  
  
  The Standard Library
&lt;/h2&gt;

&lt;p&gt;Techlang has a standard library (&lt;code&gt;std.tec&lt;/code&gt;) that provides &lt;br&gt;
basic functions like printing, reading input, math, and casting.&lt;/p&gt;

&lt;p&gt;The implementation was interesting: the standard library is &lt;br&gt;
written in C (&lt;code&gt;std.c&lt;/code&gt;), compiled to an object file (&lt;code&gt;stdlib.o&lt;/code&gt;), &lt;br&gt;
and linked with the compiled Techlang program at the end.&lt;/p&gt;

&lt;p&gt;Techlang functions can map to C implementations using the &lt;br&gt;
&lt;code&gt;extern&lt;/code&gt; keyword:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;function print_int(int x) returns none extern "tec_print_int" {}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This tells the compiler to call &lt;code&gt;tec_print_int&lt;/code&gt; from the C &lt;br&gt;
implementation of the standard library when &lt;code&gt;std.print_int()&lt;/code&gt; is called in Techlang.&lt;/p&gt;


&lt;h2&gt;
  
  
  The Module System
&lt;/h2&gt;

&lt;p&gt;One of my favourite features is the import system.&lt;br&gt;
You can split code across multiple &lt;code&gt;.tec&lt;/code&gt; files and import them:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;!import(math.tec) as math;

int result = math.add(3, 4);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When the compiler sees an import, it lexes and parses the &lt;br&gt;
imported file, extracts its declarations, prefixes them with &lt;br&gt;
the alias (&lt;code&gt;math.add&lt;/code&gt;, &lt;code&gt;math.multiply&lt;/code&gt;, etc), &lt;br&gt;
and injects them into the current program before compilation.&lt;/p&gt;
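
&lt;p&gt;The prefixing step is the easiest part to picture. Here's a rough Python sketch, assuming declarations are kept in a name-to-signature map; the &lt;code&gt;import_module&lt;/code&gt; helper and the signatures in it are made up for illustration and are not taken from the compiler.&lt;/p&gt;

```python
def import_module(declarations, alias):
    """Return a copy of a module's declarations with every name prefixed by the alias."""
    return {f"{alias}.{name}": decl for name, decl in declarations.items()}


# Hypothetical declarations extracted from math.tec; the signatures are made up.
math_declarations = {
    "add": "function add(int a, int b) returns int",
    "multiply": "function multiply(int a, int b) returns int",
}

# Inject the prefixed names next to the importing program's own symbols.
program_symbols = {"main": "function main() returns none"}
program_symbols.update(import_module(math_declarations, "math"))
```

&lt;p&gt;After the update, the importing program sees &lt;code&gt;main&lt;/code&gt;, &lt;code&gt;math.add&lt;/code&gt;, and &lt;code&gt;math.multiply&lt;/code&gt; as ordinary entries in one table.&lt;/p&gt;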

&lt;p&gt;Since everything ends up in the same LLVM module, &lt;br&gt;
cross-file function calls have no extra overhead compared with calls within a single file.&lt;/p&gt;




&lt;h2&gt;
  
  
  Hardest Bugs
&lt;/h2&gt;

&lt;p&gt;A few bugs that gave me real trouble:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The opaque pointer problem&lt;/strong&gt; — LLVM 17+ removed typed pointers.&lt;br&gt;
In older LLVM you could ask a pointer "what type do you point to?"&lt;br&gt;
In newer LLVM, a pointer is just a pointer with no type info attached.&lt;br&gt;
I had to build a separate &lt;code&gt;pointerTypeScopes&lt;/code&gt; system to track &lt;br&gt;
pointee types manually throughout compilation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The fallthrough bug&lt;/strong&gt; — I had two &lt;code&gt;case&lt;/code&gt; statements in a switch &lt;br&gt;
with missing &lt;code&gt;break&lt;/code&gt; statements. &lt;br&gt;
When an enum declaration was generated, it fell through into &lt;br&gt;
struct instance generation and crashed.&lt;br&gt;
Classic C++ gotcha that cost me way more time than it should have.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The array parameter type mismatch&lt;/strong&gt; — When passing arrays to &lt;br&gt;
functions, LLVM sees a &lt;code&gt;[5 x i32]&lt;/code&gt; (fixed size) at the call site &lt;br&gt;
but the function signature expects &lt;code&gt;[0 x i32]&lt;/code&gt; (unknown size).&lt;br&gt;
The fix was to treat array parameters as pointers to the element &lt;br&gt;
type, which is exactly how C handles it under the hood.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The std library implementation issue&lt;/strong&gt; — At first, all std functions were hardcoded&lt;br&gt;
in the IR generator. That worked for a bit but then started producing a lot of errors, &lt;br&gt;
such as functions like &lt;code&gt;print&lt;/code&gt; being declared multiple times and parameter type mismatches.&lt;br&gt;
Eventually I decided to completely rewrite the std library implementation to use a&lt;br&gt;
.tec file with all the definitions, which connects to a C implementation.&lt;/p&gt;




&lt;h2&gt;
  
  
  What's next for Techlang?
&lt;/h2&gt;

&lt;p&gt;Techlang is just getting started.&lt;br&gt;
The immediate roadmap includes a VS Code extension for syntax &lt;br&gt;
highlighting, more standard library functions, and a package manager.&lt;/p&gt;

&lt;p&gt;But the big one, the reason I started this project in the first &lt;br&gt;
place, is still coming.&lt;/p&gt;

&lt;p&gt;I'm planning a &lt;strong&gt;companion GPU compute language&lt;/strong&gt; that integrates &lt;br&gt;
directly with Techlang.&lt;br&gt;
The idea is that you write your main logic in Techlang and your &lt;br&gt;
parallel compute kernels in the companion language, &lt;br&gt;
and they can call each other natively with zero overhead since &lt;br&gt;
they both compile to LLVM IR.&lt;/p&gt;

&lt;p&gt;I haven't seen anyone else do this, and I think it could be &lt;br&gt;
genuinely exciting.&lt;br&gt;
More on that in a future post.&lt;/p&gt;




&lt;p&gt;That's about it! This was by far the most complex project I've &lt;br&gt;
ever built, and also the most rewarding.&lt;br&gt;
If you're interested in compilers I highly recommend trying to &lt;br&gt;
build one — even a tiny one.&lt;br&gt;
You'll learn more about how programming languages actually work &lt;br&gt;
than any book or course can teach you.&lt;/p&gt;

&lt;h2&gt;
  
  
  Techlang GitHub page &lt;a href="https://github.com/gummyniki/techlang" rel="noopener noreferrer"&gt;here&lt;/a&gt;
&lt;/h2&gt;

</description>
      <category>programming</category>
      <category>cpp</category>
    </item>
  </channel>
</rss>
