<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Pi</title>
    <description>The latest articles on DEV Community by Pi (@pie-314).</description>
    <link>https://dev.to/pie-314</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3790408%2F0b12c7e2-c62e-46d2-9e58-992c9475a1fc.png</url>
      <title>DEV Community: Pi</title>
      <link>https://dev.to/pie-314</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/pie-314"/>
    <language>en</language>
    <item>
      <title>Lexers</title>
      <dc:creator>Pi</dc:creator>
      <pubDate>Sun, 07 Jun 2026 14:25:27 +0000</pubDate>
      <link>https://dev.to/pie-314/lexers-622</link>
      <guid>https://dev.to/pie-314/lexers-622</guid>
      <description>&lt;h1&gt;
  
  
  What is a lexer?
&lt;/h1&gt;

&lt;p&gt;A lexer is a component that converts raw source code characters into tokens for the parser.&lt;/p&gt;

&lt;p&gt;When the compiler gets&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    .
    .
    total = price + 42;
    .
    .
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It doesn't know what any of this means. To the compiler it's just a stream of characters, without any classification:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;'t' 'o' 't' 'a' 'l' ' ' '=' ' ' 'p' ...&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Now we have two options: either we make sense of these characters now, or we leave every later step with the extra hurdle of handling each character itself.&lt;/p&gt;

&lt;p&gt;If you are the smart one (unlike me), you will think that handling each character later separately is tedious, so we should handle it now and set some rules about what is allowed and what is not.&lt;/p&gt;

&lt;p&gt;The lexer is this small set of rules that classifies a stream of characters into tokens. Don't confuse it with the grammar: the lexer doesn't know the rights and wrongs of the language.&lt;/p&gt;

&lt;p&gt;Take standard English as an example:&lt;br&gt;
&lt;code&gt;jump3d D0g m004. ov3r&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;According to the rules we defined, words can't contain digits, so the lexer can't classify any of these as valid words.&lt;/p&gt;

&lt;p&gt;And if we write&lt;br&gt;
&lt;code&gt;jumped Dog moon. over&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;The lexer knows these match the pattern of valid words, so it will make &lt;code&gt;Tokens&lt;/code&gt; out of them. It still doesn't know whether the sentence is grammatically correct, because it doesn't know the rules of grammar. It just tries to classify the stream of characters &lt;strong&gt;one word at a time.&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Numbers in identifiers are valid in many languages. The analogy above is just an example to build intuition for how a lexer applies its rules.&lt;/p&gt;
&lt;/blockquote&gt;
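
&lt;p&gt;The word analogy above can be sketched in a few lines of Python. This is only an illustration of the "words can't contain digits" rule: the function names and the &lt;code&gt;WORD&lt;/code&gt; and &lt;code&gt;INVALID&lt;/code&gt; labels are made up for this example, and trailing punctuation is simply set aside so that &lt;code&gt;moon.&lt;/code&gt; is judged as &lt;code&gt;moon&lt;/code&gt;.&lt;/p&gt;

```python
import string


def classify_word(chunk):
    """Classify one whitespace-separated chunk under the toy rule: letters only."""
    # Set trailing punctuation aside so "moon." is judged as "moon".
    core = chunk.rstrip(string.punctuation)
    if core.isalpha():
        return ("WORD", core)
    return ("INVALID", chunk)


def classify(sentence):
    """Classify every chunk of a sentence, one word at a time."""
    return [classify_word(chunk) for chunk in sentence.split()]


print(classify("jump3d D0g m004. ov3r"))
print(classify("jumped Dog moon. over"))
```

&lt;p&gt;Notice that the second sentence passes even though it is not good English. The classifier only checks each word against a pattern, which is exactly the limit of what a lexer does.&lt;/p&gt;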

&lt;p&gt;These meaningful words in a programming language are called Tokens.&lt;/p&gt;
&lt;h3&gt;
  
  
  What is a token?
&lt;/h3&gt;

&lt;p&gt;The smallest individual element of a program is called a token. Most things you see inside a program are tokens.&lt;/p&gt;

&lt;p&gt;Compilers usually write a token in the format TokenType(Token).&lt;/p&gt;

&lt;p&gt;So &lt;code&gt;total = price + 42&lt;/code&gt; becomes&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[IDENT(total), ASSIGN(=), IDENT(price), PLUS(+),NUMBER(42)]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
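
&lt;p&gt;In code, a token is often just a small record holding a type and the matched text. Here is a minimal sketch of that idea. The &lt;code&gt;Token&lt;/code&gt; class and its field names &lt;code&gt;kind&lt;/code&gt; and &lt;code&gt;value&lt;/code&gt; are assumptions made for this example, not taken from any particular compiler.&lt;/p&gt;

```python
from typing import NamedTuple


class Token(NamedTuple):
    kind: str   # category of the token, e.g. "IDENT" or "NUMBER"
    value: str  # the exact characters matched in the source

    def __str__(self):
        # Render in the TokenType(Token) notation used in this post.
        return f"{self.kind}({self.value})"


tokens = [
    Token("IDENT", "total"),
    Token("ASSIGN", "="),
    Token("IDENT", "price"),
    Token("PLUS", "+"),
    Token("NUMBER", "42"),
]
rendered = "[" + ", ".join(str(token) for token in tokens) + "]"
print(rendered)
```

&lt;p&gt;Real lexers usually also store the position of each token (line and column) so that later stages can report useful errors.&lt;/p&gt;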



&lt;blockquote&gt;
&lt;p&gt;Spaces carry no meaning here. Languages like Python, which rely on indentation, follow slightly different rules.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;So if we take this example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;fn add(a,b){
    price = 10;
    total = price + 42;
    print(total);
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here the lexer will produce&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[
    FUNCTION(fn),
    IDENT(add),
    LPAREN((),
    IDENT(a),
    COMMA(,),
    IDENT(b),
    RPAREN()),
    LBRACE({),

    IDENT(price),
    ASSIGN(=),
    NUMBER(10),
    SEMICOLON(;),

    IDENT(total),
    ASSIGN(=),
    IDENT(price),
    PLUS(+),
    NUMBER(42),
    SEMICOLON(;),

    IDENT(print),
    LPAREN((),
    IDENT(total),
    RPAREN()),
    SEMICOLON(;),

    RBRACE(})
]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Below is an example of a toy lexer:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;source&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
total = price + 42;
&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

&lt;span class="n"&gt;tokens&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;

&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
&lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;source&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;ch&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;source&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="c1"&gt;# Skip whitespace
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;ch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;isspace&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
        &lt;span class="k"&gt;continue&lt;/span&gt;

    &lt;span class="c1"&gt;# Identifier
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;ch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;isalpha&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="n"&gt;ch&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;_&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;start&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;

        &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;source&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;source&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;isalnum&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="n"&gt;source&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;_&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;

        &lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;source&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;tokens&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;IDENT&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="k"&gt;continue&lt;/span&gt;

    &lt;span class="c1"&gt;# Number
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;ch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;isdigit&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="n"&gt;start&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;

        &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;source&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;source&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;isdigit&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
            &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;

        &lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;source&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;tokens&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;NUMBER&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="k"&gt;continue&lt;/span&gt;

    &lt;span class="c1"&gt;# Operators
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;ch&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;tokens&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ASSIGN&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ch&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;ch&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;+&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;tokens&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;PLUS&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ch&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;ch&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;;&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;tokens&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SEMICOLON&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ch&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;tokens&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;UNKNOWN&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ch&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

    &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;token&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;tokens&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
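
&lt;p&gt;The toy lexer above only recognizes &lt;code&gt;=&lt;/code&gt;, &lt;code&gt;+&lt;/code&gt; and &lt;code&gt;;&lt;/code&gt;, so it cannot yet tokenize the &lt;code&gt;fn add(a,b)&lt;/code&gt; example. One possible way to extend it is a table of patterns tried in order. This is a sketch under assumptions: the table layout, the &lt;code&gt;tokenize&lt;/code&gt; name and the choice to raise an error on unknown characters are mine, and it swaps the hand-written character loop for Python's &lt;code&gt;re&lt;/code&gt; module.&lt;/p&gt;

```python
import re

# Identifiers that are reserved words map to their own token type.
KEYWORDS = {"fn": "FUNCTION"}

# (token type, pattern) pairs, tried in order at the current position.
TOKEN_SPEC = [
    ("NUMBER", r"[0-9]+"),
    ("IDENT", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("ASSIGN", r"="),
    ("PLUS", r"\+"),
    ("LPAREN", r"\("),
    ("RPAREN", r"\)"),
    ("LBRACE", r"\{"),
    ("RBRACE", r"\}"),
    ("COMMA", r","),
    ("SEMICOLON", r";"),
]
COMPILED = [(kind, re.compile(pattern)) for kind, pattern in TOKEN_SPEC]
WHITESPACE = re.compile(r"\s+")


def tokenize(source):
    """Turn source text into a list of (type, text) pairs."""
    tokens = []
    pos = 0
    while pos != len(source):
        space = WHITESPACE.match(source, pos)
        if space:
            pos = space.end()  # whitespace only separates tokens
            continue
        for kind, regex in COMPILED:
            match = regex.match(source, pos)
            if match:
                text = match.group()
                if kind == "IDENT":
                    kind = KEYWORDS.get(text, "IDENT")
                tokens.append((kind, text))
                pos = match.end()
                break
        else:
            # No rule matched: report the character instead of guessing.
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
    return tokens


program = """fn add(a,b){
    price = 10;
    total = price + 42;
    print(total);
}"""
for kind, text in tokenize(program):
    print(f"{kind}({text})")
```

&lt;p&gt;Adding a new token is now a one-line change to the table. The order of the entries only matters when two patterns can match at the same position.&lt;/p&gt;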



&lt;p&gt;GitHub: &lt;a href="https://github.com/pie-314/compiler-blogs/" rel="noopener noreferrer"&gt;https://github.com/pie-314/compiler-blogs/&lt;/a&gt;&lt;/p&gt;

</description>
      <category>beginners</category>
      <category>computerscience</category>
      <category>programming</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>How to write a compiler ?</title>
      <dc:creator>Pi</dc:creator>
      <pubDate>Sun, 07 Jun 2026 14:24:26 +0000</pubDate>
      <link>https://dev.to/pie-314/how-to-write-a-compiler--103p</link>
      <guid>https://dev.to/pie-314/how-to-write-a-compiler--103p</guid>
      <description>&lt;h1&gt;
  
  
  How to write a compiler?
&lt;/h1&gt;

&lt;p&gt;This series of blog posts is a deep dive into compiler design.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;I am currently building EEL, the eBPF Language. This series of blog posts collects everything I have learned while building it.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  What is a compiler?
&lt;/h2&gt;

&lt;p&gt;A compiler is a program that translates source code written in one language into another representation, usually machine code or an intermediate form that can be executed by a computer.&lt;/p&gt;

&lt;p&gt;Modern compilers are made up of several pieces that fit together like a puzzle.&lt;/p&gt;

&lt;p&gt;These pieces are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Lexer&lt;/li&gt;
&lt;li&gt;Parser&lt;/li&gt;
&lt;li&gt;Semantic Analyzer&lt;/li&gt;
&lt;li&gt;IR Generator&lt;/li&gt;
&lt;li&gt;Optimizer&lt;/li&gt;
&lt;li&gt;Code Generator&lt;/li&gt;
&lt;li&gt;Assembler&lt;/li&gt;
&lt;li&gt;Linker&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We will analyze each of these one by one in great detail.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Source Code
     ↓
   Lexer
     ↓
  Tokens
     ↓
  Parser
     ↓
    AST
     ↓
Semantic Analysis
     ↓
     IR
     ↓
 Optimizer
     ↓
Code Generator
     ↓
 Assembler
     ↓
   Linker
     ↓
 Executable
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Throughout the series we will use the following pseudocode as a reference:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;fn add(a,b){
    price = 10;
    total = price + 42;
    print(total);
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;GitHub: &lt;a href="https://github.com/pie-314/compiler-blogs/" rel="noopener noreferrer"&gt;https://github.com/pie-314/compiler-blogs/&lt;/a&gt;&lt;/p&gt;

</description>
      <category>compiling</category>
      <category>systems</category>
      <category>architecture</category>
      <category>softwaredevelopment</category>
    </item>
    <item>
      <title>How I Built a Crash-Safe Database Engine in C with Write-Ahead Logging and Snapshots</title>
      <dc:creator>Pi</dc:creator>
      <pubDate>Tue, 24 Feb 2026 22:40:39 +0000</pubDate>
      <link>https://dev.to/pie-314/how-i-built-a-crash-safe-database-engine-in-c-with-write-ahead-logging-and-snapshots-jpk</link>
      <guid>https://dev.to/pie-314/how-i-built-a-crash-safe-database-engine-in-c-with-write-ahead-logging-and-snapshots-jpk</guid>
      <description>&lt;p&gt;Most developers use databases every day. Few actually know what happens when the power goes out mid-write, or when a system crashes halfway through saving data. Yet when the database restarts, everything is still there. That reliability isn’t magic. It comes from careful engineering.&lt;/p&gt;

&lt;p&gt;I wanted to understand this at a deeper level, so I built RadishDB. It started as a simple in-memory key–value store in C. Over time, I added persistence, crash recovery, write-ahead logging, snapshots, TTL expiration, a TCP server, and Docker deployment.&lt;/p&gt;

&lt;p&gt;The goal wasn’t to compete with Redis. It was to understand how systems like Redis actually work under the hood.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why C?
&lt;/h2&gt;

&lt;p&gt;Since RadishDB is fundamentally a storage engine, performance and predictability matter a lot. I wanted full control over memory and disk behavior.&lt;/p&gt;

&lt;p&gt;C gives you:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;direct control over memory&lt;/li&gt;
&lt;li&gt;no garbage collector&lt;/li&gt;
&lt;li&gt;predictable performance&lt;/li&gt;
&lt;li&gt;minimal abstraction between code and hardware&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When you're building a database, memory layout and disk writes are not abstract ideas. They are the system itself.&lt;/p&gt;

&lt;p&gt;This is also why many real databases like Redis, SQLite, and PostgreSQL are written in C. The language doesn’t hide anything. If something goes wrong, you can usually see exactly why.&lt;/p&gt;

&lt;p&gt;It also forces you to think carefully about every allocation, every pointer, and every write to disk.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Core: In-Memory Storage and Hashtable
&lt;/h2&gt;

&lt;p&gt;RadishDB stores all data in memory. This makes reads and writes extremely fast, since RAM access is much faster than disk access.&lt;/p&gt;

&lt;p&gt;To organize data efficiently, I used a hash table.&lt;/p&gt;

&lt;h3&gt;
  
  
  Hash Tables
&lt;/h3&gt;

&lt;p&gt;Hash tables allow fast lookup, insertion, and deletion, in O(1) time on average.&lt;/p&gt;

&lt;p&gt;When a key is inserted, RadishDB computes a hash and maps it to a bucket.&lt;/p&gt;

&lt;p&gt;I used the djb2 hash function:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;unsigned long hash(const char *str) {
  unsigned long hash = 5381;
  for (int i = 0; str[i] != '\0'; i++) {
    hash = hash * 33 + str[i];
  }
  return hash;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If multiple keys map to the same bucket, they are stored using separate chaining with a linked list.&lt;/p&gt;

&lt;p&gt;This keeps operations fast even when collisions occur, as long as the chains stay short.&lt;/p&gt;
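
&lt;p&gt;To make the bucket and chain idea concrete, here is a minimal Python sketch of a hash table with separate chaining. It is not RadishDB's C code. The &lt;code&gt;ChainedHashTable&lt;/code&gt; class, its method names and the default bucket count are assumptions for this example, a Python list stands in for the linked list, and the hash is wrapped to 64 bits on the assumption that &lt;code&gt;unsigned long&lt;/code&gt; is 64 bits wide.&lt;/p&gt;

```python
def djb2(key):
    """djb2 over the UTF-8 bytes of key, wrapped to 64 bits (assumed width)."""
    h = 5381
    for byte in key.encode("utf-8"):
        h = (h * 33 + byte) % 2**64
    return h


class ChainedHashTable:
    def __init__(self, bucket_count=8):
        # Each bucket is a list of [key, value] pairs: the "chain".
        self.buckets = [[] for _ in range(bucket_count)]

    def _bucket(self, key):
        # Map the hash to one of the buckets.
        return self.buckets[djb2(key) % len(self.buckets)]

    def set(self, key, value):
        bucket = self._bucket(key)
        for pair in bucket:
            if pair[0] == key:
                pair[1] = value  # key already present: overwrite
                return
        bucket.append([key, value])

    def get(self, key, default=None):
        for stored_key, value in self._bucket(key):
            if stored_key == key:
                return value
        return default

    def delete(self, key):
        bucket = self._bucket(key)
        for index, pair in enumerate(bucket):
            if pair[0] == key:
                del bucket[index]
                return True
        return False


# A single bucket forces every key to collide, so the chain does all the work.
table = ChainedHashTable(bucket_count=1)
table.set("name", "alice")
table.set("city", "paris")
print(table.get("name"), table.get("city"))
```

&lt;p&gt;With one bucket every lookup walks the whole chain, which is why a real table keeps many buckets and usually grows when chains get long. This sketch never resizes.&lt;/p&gt;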

&lt;p&gt;At this stage, RadishDB was fast, but fragile. Everything lived in memory. If the process crashed, all data was gone.&lt;/p&gt;

&lt;p&gt;That led to the next problem.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Problem: Surviving Crashes
&lt;/h2&gt;

&lt;p&gt;An in-memory database is fast, but memory disappears when the process stops.&lt;/p&gt;

&lt;p&gt;To solve this, I implemented Write-Ahead Logging (WAL) using an Append-Only File (AOF).&lt;/p&gt;

&lt;p&gt;The idea is simple but powerful.&lt;/p&gt;

&lt;p&gt;Every write operation is first written to disk before applying it to memory.&lt;/p&gt;

&lt;p&gt;For example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SET name alice
DEL name
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;These commands are appended to a log file.&lt;/p&gt;

&lt;p&gt;If the database crashes, RadishDB reads this file during startup and replays the operations to rebuild memory.&lt;/p&gt;

&lt;p&gt;The log becomes the source of truth. Memory becomes a reconstructed state.&lt;/p&gt;

&lt;p&gt;This ensures durability: any write that reached the log on disk survives a crash.&lt;/p&gt;
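
&lt;p&gt;The log-then-apply order and the replay on startup can be sketched in Python as follows. This is an illustration only, not RadishDB's implementation. The &lt;code&gt;LoggedStore&lt;/code&gt; class, the &lt;code&gt;radish.aof&lt;/code&gt; file name and the one-command-per-line text format are assumptions, and keys are assumed to contain no spaces or newlines.&lt;/p&gt;

```python
import os
import tempfile


class LoggedStore:
    def __init__(self, log_path):
        self.log_path = log_path
        self.data = {}
        self._replay()  # rebuild memory from the log first
        self.log = open(log_path, "a", encoding="utf-8")

    def _replay(self):
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path, encoding="utf-8") as log:
            for line in log:
                self._apply(line.rstrip("\n"))

    def _apply(self, command):
        parts = command.split(" ", 2)
        if parts[0] == "SET" and len(parts) == 3:
            self.data[parts[1]] = parts[2]
        elif parts[0] == "DEL" and len(parts) == 2:
            self.data.pop(parts[1], None)
        # Anything else (such as a torn final line) is ignored.

    def _log_then_apply(self, command):
        self.log.write(command + "\n")
        self.log.flush()             # push Python's buffer to the OS
        os.fsync(self.log.fileno())  # ask the OS to push it to disk
        self._apply(command)         # only now touch memory

    def set(self, key, value):
        self._log_then_apply(f"SET {key} {value}")

    def delete(self, key):
        self._log_then_apply(f"DEL {key}")

    def close(self):
        self.log.close()


path = os.path.join(tempfile.mkdtemp(), "radish.aof")
store = LoggedStore(path)
store.set("name", "alice")
store.set("city", "paris")
store.delete("name")
store.close()  # stand-in for a crash: the in-memory dict is discarded

recovered = LoggedStore(path)
print(recovered.data)
recovered.close()
```

&lt;p&gt;Calling &lt;code&gt;fsync&lt;/code&gt; on every write is the safest and slowest choice. Batching flushes trades a little durability for throughput.&lt;/p&gt;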




&lt;h2&gt;
  
  
  AOF Rewrite: Log Compaction for Faster Recovery
&lt;/h2&gt;

&lt;p&gt;One problem with append-only logs is that they grow forever.&lt;/p&gt;

&lt;p&gt;For example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SET x 1
SET x 2
SET x 3
SET x 4
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Only the final value matters.&lt;/p&gt;

&lt;p&gt;Similarly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SET x 1
DEL x
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The key no longer exists, but the log still contains both operations.&lt;/p&gt;

&lt;p&gt;Over time, this slows startup and wastes disk space.&lt;/p&gt;

&lt;p&gt;To fix this, RadishDB performs AOF rewrite.&lt;/p&gt;

&lt;p&gt;Instead of keeping the full history, it writes only the current state into a new file.&lt;/p&gt;

&lt;p&gt;The process works like this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Create a temporary file&lt;/li&gt;
&lt;li&gt;Write current database state&lt;/li&gt;
&lt;li&gt;Flush to disk using fsync&lt;/li&gt;
&lt;li&gt;Atomically replace the old file using rename&lt;/li&gt;
&lt;/ol&gt;

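&lt;p&gt;Rename is atomic on POSIX systems (the temporary file must be on the same filesystem as the file it replaces). This means that even if a crash happens during the rewrite, the database will always find either the complete old file or the complete new one.&lt;/p&gt;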

&lt;p&gt;This ensures both safety and efficiency.&lt;/p&gt;
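
&lt;p&gt;The four rewrite steps above map fairly directly onto a few standard library calls. Here is a rough Python sketch of them. The &lt;code&gt;rewrite_log&lt;/code&gt; function, the text log format and the &lt;code&gt;radish.aof&lt;/code&gt; file name are assumptions for this example and not RadishDB's actual code.&lt;/p&gt;

```python
import os
import tempfile


def rewrite_log(log_path, state):
    """Replace log_path with a compact log holding one SET per live key."""
    directory = os.path.dirname(os.path.abspath(log_path))
    # 1. Create a temporary file next to the log, so the rename
    #    stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            # 2. Write the current database state.
            for key, value in state.items():
                tmp.write(f"SET {key} {value}\n")
            # 3. Flush to disk.
            tmp.flush()
            os.fsync(tmp.fileno())
        # 4. Atomically replace the old file.
        os.replace(tmp_path, log_path)
    except BaseException:
        # On failure, leave the old log untouched and drop the temp file.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


log_path = os.path.join(tempfile.mkdtemp(), "radish.aof")
with open(log_path, "w", encoding="utf-8") as log:
    log.write("SET x 1\nSET x 2\nSET x 3\nSET x 4\nSET y 1\nDEL y\n")

# The in-memory state after replaying that log is just x = 4.
rewrite_log(log_path, {"x": "4"})
with open(log_path, encoding="utf-8") as log:
    compacted = log.read()
print(compacted)
```

&lt;p&gt;Six log lines shrink to one. A stricter version would also call &lt;code&gt;fsync&lt;/code&gt; on the directory after the rename, so that the new directory entry itself is on disk.&lt;/p&gt;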




&lt;h2&gt;
  
  
  Snapshots: Faster Startup with .rdbx
&lt;/h2&gt;

&lt;p&gt;While AOF is great for durability, replaying a long log can take time.&lt;/p&gt;

&lt;p&gt;To solve this, I implemented snapshots using a custom binary format called &lt;code&gt;.rdbx&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;A snapshot stores the current state of the database, not the history.&lt;/p&gt;

&lt;p&gt;This makes it:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;smaller&lt;/li&gt;
&lt;li&gt;faster to load&lt;/li&gt;
&lt;li&gt;easier to transfer&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Snapshots are useful for backups and fast startup.&lt;/p&gt;

&lt;p&gt;AOF ensures durability. Snapshots ensure speed and portability.&lt;/p&gt;
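
&lt;p&gt;The post does not describe the &lt;code&gt;.rdbx&lt;/code&gt; layout, so the sketch below is not that format. It is a made-up, minimal binary layout that shows the general idea of a snapshot: a marker, an entry count, then length-prefixed keys and values. The &lt;code&gt;SNAP&lt;/code&gt; marker, the 4-byte big-endian lengths and all function names are assumptions.&lt;/p&gt;

```python
import io

MAGIC = b"SNAP"  # hypothetical marker, not the real .rdbx header


def write_field(out, text):
    data = text.encode("utf-8")
    out.write(len(data).to_bytes(4, "big"))  # 4-byte length prefix
    out.write(data)


def read_field(src):
    length = int.from_bytes(src.read(4), "big")
    return src.read(length).decode("utf-8")


def dump_snapshot(out, state):
    """Write the current state only: no history, one record per key."""
    out.write(MAGIC)
    out.write(len(state).to_bytes(4, "big"))  # number of entries
    for key, value in state.items():
        write_field(out, key)
        write_field(out, value)


def load_snapshot(src):
    if src.read(4) != MAGIC:
        raise ValueError("not a snapshot file")
    count = int.from_bytes(src.read(4), "big")
    state = {}
    for _ in range(count):
        key = read_field(src)
        state[key] = read_field(src)
    return state


buffer = io.BytesIO()
dump_snapshot(buffer, {"name": "alice", "city": "paris"})
buffer.seek(0)
restored = load_snapshot(buffer)
print(restored)
```

&lt;p&gt;Loading is a single pass with no command parsing and no overwritten values to replay, which is where the faster startup comes from. A real format would normally add a version number and a checksum.&lt;/p&gt;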




&lt;h2&gt;
  
  
  From Storage Engine to Database Server
&lt;/h2&gt;

&lt;p&gt;At this point, RadishDB could store and recover data. But it wasn’t a real database server yet.&lt;/p&gt;

&lt;p&gt;To make it usable by applications, I implemented a TCP server on port 6379.&lt;/p&gt;

&lt;p&gt;The server:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;creates a socket&lt;/li&gt;
&lt;li&gt;listens for client connections&lt;/li&gt;
&lt;li&gt;parses incoming commands&lt;/li&gt;
&lt;li&gt;executes them&lt;/li&gt;
&lt;li&gt;returns responses&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The architecture separates responsibilities:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;server.c handles networking&lt;/li&gt;
&lt;li&gt;repl.c handles command parsing&lt;/li&gt;
&lt;li&gt;engine.c handles storage logic&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This separation makes the system easier to maintain and extend.&lt;/p&gt;
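
&lt;p&gt;The same split between networking, parsing and storage can be sketched in Python. This is only an illustration of the structure, not RadishDB's server. The newline-terminated text protocol, the reply strings and every name below are assumptions, and the sketch binds to a free port chosen by the OS rather than 6379.&lt;/p&gt;

```python
import socket
import socketserver
import threading

STORE = {}  # stand-in for the storage engine


def execute(line):
    """Parse one text command and run it (the parsing and engine roles)."""
    parts = line.strip().split(" ", 2)
    command = parts[0].upper()
    if command == "SET" and len(parts) == 3:
        STORE[parts[1]] = parts[2]
        return "OK"
    if command == "GET" and len(parts) == 2:
        return STORE.get(parts[1], "(nil)")
    if command == "DEL" and len(parts) == 2:
        return "1" if STORE.pop(parts[1], None) is not None else "0"
    return "ERR unknown command"


class Handler(socketserver.StreamRequestHandler):
    def handle(self):
        # One command per line in, one reply line out (the networking role).
        for raw in self.rfile:
            reply = execute(raw.decode("utf-8"))
            self.wfile.write((reply + "\n").encode("utf-8"))


def start_server(host="127.0.0.1", port=0):
    # port=0 asks the OS for any free port.
    server = socketserver.ThreadingTCPServer((host, port), Handler)
    server.daemon_threads = True
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def send(address, command):
    """Tiny client: send one command and read one reply line."""
    with socket.create_connection(address) as conn:
        conn.sendall((command + "\n").encode("utf-8"))
        with conn.makefile("r", encoding="utf-8") as reader:
            return reader.readline().strip()


server = start_server()
address = server.server_address
print(send(address, "SET name alice"))
print(send(address, "GET name"))
server.shutdown()
server.server_close()
```

&lt;p&gt;Because &lt;code&gt;execute&lt;/code&gt; knows nothing about sockets, it can be tested on its own, which is the practical benefit of keeping the layers apart.&lt;/p&gt;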

&lt;p&gt;RadishDB became a real database service.&lt;/p&gt;




&lt;h2&gt;
  
  
  Containerized Deployment with Docker
&lt;/h2&gt;

&lt;p&gt;To make deployment easier, I containerized RadishDB using Docker.&lt;/p&gt;

&lt;p&gt;The AOF file is stored in a Docker volume, which ensures data persists even if the container stops.&lt;/p&gt;

&lt;p&gt;This makes RadishDB portable and consistent across environments.&lt;/p&gt;

&lt;p&gt;It runs the same on local machines, servers, and CI pipelines.&lt;/p&gt;

&lt;p&gt;GitHub Actions automate builds and deployment.&lt;/p&gt;




&lt;h2&gt;
  
  
  Architecture Overview
&lt;/h2&gt;

&lt;p&gt;RadishDB consists of several components:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Engine&lt;/strong&gt;&lt;br&gt;
Handles in-memory storage, hash table, and command execution.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;AOF&lt;/strong&gt;&lt;br&gt;
Logs every write operation to disk for durability.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;AOF Rewrite&lt;/strong&gt;&lt;br&gt;
Compacts the log by writing only the current state.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Server&lt;/strong&gt;&lt;br&gt;
Handles TCP connections and client communication.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Docker&lt;/strong&gt;&lt;br&gt;
Provides consistent deployment and persistent storage.&lt;/p&gt;

&lt;p&gt;Each component has a clear responsibility.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Building RadishDB Taught Me
&lt;/h2&gt;

&lt;p&gt;This project taught me a lot about how databases actually work.&lt;/p&gt;

&lt;p&gt;I learned:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;how crash recovery works&lt;/li&gt;
&lt;li&gt;how write-ahead logging ensures durability&lt;/li&gt;
&lt;li&gt;how hash tables work internally&lt;/li&gt;
&lt;li&gt;how to design binary file formats&lt;/li&gt;
&lt;li&gt;how to build TCP servers&lt;/li&gt;
&lt;li&gt;how memory management works in C&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;More importantly, it changed how I think about systems.&lt;/p&gt;

&lt;p&gt;Databases are not mysterious. They are carefully designed systems that follow strict rules to ensure data safety.&lt;/p&gt;

&lt;p&gt;Every write, every disk flush, and every recovery step matters.&lt;/p&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;RadishDB started as a small experiment to understand database internals.&lt;/p&gt;

&lt;p&gt;It evolved into a crash-safe database engine with logging, snapshots, networking, and deployment support.&lt;/p&gt;

&lt;p&gt;The project helped me understand durability, persistence, and recovery in a practical way.&lt;/p&gt;

&lt;p&gt;Building it made databases feel less like black boxes and more like systems built from simple, reliable components.&lt;/p&gt;

&lt;p&gt;And that understanding was the real goal.&lt;/p&gt;

&lt;p&gt;GitHub: &lt;a href="https://github.com/pie-314/radishdb" rel="noopener noreferrer"&gt;https://github.com/pie-314/radishdb&lt;/a&gt;&lt;/p&gt;

</description>
      <category>c</category>
      <category>database</category>
      <category>systems</category>
      <category>backend</category>
    </item>
  </channel>
</rss>
