Posted on Feb 24

How I Built a Crash-Safe Database Engine in C with Write-Ahead Logging and Snapshots

#c #database #systems #backend

Most developers use databases every day. Few actually know what happens when the power goes out mid-write, or when a system crashes halfway through saving data. Yet when the database restarts, everything is still there. That reliability isn’t magic. It comes from careful engineering.

I wanted to understand this at a deeper level, so I built RadishDB. It started as a simple in-memory key–value store in C. Over time, I added persistence, crash recovery, write-ahead logging, snapshots, TTL expiration, a TCP server, and Docker deployment.

The goal wasn’t to compete with Redis. It was to understand how systems like Redis actually work under the hood.

Why C?

Since RadishDB is fundamentally a storage engine, performance and predictability matter a lot. I wanted full control over memory and disk behavior.

C gives you:

direct control over memory
no garbage collector
predictable performance
minimal abstraction between code and hardware

When you're building a database, memory layout and disk writes are not abstract ideas. They are the system itself.

This is also why many real databases like Redis, SQLite, and PostgreSQL are written in C. The language doesn’t hide anything. If something goes wrong, you can usually see exactly why.

It also forces you to think carefully about every allocation, every pointer, and every write to disk.

The Core: In-Memory Storage and Hashtable

RadishDB stores all data in memory. This makes reads and writes extremely fast, since RAM access is much faster than disk access.

To organize data efficiently, I used a hash table.

Hash Tables

Hash tables support lookup, insertion, and deletion in O(1) time on average.

When a key is inserted, RadishDB computes a hash and maps it to a bucket.

I used the djb2 hash function:

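#include <stddef.h>

unsigned long hash(const char *str) {
  unsigned long h = 5381;  /* djb2 starting value */
  for (size_t i = 0; str[i] != '\0'; i++) {
    /* h * 33 + c; the cast keeps bytes >= 0x80 from being sign-extended */
    h = h * 33 + (unsigned char)str[i];
  }
  return h;
}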

If multiple keys map to the same bucket, they are stored using separate chaining with a linked list.

Operations stay fast as long as the chains remain short, because a lookup only walks the entries that share a bucket.
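To make the bucket-and-chain idea concrete, here is a minimal sketch of separate chaining. It is not RadishDB's actual code: the names `Table`, `Entry`, `table_set`, `table_get`, `table_del`, and the fixed `NBUCKETS` size are assumptions made for illustration, and a real table would also resize as it fills.

```c
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 64  /* illustrative fixed size; hypothetical, no resizing */

typedef struct Entry {
    char *key;
    char *value;
    struct Entry *next;  /* next entry in the same bucket's chain */
} Entry;

typedef struct {
    Entry *buckets[NBUCKETS];
} Table;

static unsigned long djb2(const char *s) {
    unsigned long h = 5381;
    while (*s) h = h * 33 + (unsigned char)*s++;
    return h;
}

static char *dup_str(const char *s) {
    size_t n = strlen(s) + 1;
    char *p = malloc(n);
    if (p) memcpy(p, s, n);
    return p;
}

/* Insert or overwrite. Returns 0 on success, -1 on allocation failure. */
int table_set(Table *t, const char *key, const char *value) {
    unsigned long i = djb2(key) % NBUCKETS;
    char *v = dup_str(value);
    if (!v) return -1;
    for (Entry *e = t->buckets[i]; e; e = e->next) {
        if (strcmp(e->key, key) == 0) {  /* key exists: replace its value */
            free(e->value);
            e->value = v;
            return 0;
        }
    }
    Entry *e = malloc(sizeof *e);
    char *k = dup_str(key);
    if (!e || !k) { free(e); free(k); free(v); return -1; }
    e->key = k;
    e->value = v;
    e->next = t->buckets[i];  /* push onto the front of the chain */
    t->buckets[i] = e;
    return 0;
}

/* Returns the stored value, or NULL if the key is absent. */
const char *table_get(const Table *t, const char *key) {
    for (const Entry *e = t->buckets[djb2(key) % NBUCKETS]; e; e = e->next)
        if (strcmp(e->key, key) == 0) return e->value;
    return NULL;
}

/* Unlink and free the entry. Returns 1 if a key was removed, 0 otherwise. */
int table_del(Table *t, const char *key) {
    Entry **pp = &t->buckets[djb2(key) % NBUCKETS];
    for (; *pp; pp = &(*pp)->next) {
        if (strcmp((*pp)->key, key) == 0) {
            Entry *dead = *pp;
            *pp = dead->next;
            free(dead->key);
            free(dead->value);
            free(dead);
            return 1;
        }
    }
    return 0;
}
```

A zero-initialized `Table` is an empty table, so `Table t = {0}; table_set(&t, "name", "alice");` is enough to start using it.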

At this stage, RadishDB was fast, but fragile. Everything lived in memory. If the process crashed, all data was gone.

That led to the next problem.

The Problem: Surviving Crashes

An in-memory database is fast, but memory disappears when the process stops.

To solve this, I implemented Write-Ahead Logging (WAL) using an Append-Only File (AOF).

The idea is simple but powerful.

Every write operation is written to disk first, and only then applied to memory.

For example:

SET name alice
DEL name

These commands are appended to a log file.

If the database crashes, RadishDB reads this file during startup and replays the operations to rebuild memory.

The log becomes the source of truth. Memory becomes a reconstructed state.

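This provides durability, as long as each log append actually reaches the disk (for example via fsync) before the write is acknowledged.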
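As a rough illustration of the append-then-replay cycle, here is a sketch under stated assumptions rather than RadishDB's real implementation. The one-command-per-line text format follows the SET/DEL examples above. The names `wal_append`, `wal_next`, `WalOp`, and the buffer sizes are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L  /* for fileno() and fsync() */
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Append one command line to the log and force it to stable storage.
   Returns 0 on success, -1 on failure. */
int wal_append(FILE *log, const char *line) {
    if (fprintf(log, "%s\n", line) < 0) return -1;
    if (fflush(log) != 0) return -1;         /* stdio buffer -> kernel */
    if (fsync(fileno(log)) != 0) return -1;  /* kernel cache -> disk */
    return 0;
}

typedef struct {
    char op[8];       /* "SET" or "DEL" */
    char key[256];
    char value[512];  /* empty string for DEL */
} WalOp;

/* Read the next complete record from the log.
   Returns 1 if *out was filled, 0 at end of log or at the first
   incomplete or malformed line (e.g. a write cut short by a crash). */
int wal_next(FILE *f, WalOp *out) {
    char line[1024];
    if (!fgets(line, sizeof line, f)) return 0;
    size_t len = strlen(line);
    /* A record without its trailing newline was never fully written.
       (Lines longer than the buffer are also rejected in this sketch.) */
    if (len == 0 || line[len - 1] != '\n') return 0;
    line[len - 1] = '\0';
    out->value[0] = '\0';
    int got = sscanf(line, "%7s %255s %511[^\n]", out->op, out->key, out->value);
    if (got == 3 && strcmp(out->op, "SET") == 0) return 1;
    if (got == 2 && strcmp(out->op, "DEL") == 0) return 1;
    return 0;
}
```

On startup, a loop such as `while (wal_next(f, &op))` would apply each SET or DEL to the in-memory table. Stopping at the first incomplete line is one simple way to tolerate a crash that interrupted the final append.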

AOF Rewrite: Log Compaction for Faster Recovery

One problem with append-only logs is that they grow forever.

For example:

SET x 1
SET x 2
SET x 3
SET x 4

Only the final value matters.

Similarly:

SET x 1
DEL x

The key no longer exists, but the log still contains both operations.

Over time, this slows startup and wastes disk space.

To fix this, RadishDB performs AOF rewrite.

Instead of keeping the full history, it writes only the current state into a new file.

The process works like this:

Create a temporary file
Write current database state
Flush to disk using fsync
Atomically replace the old file using rename

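On POSIX systems, rename atomically replaces the destination file, provided both paths are on the same filesystem. Even if a crash happens during the rewrite, the database is left with either the complete old file or the complete new one, never a half-written file.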

This ensures both safety and efficiency.
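The four rewrite steps can be sketched roughly as follows. This is an illustration, not the project's code: `aof_rewrite`, the `KV` struct, and the `.tmp` suffix are assumed names, and the current state is passed in as a plain array to keep the example self-contained.

```c
#define _POSIX_C_SOURCE 200809L  /* for fileno() and fsync() */
#include <stdio.h>
#include <unistd.h>

typedef struct {
    const char *key;
    const char *value;
} KV;

/* Write one SET per live key to a temp file, fsync it, then rename it
   over the live log. Returns 0 on success, -1 on failure; on failure
   the existing log at `path` is left untouched. */
int aof_rewrite(const char *path, const KV *items, size_t n) {
    char tmp[4096];
    int w = snprintf(tmp, sizeof tmp, "%s.tmp", path);
    if (w < 0 || (size_t)w >= sizeof tmp) return -1;

    FILE *f = fopen(tmp, "w");                    /* 1. temporary file */
    if (!f) return -1;

    for (size_t i = 0; i < n; i++) {              /* 2. current state only */
        if (fprintf(f, "SET %s %s\n", items[i].key, items[i].value) < 0)
            goto fail;
    }

    if (fflush(f) != 0 || fsync(fileno(f)) != 0)  /* 3. flush to disk */
        goto fail;
    if (fclose(f) != 0) { remove(tmp); return -1; }

    if (rename(tmp, path) != 0) {                 /* 4. atomic replace */
        remove(tmp);
        return -1;
    }
    return 0;

fail:
    fclose(f);
    remove(tmp);
    return -1;
}
```

A stricter version would also fsync the containing directory after `rename`, so that the new directory entry itself survives a power loss.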

Snapshots: Faster Startup with .rdbx

While AOF is great for durability, replaying a long log can take time.

To solve this, I implemented snapshots using a custom binary format called .rdbx.

A snapshot stores the current state of the database, not the history.

This makes it:

smaller
faster to load
easier to transfer

Snapshots are useful for backups and fast startup.

AOF ensures durability. Snapshots ensure speed and portability.
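To show what "state, not history" can look like on disk, here is a sketch of a length-prefixed binary layout. The real .rdbx format is not described in this post, so everything here is an assumption for illustration: the `RDBX` magic bytes, the little-endian 32-bit lengths, and the names `snapshot_write`, `snapshot_read_header`, `snapshot_read_str`, and `SnapKV`.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical layout (not the real .rdbx format):
   "RDBX" | u32 count | count * (u32 klen | key | u32 vlen | value)
   All integers are little-endian. */

typedef struct {
    const char *key;
    const char *value;
} SnapKV;

static int put_u32(FILE *f, uint32_t v) {
    unsigned char b[4] = {
        (unsigned char)(v & 0xff), (unsigned char)((v >> 8) & 0xff),
        (unsigned char)((v >> 16) & 0xff), (unsigned char)((v >> 24) & 0xff)
    };
    return fwrite(b, 1, 4, f) == 4 ? 0 : -1;
}

static int get_u32(FILE *f, uint32_t *v) {
    unsigned char b[4];
    if (fread(b, 1, 4, f) != 4) return -1;
    *v = (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
         ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
    return 0;
}

static int put_str(FILE *f, const char *s) {
    size_t n = strlen(s);
    if (n > UINT32_MAX || put_u32(f, (uint32_t)n) != 0) return -1;
    return fwrite(s, 1, n, f) == n ? 0 : -1;
}

/* Write the whole state in one pass. Returns 0 on success, -1 on error. */
int snapshot_write(FILE *f, const SnapKV *items, uint32_t n) {
    if (fwrite("RDBX", 1, 4, f) != 4 || put_u32(f, n) != 0) return -1;
    for (uint32_t i = 0; i < n; i++) {
        if (put_str(f, items[i].key) != 0 || put_str(f, items[i].value) != 0)
            return -1;
    }
    return 0;
}

/* Check the magic bytes and read the entry count. */
int snapshot_read_header(FILE *f, uint32_t *count) {
    char magic[4];
    if (fread(magic, 1, 4, f) != 4 || memcmp(magic, "RDBX", 4) != 0) return -1;
    return get_u32(f, count);
}

/* Read one length-prefixed string into buf and NUL-terminate it.
   Fails if the string does not fit in cap bytes. */
int snapshot_read_str(FILE *f, char *buf, size_t cap) {
    uint32_t n;
    if (get_u32(f, &n) != 0 || n >= cap) return -1;
    if (fread(buf, 1, n, f) != n) return -1;
    buf[n] = '\0';
    return 0;
}
```

Loading is then a single sequential read: check the header, then read `count` key and value pairs straight into the table, with no commands to parse or replay.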

From Storage Engine to Database Server

At this point, RadishDB could store and recover data. But it wasn’t a real database server yet.

To make it usable by applications, I implemented a TCP server on port 6379.

The server:

creates a socket
listens for client connections
parses incoming commands
executes them
returns responses
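The steps in that list map onto a handful of POSIX socket calls. The sketch below is a simplified illustration, not the contents of server.c: `make_listener`, `serve_one_command`, and `execute_stub` are hypothetical names, the newline-terminated text protocol and the reply strings are assumptions, and the stub stands in for the real engine.

```c
#define _POSIX_C_SOURCE 200809L
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create a TCP socket bound to 127.0.0.1:port and start listening.
   Returns the listening fd, or -1 on error. */
int make_listener(uint16_t port) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return -1;
    int yes = 1;
    setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof yes);

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) != 0 ||
        listen(fd, 16) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical stand-in for the storage engine: acknowledges SET and DEL
   without storing anything. */
const char *execute_stub(const char *line) {
    if (strncmp(line, "SET ", 4) == 0 || strncmp(line, "DEL ", 4) == 0)
        return "OK\n";
    return "ERR unknown command\n";
}

/* Read one newline-terminated command, execute it, and send the reply.
   Returns 0 on success, -1 if the client closed the connection, sent an
   overlong line, or the reply could not be written in full. */
int serve_one_command(int client) {
    char line[512];
    size_t len = 0;
    int complete = 0;
    while (len < sizeof line - 1) {
        char c;
        ssize_t r = read(client, &c, 1);  /* byte at a time: simple, not fast */
        if (r <= 0) return -1;
        if (c == '\n') { complete = 1; break; }
        line[len++] = c;
    }
    if (!complete) return -1;
    if (len > 0 && line[len - 1] == '\r') len--;  /* tolerate CRLF clients */
    line[len] = '\0';

    const char *reply = execute_stub(line);
    size_t n = strlen(reply);
    return write(client, reply, n) == (ssize_t)n ? 0 : -1;
}
```

A server loop would call `make_listener(6379)`, then repeatedly `accept` a client and call `serve_one_command` until it returns -1. Handling many clients at once needs threads or an event loop, which this sketch leaves out.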

The architecture separates responsibilities:

server.c handles networking
repl.c handles command parsing
engine.c handles storage logic

This separation makes the system easier to maintain and extend.

RadishDB became a real database service.

Containerized Deployment with Docker

To make deployment easier, I containerized RadishDB using Docker.

The AOF file is stored in a Docker volume, so the data persists even if the container is stopped, removed, or recreated.

This makes RadishDB portable and consistent across environments.

It runs the same on local machines, servers, and CI pipelines.

GitHub Actions automate builds and deployment.

Architecture Overview

RadishDB consists of several components:

Engine
Handles in-memory storage, hash table, and command execution.

AOF
Logs every write operation to disk for durability.

AOF Rewrite
Compacts the log by writing only the current state.

Server
Handles TCP connections and client communication.

Docker
Provides consistent deployment and persistent storage.

Each component has a clear responsibility.

What Building RadishDB Taught Me

This project taught me a lot about how databases actually work.

I learned:

how crash recovery works
how write-ahead logging ensures durability
how hash tables work internally
how to design binary file formats
how to build TCP servers
how memory management works in C

More importantly, it changed how I think about systems.

Databases are not mysterious. They are carefully designed systems that follow strict rules to ensure data safety.

Every write, every disk flush, and every recovery step matters.

Conclusion

RadishDB started as a small experiment to understand database internals.

It evolved into a crash-safe database engine with logging, snapshots, networking, and deployment support.

The project helped me understand durability, persistence, and recovery in a practical way.

Building it made databases feel less like black boxes and more like systems built from simple, reliable components.

And that understanding was the real goal.

GitHub: https://github.com/pie-314/radishdb
