DEV Community

Sandesh Ojha
Sandesh Ojha

Posted on

I built a Redis clone in C: From blocking I/O to epoll to crash safe persistence

I had been hearing about sockets, TCP connections, and networking for years without truly understanding what any of it meant at the code level. Building C-dis changed that. I learned the full socket lifecycle, what happens on both sides of a connection, how the kernel manages buffers, and how to handle non-blocking I/O states inside a manual event loop.What surprised me most was that a single-threaded architecture using epoll could handle tens of thousands of requests per second: no threads, no locking, no complexity. Architectural efficiency beats brute-force processing more often than people realize.The problem: One client at a timeMy first version was a simple blocking server. It worked, but the limitation became obvious immediately.

while (1) {    
    int client_fd = accept(server_fd, (struct sockaddr *)&client_addr, &client_len);    

    // Server is completely frozen here until this client finishes
    handle_client(client_fd, ht);    
    close(client_fd);
}
Enter fullscreen mode Exit fullscreen mode


c
While handle_client is running, the server cannot accept new connections, cannot read from other clients, and cannot do anything else. One client holds the entire server hostage. A second client connecting during this time sits in the kernel's backlog queue, waiting silently with no feedback. In production, this means a slow client degrades the experience for every other user on the server.The fix isn't to add threads. The fix is to stop blocking.The data engine: Building a hash table from scratchBefore writing the network layer, I needed somewhere to store data. The obvious choice was a simple array with linear search, but that is O(n) lookup. For a database, that is unacceptable.I looked at how Redis 2.0 approached this. Hash tables. O(1) average lookup, regardless of how many keys are stored. I implemented one from scratch using the djb2 hash function:

static unsigned int hash(const char *key) {    
    unsigned long int hashval = 5381;    
    int c;    

    while ((c = *key++))        
        hashval = ((hashval << 5) + hashval) + c;    

    return hashval & (HASH_SIZE - 1);
}
Enter fullscreen mode Exit fullscreen mode

The (hashval << 5) + hashval is equivalent to hashval * 33, a multiplier chosen because it distributes keys evenly across buckets with minimal clustering. The & (HASH_SIZE - 1) replaces modulo with a bitwise AND, which is faster when the table size is a power of two.When two keys hash to the same bucket (a collision) they chain together in a linked list.

Lookup walks the chain comparing keys until it finds a match or hits NULL. With a good hash function and a reasonable load factor, chains stay short and lookup stays effectively O(1).select vs epoll: The architecture that makes this fast to handle multiple clients without threads, I needed I/O multiplexing: a way to ask the kernel "which of these sockets has data ready?" and act only on those.I implemented select first, then replaced it with epoll. The difference taught me more about systems design than any other part of this project.The problem with select:Every time you call select, you rebuild the entire watch list from scratch and hand it to the kernel. The kernel scans every file descriptor in the set (whether active or idle) and returns a modified bitmask showing which ones are ready. With 1,000 connected clients and only 1 active, the kernel scans all 1,000 every time. That is O(n) per event loop iteration.

while (1) {    
    fd_set readfds;    
    FD_ZERO(&readfds);    
    FD_SET(server_fd, &readfds);    

    for (int i = 0; i < MAX_CLIENTS; i++)        
        if (clients[i] != -1)            
            FD_SET(clients[i], &readfds);  // rebuild every iteration    

    select(max_fd + 1, &readfds, NULL, NULL, NULL);    
    // now scan all fds again to find which ones triggered
}
Enter fullscreen mode Exit fullscreen mode

How epoll fixes it:epoll moves the interest list into the kernel permanently. You register a file descriptor once with epoll_ctl. The kernel maintains it. When you call epoll_wait, it returns only the file descriptors that are actually ready, not all of them.
C

// register once
epoll_ctl(epfd, EPOLL_CTL_ADD, client_fd, &ev);

// wait: returns only ready fds
int n = epoll_wait(epfd, events, 64, -1);
for (int i = 0; i < n; i++) {    
    // every event here is guaranteed ready    
    handle(events[i].data.fd);
}
Enter fullscreen mode Exit fullscreen mode

This is a synchronous benchmark. Each command waits for a response before sending the next. It measures worst case single client throughput. Pipelining multiple commands per connection would push this significantly higher.Zero memory leaks across 40,000 operations verified with valgrind --leak-check=full. Every malloc in the hash table has a matching free in ht_destroy.What is nextThe next project is a container runtime: implementing Linux namespaces, cgroups, and pivot_root in C to build a minimal docker run from scratch. Understanding how containers actually work at the kernel level, not just how to use them.The code for C-dis is on GitHub. If you are building something similar or have questions about the epoll implementation, I am happy to talk through it in the comments below!

Top comments (0)