
Dhruva


MomoDB: Building a C++ Key-Value Store to Learn Database Internals (#0)

I am working on a key-value store using modern C++. I'm calling it MomoDB, named after momo (the dumpling).

The primary goal of the exercise is to learn database internals, as well as to polish up my C++, especially the C++20 features.

I've made some progress on the database so far, and this initial post primarily covers that progress, the lessons learned, and the future work.

What I've built so far

The database system currently has:

  • Working TCP server
  • Write ahead logging
  • Ability to rebuild from the log file
  • Support for commands: SET, GET, DEL and EXIT
    • DEL currently just overwrites the value with an empty string, which is not ideal, but works for now. A proper tombstone value needs to be implemented in the future (a rough sketch of one possible approach follows this list).
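
One possible way to represent tombstones (just a sketch, not what MomoDB does today; the class and member names are made up for illustration) is to store std::optional<std::string> values, so a deleted key can hold std::nullopt instead of an empty string:

#include <optional>
#include <string>
#include <unordered_map>

// Sketch of a tombstone-aware keystore fragment: std::nullopt marks a deleted
// key, so "deleted" and "set to an empty string" stay distinguishable.
class KeyStore {
public:
    void del(const std::string& key) { store_[key] = std::nullopt; }  // tombstone

    std::optional<std::string> get(const std::string& key) const {
        auto it = store_.find(key);
        if (it == store_.end() || !it->second.has_value()) {
            return std::nullopt;  // missing or tombstoned
        }
        return it->second;
    }

private:
    std::unordered_map<std::string, std::optional<std::string>> store_;
};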

TCP server

This currently works on a hardcoded port 9001 (it's over 9000!). It opens up the port, listens to it, and accepts connections.
One issue plaguing it currently is that it closes the connection after every command. This holds the system back because the TCP handshake (SYN, SYN-ACK, ACK) and teardown add a small but significant amount of time to each operation. Fixing this alone could boost performance and throughput significantly; the rough shape of a persistent-connection loop is sketched below.
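
A minimal sketch of the fix, assuming plain POSIX sockets (handle_command here is a hypothetical stand-in for whatever parses and executes a command):

#include <string>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical handler: parses one command and returns the reply.
std::string handle_command(const std::string& line);

// Persistent-connection loop: instead of closing the socket after one command,
// keep reading until the client disconnects.
void serve_client(int client_fd) {
    char buf[4096];
    while (true) {
        ssize_t n = recv(client_fd, buf, sizeof(buf), 0);
        if (n <= 0) break;  // client closed the connection (or an error occurred)
        std::string reply = handle_command(std::string(buf, static_cast<size_t>(n)));
        send(client_fd, reply.data(), reply.size(), 0);
    }
    close(client_fd);
}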

Write ahead logging

This is a brilliant but common trick used in the database world to ensure durability. Essentially, you log every operation before you perform it, so that you can recover from crashes and get back to the same state. MomoDB currently has full WAL capabilities: it generates an append-only log, and it can rebuild the keystore correctly from that log.
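
The core idea, as a minimal sketch (the names here are illustrative, not the actual MomoDB code): append and flush the log entry first, then apply the change in memory.

#include <chrono>
#include <format>
#include <fstream>
#include <string>
#include <unordered_map>

// Write-ahead sketch: persist the operation before touching the in-memory
// store, so a crash can always be replayed from disk.
void set_with_wal(std::ofstream& wal,
                  std::unordered_map<std::string, std::string>& store,
                  const std::string& key, const std::string& value) {
    auto ts = std::chrono::duration_cast<std::chrono::nanoseconds>(
                  std::chrono::system_clock::now().time_since_epoch())
                  .count();
    wal << std::format("{}:{}:{}:{}\n", ts, 1, key, value);  // 1 = SET in this sketch
    wal.flush();         // durability first...
    store[key] = value;  // ...then apply in memory
}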

A sample log line currently looks like:

1769273343546937112:1:key_9998:val_9998

The format is:

timestamp:operation_type:key:value

This is deliberately simple for now, to ensure it remains easily debuggable and human-readable.
There are obvious limitations to using : as the delimiter; for example, JSON values cannot currently be stored and parsed correctly. There is also room for performance improvements in the format, but that's something I'll be working on.
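
To make that concrete, here is a rough sketch of splitting a log line on : during rebuild (a simple getline-based split, assumed for illustration); any value that itself contains : would be broken into extra fields:

#include <sstream>
#include <string>
#include <vector>

// Split one log line on ':'. Expected fields: {timestamp, operation_type, key, value}.
std::vector<std::string> split_log_line(const std::string& line) {
    std::vector<std::string> fields;
    std::istringstream in(line);
    std::string field;
    while (std::getline(in, field, ':')) {
        fields.push_back(field);  // a value containing ':' spills into extra fields
    }
    return fields;
}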

A small note about the rebuild functionality: the keystore functions take a flag that controls whether they should log their operations. This flag is set to false when rebuilding from the log, which prevents the keystore from cluttering the log file every time the system starts up.
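
In sketch form (the names are made up for illustration, not the actual MomoDB code), the flag looks something like this:

#include <fstream>
#include <string>
#include <unordered_map>

// The log_operation flag decides whether a write is appended to the WAL.
// Replaying the WAL during startup calls set() with log_operation = false,
// so the rebuild does not append duplicate entries to the log.
class KeyStore {
public:
    void set(const std::string& key, const std::string& value, bool log_operation = true) {
        if (log_operation) {
            wal_ << key << ':' << value << '\n';  // simplified log line
            wal_.flush();
        }
        store_[key] = value;
    }

private:
    std::ofstream wal_{"momodb.wal", std::ios::app};
    std::unordered_map<std::string, std::string> store_;
};

// During rebuild: store.set(key, value, /*log_operation=*/false);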

Internal data structure

I used map initially for the internal key-value store, but later switched to unordered_map. map keeps the keys in sorted order, which is nice for iteration, but it has a higher per-element memory cost and O(log n) insertions and lookups. unordered_map, on the other hand, provides O(1) insertions and lookups on average, and O(n) in the worst case (if you hit a lot of hash collisions), but it does not keep the keys in any particular order.

For this use case, insertion and read speed seem to be more critical going forward, so I switched over to unordered_map for now, but the exploration of better data structures continues.
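
A tiny example of the trade-off in practice:

#include <iostream>
#include <map>
#include <string>
#include <unordered_map>

// std::map iterates in sorted key order; std::unordered_map iterates in an
// implementation-defined order.
int main() {
    std::map<std::string, std::string> ordered{{"b", "2"}, {"a", "1"}, {"c", "3"}};
    std::unordered_map<std::string, std::string> hashed{{"b", "2"}, {"a", "1"}, {"c", "3"}};

    for (const auto& [k, v] : ordered) std::cout << k << ' ';  // always: a b c
    std::cout << '\n';
    for (const auto& [k, v] : hashed) std::cout << k << ' ';   // no guaranteed order
    std::cout << '\n';
}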

Performance baseline

I benchmarked the server with a small Python script that makes sequential requests from a single client (connecting, sending a command, getting a response, disconnecting, repeat), and here's the performance I currently have:

Successfully completed: 10000/10000
Total time: 1.07 seconds
Performance: 9347.80 operations/sec

This is considerably slower than something like Redis (which can do 100,000+ operations/sec!), but I'm sure this can be improved drastically with networking improvements and other optimizations. I'm primarily looking into implementing persistent connections first, and then moving to epoll or io_uring to speed things up.
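
For reference, here is a rough sketch of the shape an epoll-based loop might take (this is not MomoDB's current code; error handling and the actual command processing are omitted):

#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

// One thread watches the listening socket plus all client sockets, and only
// touches the ones that are ready to read.
void event_loop(int listen_fd) {
    int epfd = epoll_create1(0);
    epoll_event ev{};
    ev.events = EPOLLIN;
    ev.data.fd = listen_fd;
    epoll_ctl(epfd, EPOLL_CTL_ADD, listen_fd, &ev);

    epoll_event events[64];
    while (true) {
        int n = epoll_wait(epfd, events, 64, -1);
        for (int i = 0; i < n; ++i) {
            int fd = events[i].data.fd;
            if (fd == listen_fd) {
                // New client: register it with epoll as well.
                int client_fd = accept(listen_fd, nullptr, nullptr);
                epoll_event cev{};
                cev.events = EPOLLIN;
                cev.data.fd = client_fd;
                epoll_ctl(epfd, EPOLL_CTL_ADD, client_fd, &cev);
            } else {
                char buf[4096];
                ssize_t r = recv(fd, buf, sizeof(buf), 0);
                if (r <= 0) {
                    epoll_ctl(epfd, EPOLL_CTL_DEL, fd, nullptr);
                    close(fd);
                    continue;
                }
                // parse the command from buf and send a reply here
            }
        }
    }
}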

Key takeaways

  • Persistence is difficult. Getting WAL working properly took some effort, and there are still potential avenues for improvement, even discounting the log format.
  • Networking can be a huge bottleneck. I assumed the performance would be determined largely by the choice of data structure, but the networking overhead was holding things back even with the map-based store. And I'm still testing locally!
  • Modern C++ features are pretty amazing! I've been using string_view a lot, and it offers a wonderful way to inspect and handle strings without copying (a small example follows the snippet below). I also work with Go and Rust frequently, and C++ tends to get bashed a lot, but modern C++ is really fun and interesting, and a really nice break from using Python, which tends to hide a lot of internal details. There's also std::format now, which allows you to do this:
auto display_string = std::format("{}:{}", key, value);
std::cout << display_string << std::endl;
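
And here's a small illustration of the string_view pattern mentioned above (the request string is just an example):

#include <iostream>
#include <string>
#include <string_view>

// Inspect pieces of a request line without copying the underlying string.
int main() {
    std::string request = "SET key_1:val_1";
    std::string_view view = request;

    if (view.starts_with("SET ")) {                  // starts_with is C++20
        std::string_view payload = view.substr(4);   // no copy, just a window
        std::cout << "payload: " << payload << '\n';
    }
}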

What's next?

I primarily need to fix the networking to get a major performance boost, and that will be my main focus going forward. It's forcing me to build a low-level understanding of networking and how it works on Linux systems, which is both frustrating and fun in its own way.

Then come changes to the logging format, as well as improvements to the way the keystore is rebuilt.

And then finally, the absence of tests is bothering me. So I plan on adding some tests in the near future, once these two have been resolved.

AI usage

I am consciously avoiding the use of AI in this project as far as possible, especially in the code. This is helping me learn and build an understanding of concepts, which would never happen if I handed it over to AI tools. My progress is obviously slower. I would've probably been able to finish the entire project much faster, but would I even be skilled enough to understand what the AI generated and fix the issues? I don't think so.

You can find the repository here: https://github.com/dhruva71/MomoDB

If you're working on something similar or have suggestions for the networking layer, I'd love to hear about it in the comments. I'll be posting updates as I tackle persistent connections and the log format redesign.
