hash-anu

Posted on Mar 5

SNKV: A key-value store based on SQLite's B-tree engine

#keyvalue #sqlite #algorithms

There are many key-value stores available, such as RocksDB, LevelDB,
LMDB, etc. However, these often consume quite a lot of memory. When I
searched on Reddit, I found that whenever someone asks about using a
key-value store, people frequently suggest using SQLite.

Example discussion:
https://www.reddit.com/r/rust/comments/1ls5ynr/recommend_a_keyvalue_store/
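For context, here is a minimal sketch of what "using SQLite as a key-value store" usually looks like through the full SQL stack, using Python's standard sqlite3 module. The class name SqliteKV and the table name kv are made up for illustration; this is not SNKV code.

```python
import sqlite3


class SqliteKV:
    """Tiny key-value wrapper over one SQLite table (illustrative only)."""

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
        # One table, keyed directly on the key bytes.
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS kv "
            "(key BLOB PRIMARY KEY, value BLOB) WITHOUT ROWID"
        )

    def put(self, key: bytes, value: bytes) -> None:
        with self.conn:  # commits on success, rolls back on exception
            self.conn.execute(
                "INSERT OR REPLACE INTO kv (key, value) VALUES (?, ?)",
                (key, value),
            )

    def get(self, key: bytes):
        row = self.conn.execute(
            "SELECT value FROM kv WHERE key = ?", (key,)
        ).fetchone()
        return row[0] if row else None

    def delete(self, key: bytes) -> None:
        with self.conn:
            self.conn.execute("DELETE FROM kv WHERE key = ?", (key,))
```

Every call here goes through SQL parsing and query planning before it reaches the storage engine.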

Even though SQLite is a SQL database, people often use it as a
key-value store. This made me want to develop my own key-value store
that uses SQLite's storage engine instead of the entire SQLite stack. AI
really helped me understand the lower layers of SQLite, such as the
B-tree, pager, and OS layers.

This led to the creation of:

https://github.com/hash-anu/snkv

Architecture

The architecture of SNKV is very simple:

kvstore layer -> b-tree layer -> pager -> os

The lower layers are already battle-tested and production-ready.

My task was to develop kvstore.c so that it consumes the APIs of the
B-tree layer while the lower layers work without any issues. In kvstore,
I mostly used SQLite utility functions, so there are no third-party
library dependencies and everything can be encapsulated in a single file.

How to Use SNKV

C / C++

If you have a C/C++ project and want to use SNKV, it's very simple.

Generate snkv.h:

make snkv.h

Or download the ZIP from the 0.4.0 release.

Since it is a single-header kvstore, you can include it directly in
your project. In exactly one of your .c or .cpp files, define the
implementation macro before including the header:

#define SNKV_IMPLEMENTATION
#include "snkv.h"

This ensures the implementation is compiled into your executable.

You can also review the API specification:

https://github.com/hash-anu/snkv/blob/master/docs/api/API_SPECIFICATION.md

Python

Python developers can simply install it using pip:

pip install snkv

Documentation for Python APIs:

https://github.com/hash-anu/snkv/blob/master/docs/python_api/API.md

Optimizations in KVStore

The entire SQL layer is bypassed. Data is stored in the format:

key_len | key | value

inside the B-tree.
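To make the layout concrete, here is a small sketch that packs and unpacks a record in the key_len | key | value shape. The width and byte order of key_len (4 bytes, big-endian) are assumptions chosen for illustration, and encode_record / decode_record are hypothetical names; SNKV's actual on-disk encoding may differ.

```python
import struct

# Assumption: key_len is a 4-byte big-endian unsigned integer.
KEY_LEN_FMT = ">I"
KEY_LEN_SIZE = struct.calcsize(KEY_LEN_FMT)


def encode_record(key: bytes, value: bytes) -> bytes:
    """Lay out a record as key_len | key | value."""
    return struct.pack(KEY_LEN_FMT, len(key)) + key + value


def decode_record(record: bytes):
    """Split a record back into (key, value) using the length prefix."""
    (key_len,) = struct.unpack_from(KEY_LEN_FMT, record, 0)
    key_end = KEY_LEN_SIZE + key_len
    if key_end > len(record):
        raise ValueError("record is shorter than its declared key length")
    return record[KEY_LEN_SIZE:key_end], record[key_end:]
```

Because the key length is stored up front, the value needs no length field of its own: it is whatever remains after the key.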

This format allows values to be fetched by key prefix quickly, in
O(log n) time.

Each table maintains a cached read cursor, improving the
performance of read operations such as get, exists, etc.

By default, a read transaction starts immediately after
kvstore_open. This reduces the time needed to find mxFrame
(the last valid frame in the WAL) during read operations.
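The prefix lookup mentioned above can be pictured as "seek to the first key at or after the prefix, then walk forward while keys still match". The sketch below uses a sorted Python list and bisect as a stand-in for the ordered keys of a B-tree; prefix_scan is a hypothetical helper, not an SNKV function.

```python
import bisect


def prefix_scan(sorted_keys, prefix: bytes):
    """Yield every key starting with prefix from a sorted list of byte keys.

    The bisect call is the O(log n) seek; the loop then visits only
    the matching keys, which sit next to each other in sorted order.
    """
    i = bisect.bisect_left(sorted_keys, prefix)
    while i < len(sorted_keys) and sorted_keys[i].startswith(prefix):
        yield sorted_keys[i]
        i += 1


keys = sorted([b"user:1", b"user:2", b"order:9", b"user:10", b"video:3"])
matches = list(prefix_scan(keys, b"user:"))
```

The seek cost is logarithmic in the number of keys; the total cost also grows with the number of matches returned.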

For write operations:

The read transaction is committed.
A write transaction runs and commits.
The read transaction starts again.
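The sequence above can be sketched as a toy model that only records the order of events. TxnCycleModel and its methods are invented for illustration and touch no real database; the rollback branch is an assumption about failure handling that the steps above do not describe.

```python
class TxnCycleModel:
    """Toy model of the read/write transaction cycle; records events only."""

    def __init__(self):
        self.events = []
        self.read_open = False
        self._begin_read()  # mirrors: a read transaction starts right after open

    def _begin_read(self):
        self.read_open = True
        self.events.append("begin read")

    def _commit_read(self):
        self.read_open = False
        self.events.append("commit read")

    def write(self, apply_changes):
        """Run apply_changes inside a write transaction."""
        self._commit_read()                 # step 1: end the standing read txn
        self.events.append("begin write")   # step 2: run the write txn
        try:
            apply_changes()
        except Exception:
            # Assumption: a failed write is rolled back.
            self.events.append("rollback write")
            raise
        else:
            self.events.append("commit write")
        finally:
            self._begin_read()              # step 3: reopen the read txn
```

Usage: `TxnCycleModel().write(lambda: None)` leaves the model with a read transaction open again, so later reads do not pay the cost of starting one.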

Examples

Examples demonstrating different SNKV use cases:

https://github.com/hash-anu/snkv/tree/master/python/examples
https://github.com/hash-anu/snkv/tree/master/examples

Crash Testing

A crash test was implemented that:

Writes deterministic key-value pairs into a 10 GB WAL-mode database
Forcefully kills the writer using SIGKILL during active writes
Verifies the database contents on restart

Results:

Every committed transaction exists with byte-exact values
No partial transactions are visible
The database shows zero corruption
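As a rough sketch of how such a check can work: if every key and value is derived from an index, the verifier can recompute what should be on disk without keeping a separate copy. The code below is not the actual test harness. make_pair and verify are hypothetical names, a plain dict stands in for the reopened database, and the batch layout (one transaction per fixed-size batch, with a count of acknowledged batches) is an assumption.

```python
import hashlib


def make_pair(i: int):
    """Derive a deterministic key and value from an index."""
    key = f"key-{i:012d}".encode()
    value = hashlib.sha256(key).digest()
    return key, value


def verify(store, acked_batches: int, total_batches: int, batch_size: int):
    """Return a list of problems found in store after a simulated crash.

    Batches below acked_batches were acknowledged as committed and must be
    fully present with exact values. Later batches may be fully present or
    fully absent, but never partially visible.
    """
    problems = []
    for b in range(total_batches):
        pairs = [make_pair(b * batch_size + j) for j in range(batch_size)]
        found = [k in store for k, _ in pairs]
        exact = [store.get(k) == v for k, v in pairs]
        if b < acked_batches:
            if not all(exact):
                problems.append(f"batch {b}: committed data missing or altered")
        elif any(found) and not all(exact):
            problems.append(f"batch {b}: partial transaction visible")
    return problems
```

An empty list from verify corresponds to the three results above: committed data is byte-exact and no partial batch is visible.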

Ecosystem Support

Since SNKV's storage engine is the same as SQLite's, tools that rely only on the lower layers can be used directly with SNKV. I have verified tools such as:

LiteFS
WAL-based backup tools
Rollback journal tools

Next Step

Try SNKV and experiment with it.

If you encounter any issues or have suggestions, please open an issue:

https://github.com/hash-anu/snkv/issues

Feedback and thoughts are welcome.
