Arindam Roy

I created a NoSQL DB using Rust

Well, when I say I created a NoSQL DB using Rust, I mean kind of. People ask me why I would do that, and I always ask myself: why not? Someone once thought that making JS in 10 days would be a good idea, so I really don't think this counts as the worst idea ever.

So, a NoSQL DB. Let's start with the basics: what is a database? From our computer science classes we all know that a database is a tool for storing and accessing data.

A database consists of three main parts. It can have more components, but these are the basic ones:

visual presentation of my db
1) Data model (how we define and represent the documents stored in the DB)
2) Storage engine (responsible for storing data, managing disk and memory, and reading or deleting data)
3) Query engine (responsible for talking to the DB)

OK, now let's define the document model first. I'm modeling this on the MongoDB document model. MongoDB is written in C++ and supports many more data types, but we will make a very simple implementation of the idea.

#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct Document {
    id: Uuid,
    created_at: DateTime<Utc>,
    updated_at: DateTime<Utc>,
    data: HashMap<String, Value>,
}
// document model

#[derive(Debug, Clone, Serialize, Deserialize, PartialEq)]
pub enum Value {
    Null,
    Boolean(bool),
    Integer(i64),
    Float(f64),
    String(String),
    Array(Vec<Value>),
    Object(HashMap<String, Value>),
    Date(DateTime<Utc>),
}

// basically we can put any kind of data into our db
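
The test code later in this post calls `Document::new()`, `doc.insert(...)` and `doc.id()`, which are not shown above. Here is a rough sketch of what those methods might look like. It is a simplified stand-in and not the exact code from the repo: to keep it compilable without external crates, the id is a plain counter instead of a `Uuid`, timestamps are `SystemTime` instead of `DateTime<Utc>`, `Value` is cut down to three variants, and the `NEXT_ID` counter and the `get` method are names I made up for illustration.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicU64, Ordering};
use std::time::SystemTime;

// Cut-down stand-in for the Value enum above.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Null,
    Integer(i64),
    String(String),
}

// Hypothetical id source: a process-wide counter standing in for a UUID.
static NEXT_ID: AtomicU64 = AtomicU64::new(1);

#[derive(Debug, Clone)]
pub struct Document {
    id: u64,
    created_at: SystemTime,
    updated_at: SystemTime,
    data: HashMap<String, Value>,
}

impl Document {
    // Create an empty document with a fresh id and both timestamps set to now.
    pub fn new() -> Self {
        let now = SystemTime::now();
        Document {
            id: NEXT_ID.fetch_add(1, Ordering::Relaxed),
            created_at: now,
            updated_at: now,
            data: HashMap::new(),
        }
    }

    pub fn id(&self) -> u64 {
        self.id
    }

    pub fn created_at(&self) -> SystemTime {
        self.created_at
    }

    pub fn updated_at(&self) -> SystemTime {
        self.updated_at
    }

    // Insert or overwrite a field and bump updated_at.
    // Returns the previous value of the field, if there was one.
    pub fn insert(&mut self, key: String, value: Value) -> Option<Value> {
        self.updated_at = SystemTime::now();
        self.data.insert(key, value)
    }

    // Look up a field by name.
    pub fn get(&self, key: &str) -> Option<&Value> {
        self.data.get(key)
    }
}
```

With this in place, `doc.insert("age".to_string(), Value::Integer(30))` stores a field and `doc.get("age")` reads it back. Keeping `data` private and going through `insert` is what lets the document keep `updated_at` honest.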


So now we know what our data is going to look like once it gets stored, but how to store it is the real challenge. Remember, the most important job of a database is saving something and retrieving it again. We judge a database's performance by its read and write speed, so how you store your data is absolutely crucial.

I took a very simple approach here: I save the data to disk and keep a cache in RAM for faster retrieval.

#[derive(Debug, Clone, Serialize, Deserialize)]
struct BlockMetadata {
    id: String,
    size: usize,
    next_block: Option<u64>,
}

#[derive(Debug, Clone, Serialize, Deserialize)]
struct Block {
    metadata: BlockMetadata,
    data: Vec<u8>,
}
// we use blocks to store the data on disk

struct DiskManager {
    file: File,
    free_blocks: VecDeque<u64>,
}

struct Cache {
    blocks: HashMap<u64, Block>,
    lru: VecDeque<u64>,
}

//  A deque tracking the least recently used (LRU) blocks for cache eviction.

pub struct StorageEngine {
    disk: DiskManager,
    cache: Cache,
    index: HashMap<String, u64>, // Document ID to first block number
}
// StorageEngine is the core structure that ties everything together
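
To make the `HashMap` plus `VecDeque` idea a bit more concrete, here is a minimal sketch of how LRU eviction over those two fields could work. It is an illustration under my own assumptions, not the code from the repo: `Block` is reduced to raw bytes, and the `capacity` field and the `touch`, `get` and `put` methods are hypothetical names.

```rust
use std::collections::{HashMap, VecDeque};

// Simplified stand-in for the Block struct: just the raw bytes.
type Block = Vec<u8>;

struct Cache {
    capacity: usize,             // assumed field: max number of cached blocks
    blocks: HashMap<u64, Block>, // block number -> block
    lru: VecDeque<u64>,          // front = least recently used, back = most recent
}

impl Cache {
    fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "cache needs room for at least one block");
        Cache {
            capacity,
            blocks: HashMap::new(),
            lru: VecDeque::new(),
        }
    }

    // Mark a block as most recently used by moving it to the back of the deque.
    fn touch(&mut self, block_num: u64) {
        self.lru.retain(|&n| n != block_num);
        self.lru.push_back(block_num);
    }

    // Return a cached block, refreshing its position in the LRU order.
    fn get(&mut self, block_num: u64) -> Option<&Block> {
        if self.blocks.contains_key(&block_num) {
            self.touch(block_num);
        }
        self.blocks.get(&block_num)
    }

    // Cache a block. If the cache is full and the block is new, the least
    // recently used block is dropped first. Returns the evicted block number.
    fn put(&mut self, block_num: u64, block: Block) -> Option<u64> {
        let mut evicted = None;
        if !self.blocks.contains_key(&block_num) && self.blocks.len() >= self.capacity {
            if let Some(old) = self.lru.pop_front() {
                self.blocks.remove(&old);
                evicted = Some(old);
            }
        }
        self.blocks.insert(block_num, block);
        self.touch(block_num);
        evicted
    }
}
```

Note that `touch` scans the whole deque, so every access is O(n) in the number of cached blocks. That is fine for a small cache like this one; a larger cache would probably want a linked-list based LRU instead.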


Now we know how to store our data and what it looks like once stored. Next we need to access the data, and for that we'll create a query engine of our own. I thought about creating a new template language for accessing the data, but then I figured we are well past the days of useless query template languages like the AWS Velocity Template Language. Honestly, who thought that was a good idea?

Anyway, for our simple program I just put a bunch of operators in an enum and voilà:

#[derive(Debug, Clone)]
pub enum Operator {
    Eq,
    Ne,
    Gt,
    Lt,
    Gte,
    Lte,
    In,
    Nin,
}
// very basic: equals, not equals, greater than, less than, greater than or equal, less than or equal, in the array, not in the array

#[derive(Debug, Clone)]
pub struct Condition {
    field: String,
    operator: Operator,
    value: Value,
}

#[derive(Debug, Clone)]
pub struct Query {
    conditions: Vec<Condition>,
}
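
The structs above only describe a query; they don't show how one is checked against a document. Here is a minimal sketch of how that evaluation could work. Everything beyond the `Operator`, `Condition` and `Query` shapes is my assumption for the sake of the example: `Value` is cut down to three variants, the fields are made `pub` so the snippet stands alone, the `compare`, `in_list` and `matches` helpers are hypothetical names, all conditions are combined with AND, and a missing field only satisfies `Ne` and `Nin`.

```rust
use std::cmp::Ordering;
use std::collections::HashMap;

// Cut-down stand-in for the Value enum.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Integer(i64),
    String(String),
    Array(Vec<Value>),
}

#[derive(Debug, Clone)]
pub enum Operator { Eq, Ne, Gt, Lt, Gte, Lte, In, Nin }

#[derive(Debug, Clone)]
pub struct Condition {
    pub field: String,
    pub operator: Operator,
    pub value: Value,
}

#[derive(Debug, Clone)]
pub struct Query {
    pub conditions: Vec<Condition>,
}

// Order two values only when they are the same comparable type.
fn compare(a: &Value, b: &Value) -> Option<Ordering> {
    match (a, b) {
        (Value::Integer(x), Value::Integer(y)) => Some(x.cmp(y)),
        (Value::String(x), Value::String(y)) => Some(x.cmp(y)),
        _ => None,
    }
}

// True when `needle` is an element of `list`, which must be a Value::Array.
fn in_list(needle: &Value, list: &Value) -> bool {
    match list {
        Value::Array(items) => items.contains(needle),
        _ => false,
    }
}

impl Condition {
    // `doc` is the document's field map (the `data` field of Document).
    pub fn matches(&self, doc: &HashMap<String, Value>) -> bool {
        let actual = match doc.get(&self.field) {
            Some(v) => v,
            // Assumption: a missing field only satisfies Ne and Nin.
            None => return matches!(self.operator, Operator::Ne | Operator::Nin),
        };
        let ord = compare(actual, &self.value);
        match self.operator {
            Operator::Eq => actual == &self.value,
            Operator::Ne => actual != &self.value,
            Operator::Gt => ord == Some(Ordering::Greater),
            Operator::Lt => ord == Some(Ordering::Less),
            Operator::Gte => matches!(ord, Some(Ordering::Greater | Ordering::Equal)),
            Operator::Lte => matches!(ord, Some(Ordering::Less | Ordering::Equal)),
            Operator::In => in_list(actual, &self.value),
            Operator::Nin => !in_list(actual, &self.value),
        }
    }
}

impl Query {
    // Assumption: every condition must hold (implicit AND).
    pub fn matches(&self, doc: &HashMap<String, Value>) -> bool {
        self.conditions.iter().all(|c| c.matches(doc))
    }
}
```

A query with the conditions `age Gte 30` and `city In ["New York", "Boston"]` would then match a document like Alice's from the test data below and reject Bob's. Comparing values of different types (say an integer against a string) returns `None` from `compare`, so the ordering operators simply fail instead of panicking.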

Now for the final part: testing out your own DB. I'm not gonna lie, it's not a great experience yet. I still haven't created a library for talking to this DB easily, so you have to set some stuff up yourself.

First, create a function for initializing a new document:

fn create_test_document(name: &str, age: i64, city: &str) -> Document {
    let mut doc = Document::new();
    doc.insert("name".to_string(), Value::String(name.to_string()));
    doc.insert("age".to_string(), Value::Integer(age));
    doc.insert("city".to_string(), Value::String(city.to_string()));
    doc
}


Then you can use this function to add as many documents as you want:


        let db_path = Path::new("test.bin");
        let mut storage = StorageEngine::new(db_path).unwrap();

        let doc1 = create_test_document("Alice", 30, "New York");
        let doc2 = create_test_document("Bob", 25, "San Francisco");
        let doc3 = create_test_document("Charlie", 35, "New York");

        storage
            .write(
                doc1.id().to_string().as_str(),
                &serde_json::to_vec(&doc1).unwrap(),
            )
            .expect("Failed to write doc1");
        storage
            .write(
                doc2.id().to_string().as_str(),
                &serde_json::to_vec(&doc2).unwrap(),
            )
            .expect("Failed to write doc2");
        storage
            .write(
                doc3.id().to_string().as_str(),
                &serde_json::to_vec(&doc3).unwrap(),
            )
            .expect("Failed to write doc3");


Judging by the performance test I ran on my own computer, I'm hopeful this can be an actual thing someday. Even though I only created this as a fun project, I still have many things planned for it, so if you are interested and want to be a part of the project, just open a PR on GitHub.

This was made solely for learning and understanding, so if you have any suggestions or complaints, you can raise them on GitHub.

Here is the GitHub project link:
GitHub

Thank you for sticking around to the very end.