Kannav Sethi

Posted on Oct 10, 2024

My Hacktoberfest Contribution To DiceDB

#database #redis #opensource #hacktoberfest

This week I had the opportunity to contribute to DiceDB, a drop-in replacement for Redis that also reports faster benchmark results.

Background

Redis is an open-source, in-memory database that can be used as a data store, a cache, or a message broker.

DiceDB improves on Redis's benchmarks and differs from it in two ways:

It is multi-threaded
It listens for SQL queries and notifies the client of changes as soon as possible

My Contribution

Issue

I picked up an issue about auditing the documentation for the PFMERGE command.

Briefly, the issue asked for the following:

The documentation for the PFMERGE command might have become stale, so audit it and fix it
As DiceDB is a drop-in replacement for Redis, check whether the command's behavior matches what the Redis documentation describes
Make the documentation for the command consistent with the newly proposed format: use headers appropriately, add terminal examples, and use a proper table format for arguments and for return and error types

My Approach

I've been wanting to learn Redis for a while but had never taken the time to do so. This issue helped me learn more about how Redis operates and how DiceDB differs from it.

To get acquainted with the technology, I went to the official Redis documentation, read through the data types and the quickstart guide, and then ran Redis in a Docker container on my local machine to experiment with it.

Once I was done exploring Redis, I moved on to getting DiceDB working on my machine. I ran most of my DiceDB commands through the redis-cli tool, which let me connect to the Docker instance of DiceDB running locally.

`PFMERGE`

I was already acquainted with common data types such as JSON, Strings, Sets and Lists.

This time I came across a new data type, HyperLogLog, a probabilistic data structure that estimates the cardinality of a set, that is, the number of distinct elements added to it.

Commands that start with PF deal with the HyperLogLog data structure. My task was to see how PFMERGE behaves, what output it returns, and how it handles errors.

Some of the examples I ran were:

127.0.0.1:7379> PFADD hll1 "a" "b" "c"
(integer) 1
127.0.0.1:7379> PFADD hll2 "c" "d" "e"
(integer) 1
127.0.0.1:7379> PFADD hll3 "e" "f" "g"
(integer) 1
127.0.0.1:7379> PFMERGE hll_merged hll1 hll2 hll3
OK
127.0.0.1:7379> PFCOUNT hll_merged
(integer) 7

127.0.0.1:7379> PFADD hll_merged "x" "y" "z"
(integer) 1
127.0.0.1:7379> PFMERGE hll_merged hll1 hll2 hll3
OK
127.0.0.1:7379> PFCOUNT hll_merged
(integer) 7

127.0.0.1:7379> PFMERGE hll_merged hll1 hll2 non_existent_key
OK
127.0.0.1:7379> PFCOUNT hll_merged
(integer) 5

All of these examples gave me an overview of how the command works.
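
For intuition about why merging hll1, hll2 and hll3 into a fresh key reports roughly 7 distinct elements, here is a minimal Python sketch of the general HyperLogLog idea. It is a textbook-style illustration and not DiceDB's or Redis's actual implementation: the class name HyperLogLog, the helper build, the use of SHA-1 as the hash, and the 2^14 registers are all assumptions made for this example. A merge is simply a register-by-register maximum, which is why shared elements such as "c" and "e" are not counted twice.

```python
import hashlib
import math


class HyperLogLog:
    """Illustrative HyperLogLog (hypothetical class, not DiceDB's code)."""

    def __init__(self, p=14):
        self.p = p                  # number of index bits (assumed value)
        self.m = 1 << p             # number of registers
        self.registers = [0] * self.m

    def add(self, item):
        # Take a 64-bit hash of the item (SHA-1 is an arbitrary choice here).
        h = int.from_bytes(hashlib.sha1(item.encode()).digest()[:8], "big")
        idx = h >> (64 - self.p)                  # first p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)     # remaining bits
        # Rank = position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[idx] = max(self.registers[idx], rank)

    def merge(self, *others):
        # Merging keeps the maximum of each register across all sketches.
        for other in others:
            self.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]

    def count(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)
        raw = alpha * self.m ** 2 / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        # Small-range correction (linear counting) when many registers are empty.
        if raw <= 2.5 * self.m and zeros:
            return round(self.m * math.log(self.m / zeros))
        return round(raw)


def build(*items):
    """Hypothetical helper: create a sketch and add every item to it."""
    hll = HyperLogLog()
    for item in items:
        hll.add(item)
    return hll


hll1 = build("a", "b", "c")
hll2 = build("c", "d", "e")
hll3 = build("e", "f", "g")

merged = HyperLogLog()
merged.merge(hll1, hll2, hll3)
print(merged.count())  # an estimate; expected to be close to 7
```

Because the result is an estimate, the printed number should be treated as approximate, although for such small sets it is normally very close to the true count.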

Making Changes

Once I was done with the prerequisites, I went on to audit the documentation. My changes were as follows:

Updated the terminal examples to show the correct port number used to access the Docker instance
Added another example to the example usage section, demonstrating invalid usage of the command
Converted the Return Values and Parameters sections to a table format
Modified the documented expected behavior to match how the command actually works

A full description of these changes can be found in my PR.

Conclusion

Contributing to DiceDB this week gave me valuable insight into in-memory databases and the HyperLogLog data structure. By auditing and updating the documentation for the PFMERGE command, I helped keep DiceDB's documentation accurate and user-friendly.
