This week I had the opportunity to work on DiceDB, a drop-in replacement for Redis that is also much faster.
Background
Redis is an open-source, in-memory database that can be used for storing data, for caching, or as a message broker.
DiceDB improves on Redis's benchmarks and also differs from it in two ways:
- It is multi-threaded
- It listens to SQL queries and notifies clients of changes as soon as they happen
My Contribution
Issue
I picked up an issue about auditing the documentation for the PFMERGE command.
To briefly describe the issue:
- The documentation for the PFMERGE command might have become stale; audit and fix it
- As the tool is a drop-in replacement for Redis, check whether the functionality of the command matches the functionality described in the Redis documentation
- Make the documentation for the command consistent with the newly proposed format: make appropriate use of headers, add terminal examples, and use a proper table format for arguments and error output types
My Approach
I've been wanting to learn Redis for a while but had never taken the time to do so. This issue helped me learn more about how Redis operates and how DiceDB differs from it.
To get acquainted with the technology, I read through the official Redis documentation, including the data types and the quickstart guide, and then ran Redis in a Docker container on my local machine to experiment with it.
After exploring Redis, I moved on to getting DiceDB working on my machine. I ran most of the DiceDB commands through the redis-cli tool, which let me connect to the DiceDB instance running in Docker on my machine.
PFMERGE
I was already acquainted with common data types like JSON, Strings, Sets, and Lists.
But this time I came across a new data type, HyperLogLog, a probabilistic data structure that estimates the cardinality of a set, that is, the number of distinct elements in it.
Any command that starts with PF deals with the HyperLogLog data structure. My task was to see how PFMERGE behaves: what output it gives and how it handles errors.
Some of the examples I ran were:
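To make the idea of a probabilistic cardinality estimate more concrete, here is a minimal Python sketch of a HyperLogLog. It is a toy for illustration only, not the implementation used by Redis or DiceDB: the class name `TinyHLL`, the choice of SHA-1 as the hash, and the precision `p=12` are all my own assumptions. The point it tries to show is that merging two HyperLogLogs is just a register-by-register maximum, which is why a merge can behave like a set union without storing the elements.

```python
import hashlib
import math


class TinyHLL:
    """Toy HyperLogLog (hypothetical, for illustration): 2**p small registers."""

    def __init__(self, p=12):
        self.p = p
        self.m = 1 << p                      # number of registers
        self.registers = [0] * self.m

    def add(self, item):
        # Derive a 64-bit hash from the item (SHA-1 is an arbitrary choice here).
        h = int.from_bytes(hashlib.sha1(item.encode()).digest()[:8], "big")
        idx = h >> (64 - self.p)             # top p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)
        # Rank = 1-based position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[idx] = max(self.registers[idx], rank)

    def count(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)    # bias constant for m >= 128
        raw = alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction: fall back to linear counting.
            return round(self.m * math.log(self.m / zeros))
        return round(raw)

    def merge(self, *others):
        # The union of the underlying sets is the element-wise max of registers.
        for other in others:
            self.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]


if __name__ == "__main__":
    hll1, hll2, hll3 = TinyHLL(), TinyHLL(), TinyHLL()
    for hll, items in ((hll1, "abc"), (hll2, "cde"), (hll3, "efg")):
        for item in items:
            hll.add(item)
    merged = TinyHLL()
    merged.merge(hll1, hll2, hll3)
    print(merged.count())   # an estimate of the 7 distinct elements, not an exact count
```

Because the registers only ever keep a maximum, adding the same element twice or merging overlapping sets never inflates the estimate, at the cost of a small, bounded error.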
127.0.0.1:7379> PFADD hll1 "a" "b" "c"
(integer) 1
127.0.0.1:7379> PFADD hll2 "c" "d" "e"
(integer) 1
127.0.0.1:7379> PFADD hll3 "e" "f" "g"
(integer) 1
127.0.0.1:7379> PFMERGE hll_merged hll1 hll2 hll3
OK
127.0.0.1:7379> PFCOUNT hll_merged
(integer) 7
127.0.0.1:7379> PFADD hll_merged "x" "y" "z"
(integer) 1
127.0.0.1:7379> PFMERGE hll_merged hll1 hll2 hll3
OK
127.0.0.1:7379> PFCOUNT hll_merged
(integer) 7
127.0.0.1:7379> PFMERGE hll_merged hll1 hll2 non_existent_key
OK
127.0.0.1:7379> PFCOUNT hll_merged
(integer) 5
All of these examples gave me an overview of how the command works.
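One detail in the session above is worth spelling out: after adding "x", "y", and "z" to hll_merged and merging again, PFCOUNT went back to 7, and after merging with a non-existent key it dropped to 5. That is what you would see if the destination were overwritten with the union of the source keys only. The Redis documentation, by contrast, describes an existing destination as being treated as one of the source sets. The sketch below replays the same steps under both models, using exact Python sets as a stand-in for HyperLogLogs. The function names are hypothetical and this is only a model of the observed output, not DiceDB's or Redis's actual code.

```python
def pfmerge_overwrite(store, dest, *sources):
    """Model matching the session above: dest becomes the union of the sources only."""
    store[dest] = set().union(*(store.get(key, set()) for key in sources))


def pfmerge_include_dest(store, dest, *sources):
    """Model following the Redis docs: an existing dest counts as one more source."""
    store[dest] = set().union(
        store.get(dest, set()),
        *(store.get(key, set()) for key in sources),
    )


def replay(pfmerge):
    """Replay the PFMERGE / PFCOUNT steps from the session and collect the counts."""
    store = {"hll1": set("abc"), "hll2": set("cde"), "hll3": set("efg")}
    counts = []

    pfmerge(store, "hll_merged", "hll1", "hll2", "hll3")
    counts.append(len(store["hll_merged"]))

    store["hll_merged"] |= set("xyz")            # PFADD hll_merged "x" "y" "z"
    pfmerge(store, "hll_merged", "hll1", "hll2", "hll3")
    counts.append(len(store["hll_merged"]))

    # A missing key is treated as an empty set in both models.
    pfmerge(store, "hll_merged", "hll1", "hll2", "non_existent_key")
    counts.append(len(store["hll_merged"]))
    return counts


print(replay(pfmerge_overwrite))      # [7, 7, 5], the counts seen in the session
print(replay(pfmerge_include_dest))   # [7, 10, 10]
```

Differences like this one are exactly what the audit was meant to surface, so that the documented behavior matches what the command actually does.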
Making Changes
Once I was done with the prerequisites, I went on to audit the documentation. My changes were as follows:
- Changed the terminal examples to reflect the correct port number used to access the Docker instance
- Added another example to the example usage section, demonstrating invalid usage of the command
- Converted the formatting of the Return Values and Parameters sections to a table format
- Modified the expected behavior to match how the command actually works
A full descriptive view of my PR can be found here
Conclusion
Contributing to DiceDB this week gave me valuable insight into in-memory databases and the HyperLogLog data structure. By auditing and updating the documentation for the PFMERGE command, I helped keep DiceDB's documentation accurate and user-friendly.