<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Dragonfly</title>
    <description>The latest articles on DEV Community by Dragonfly (@dragonflydbio).</description>
    <link>https://dev.to/dragonflydbio</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1080697%2Fb493ba01-6546-42f8-af16-c09cc4b6a948.jpg</url>
      <title>DEV Community: Dragonfly</title>
      <link>https://dev.to/dragonflydbio</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/dragonflydbio"/>
    <language>en</language>
    <item>
      <title>Dragonfly Cloud: Now Available in AWS Marketplace</title>
      <dc:creator>Dragonfly</dc:creator>
      <pubDate>Tue, 26 Nov 2024 16:44:25 +0000</pubDate>
      <link>https://dev.to/dragonflydbio/dragonfly-cloud-now-available-in-aws-marketplace-2o8h</link>
      <guid>https://dev.to/dragonflydbio/dragonfly-cloud-now-available-in-aws-marketplace-2o8h</guid>
      <description>&lt;p&gt;Today we're excited to announce that Dragonfly Cloud is &lt;a href="https://aws.amazon.com/marketplace/pp/prodview-evf32lpo652gq" rel="noopener noreferrer"&gt;available in the AWS Marketplace&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Dragonfly Cloud has always enabled customers to deploy data stores in their preferred AWS region and connect to existing AWS infrastructure using VPC peering. Now, you can streamline billing by purchasing Dragonfly Cloud in-memory data stores directly through the AWS Marketplace. Any Dragonfly Cloud usage will be included in your monthly AWS bill, allowing you to use your AWS account credits as well.&lt;/p&gt;

&lt;p&gt;To purchase Dragonfly Cloud through the AWS Marketplace, simply visit the &lt;a href="https://aws.amazon.com/marketplace/pp/prodview-evf32lpo652gq?sr=0-1&amp;amp;ref_=beagle&amp;amp;applicationId=AWSMPContessa" rel="noopener noreferrer"&gt;listing page&lt;/a&gt; and click 'Subscribe'. You will be redirected to create an account on &lt;a href="https://dragonflydb.cloud/" rel="noopener noreferrer"&gt;Dragonfly Cloud&lt;/a&gt;, and once that is complete, all billing will be integrated with the rest of your monthly AWS bill.&lt;/p&gt;

&lt;p&gt;Dragonfly Cloud pricing in the AWS Marketplace is the same as our &lt;a href="https://www.dragonflydb.io/pricing" rel="noopener noreferrer"&gt;standard pricing&lt;/a&gt;.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Case Study: Migrating from Redis to Dragonfly to Scale IoT Infrastructure</title>
      <dc:creator>Dragonfly</dc:creator>
      <pubDate>Wed, 09 Oct 2024 08:56:19 +0000</pubDate>
      <link>https://dev.to/dragonflydbio/case-study-migrating-from-redis-to-dragonfly-to-scale-iot-infrastructure-511e</link>
      <guid>https://dev.to/dragonflydbio/case-study-migrating-from-redis-to-dragonfly-to-scale-iot-infrastructure-511e</guid>
      <description>&lt;h2&gt;
  
  
  Intro
&lt;/h2&gt;

&lt;p&gt;Over the past few decades, globalization has brought societies together faster than at any other point in history.&lt;br&gt;
Social media, of course, has played a starring role in this process.&lt;br&gt;
But even before the internet seeped into most aspects of our lives, the major driving force behind globalization had already taken hold: transportation.&lt;/p&gt;

&lt;p&gt;While the internet has allowed us to seamlessly transport ideas, we still rely heavily on vehicles for people and goods.&lt;br&gt;
This is where &lt;a href="https://www.linkedin.com/company/smartgpsbr/" rel="noopener noreferrer"&gt;SmartGPS&lt;/a&gt; steps up as the fastest-growing vehicle tracking and IoT startup on the market.&lt;br&gt;
With over 300,000 devices connected in Brazil and six other Latin American countries so far, SmartGPS is building a dynamic ecosystem that delivers real-time tracking and data insights to clients.&lt;br&gt;
As they amass larger and larger quantities of data, they hope to contribute to sustainable mobility, helping cities evolve intelligently and reduce traffic congestion and pollution.&lt;/p&gt;

&lt;p&gt;To do this well, SmartGPS needs to be able to quickly process large volumes of data in real time.&lt;br&gt;
To reduce the load on the primary database, retrieve data quickly, and increase throughput under highly concurrent requests,&lt;br&gt;
it was clear to SmartGPS CTO Eduardo Malpeli that a strong in-memory data store would be needed.&lt;br&gt;
While he and his team started with Redis Labs, he quickly realized as their business scaled that the existing solution was limiting their growth.&lt;br&gt;
After evaluating Dragonfly, they discovered that it removed the limitations traditionally associated with in-memory data stores.&lt;/p&gt;

&lt;h2&gt;
  
  
  Redis Provided Speed But Failed to Scale
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;Redis was not as performant as we thought it would be.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;According to Malpeli, SmartGPS needed a trustworthy in-memory data store because of how their service records, stores, and flushes data.&lt;br&gt;
Providing real-time tracking and insights requires them to collect, record, and key the data coming from vehicles so that it stays in order and ready to retrieve instantly.&lt;/p&gt;

&lt;p&gt;While they do occasionally flush this data to a primary database, storing to and retrieving from that database for every action would not give them the speed they need to serve their customers.&lt;/p&gt;

&lt;p&gt;It made sense, therefore, to use an in-memory data store as part of their data infrastructure stack.&lt;br&gt;
SmartGPS went with Redis, known for its incredible speed, via Redis Cloud as their primary in-memory data store.&lt;/p&gt;

&lt;h3&gt;
  
  
  Redis Struggled to Keep Up During Peak Traffic Hours
&lt;/h3&gt;

&lt;p&gt;Unfortunately, while Redis provided the speed Malpeli and his team needed, it proved to impede growth. "Redis was not as performant as we thought it would be," he said. As SmartGPS' customer base grew, the volume of data soared and their traffic patterns became increasingly erratic; Redis struggled to keep up during peak traffic hours, causing performance hiccups.&lt;/p&gt;

&lt;p&gt;After this caused a significant amount of data loss, Malpeli knew SmartGPS needed to find an in-memory data store alternative that could provide the speed of Redis as well as the scale they required.&lt;/p&gt;

&lt;h2&gt;
  
  
  Dragonfly Was the Better Choice
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;With Dragonfly, our service was kept up and running smoothly!&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;In need of an alternative, the SmartGPS team evaluated several options, Dragonfly among them. "Dragonfly was the best because it is compatible with Redis and is the most mature product among the choices," Malpeli said.&lt;/p&gt;

&lt;p&gt;The easy migration from Redis and the trustworthiness of the technology, along with the promise of better scale, meant that they had found their alternative.&lt;/p&gt;

&lt;p&gt;During the evaluation process, SmartGPS conducted performance benchmarks by replicating their traffic patterns in a &lt;a href="https://dragonflydb.cloud/" rel="noopener noreferrer"&gt;Dragonfly Cloud&lt;/a&gt; trial account, ensuring the solution could handle their data traffic with ease.&lt;/p&gt;

&lt;p&gt;The result was impressive—Dragonfly met and exceeded expectations, enabling SmartGPS to manage their peak traffic loads effectively without compromising performance or data integrity.&lt;br&gt;
"With Dragonfly, our service was kept up and running smoothly!"&lt;/p&gt;

&lt;h3&gt;
  
  
  Changing the Data Store Endpoint
&lt;/h3&gt;

&lt;p&gt;Once they felt comfortable with the stability of their data, they decided to complete the migration and move Dragonfly into production. Malpeli mentioned specifically how easy the process was. They "...didn't even have to change any code on our end!"&lt;br&gt;
All they needed to do was change the endpoint that their production code was pointing to, "...and it just worked!"&lt;/p&gt;

&lt;p&gt;Malpeli mentioned that the two biggest selling points while evaluating Dragonfly Cloud were compatibility and reliability.&lt;br&gt;
They were able to seamlessly integrate Dragonfly with their existing Redis APIs, providing a familiar framework for their team of developers.&lt;/p&gt;

&lt;p&gt;Their own service's reliability increased dramatically. "With Redis, we would have hiccups that dramatically impacted performance every week or two, we've had none since we migrated to Dragonfly."&lt;/p&gt;

&lt;h2&gt;
  
  
  Success with Dragonfly
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;We experimented with multiple options, but nothing came close to Dragonfly.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Before migrating to Dragonfly, SmartGPS had resorted to throttling new customer onboarding because their existing infrastructure struggled to handle all the new devices.&lt;/p&gt;

&lt;p&gt;Now, with Dragonfly supporting speed and scale, they are able to onboard customers confidently and grow without fear of their service going down.&lt;/p&gt;

&lt;p&gt;Malpeli left us with, "I would encourage any developer exploring alternatives to Redis to give Dragonfly a try (&lt;a href="https://dragonflydb.cloud/" rel="noopener noreferrer"&gt;https://dragonflydb.cloud/&lt;/a&gt;). Test it, set your own benchmarks, and see for yourself. We experimented with multiple options, but nothing came close to Dragonfly.&lt;/p&gt;

&lt;p&gt;The ease of implementation and performance demonstrated by Dragonfly were unparalleled."&lt;/p&gt;

</description>
      <category>casestudy</category>
      <category>migration</category>
      <category>iot</category>
      <category>redis</category>
    </item>
    <item>
      <title>A Preview of Dragonfly Cluster</title>
      <dc:creator>Dragonfly</dc:creator>
      <pubDate>Tue, 08 Oct 2024 19:54:36 +0000</pubDate>
      <link>https://dev.to/dragonflydbio/bitmaps-in-dragonfly-compact-data-with-powerful-analytics-4j4d</link>
      <guid>https://dev.to/dragonflydbio/bitmaps-in-dragonfly-compact-data-with-powerful-analytics-4j4d</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Dragonfly excels at high performance and vertical scaling, making it a top choice for demanding modern data workloads.&lt;br&gt;
Soon, Dragonfly Cluster&lt;sup id="fnref1"&gt;1&lt;/sup&gt; will offer horizontal scalability as well, expanding its capabilities even further.&lt;br&gt;
In this article, I want to show you how to run Dragonfly Cluster and provide a concise overview of its internal processes.&lt;/p&gt;


&lt;h2&gt;
  
  
  Dragonfly Cluster Overview
&lt;/h2&gt;

&lt;p&gt;Like &lt;a href="https://redis.io/docs/latest/operate/oss_and_stack/management/scaling/" rel="noopener noreferrer"&gt;Redis Cluster&lt;/a&gt;, Dragonfly Cluster achieves horizontal scalability through sharding.&lt;br&gt;
A cluster comprises one or more shards, each consisting of a master node (primary) and zero or more replicas.&lt;br&gt;
Data is distributed across shards using a slot-based approach.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The hashing space in the cluster is divided into 16,384 slots.&lt;/li&gt;
&lt;li&gt;Each key is hashed into a slot. The hash slot is computed by applying &lt;a href="https://en.wikipedia.org/wiki/Cyclic_redundancy_check" rel="noopener noreferrer"&gt;the CRC16 algorithm&lt;/a&gt; to the key name and taking the result modulo 16,384.&lt;/li&gt;
&lt;li&gt;Each node in a cluster is responsible for a subset of these slots.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="/assets/blog/a-preview-of-dragonfly-cluster/dragonfly-cluster.png" class="article-body-image-wrapper"&gt;&lt;img src="/assets/blog/a-preview-of-dragonfly-cluster/dragonfly-cluster.png"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Dragonfly Cluster supports dynamic rebalancing without downtime.&lt;br&gt;
Hash slots can be seamlessly migrated from one node to another, allowing for the addition, removal, or resizing of nodes without interrupting operations.&lt;/p&gt;

&lt;p&gt;Dragonfly Cluster supports multi-key operations as long as all involved keys (in a multi-key command, in a transaction, or in a Lua script) reside within the same hash slot.&lt;br&gt;
To ensure this, Dragonfly employs hash tags.&lt;br&gt;
With hash tags, the system calculates the hash slot based solely on the content within curly braces &lt;code&gt;{}&lt;/code&gt; of a key.&lt;br&gt;
This mechanism allows users to explicitly control key distribution across shards.&lt;/p&gt;
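&lt;p&gt;To make the slot and hash tag rules above concrete, here is a minimal Python sketch of the computation. Dragonfly follows the Redis Cluster convention, so this mirrors the published Redis algorithm (CRC16, XModem variant, modulo 16,384); treat it as an illustration rather than Dragonfly's actual implementation.&lt;/p&gt;

```python
def crc16(data: bytes) -> int:
    # CRC-16/XMODEM: polynomial 0x1021, initial value 0, no reflection.
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc

def key_hash_slot(key: str) -> int:
    # If the key contains a non-empty {...} section, only the content
    # between the first '{' and the following '}' is hashed.
    start = key.find("{")
    if start != -1:
        end = key.find("}", start + 1)
        if end > start + 1:
            key = key[start + 1:end]
    return crc16(key.encode()) % 16384
```

&lt;p&gt;With this rule, keys like &lt;code&gt;{user1000}.following&lt;/code&gt; and &lt;code&gt;{user1000}.followers&lt;/code&gt; hash to the same slot and can therefore safely appear together in a single multi-key command.&lt;/p&gt;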

&lt;p&gt;If a client requests keys from a node, but those keys belong to a hash slot managed by a different node, the client receives a &lt;code&gt;-MOVED&lt;/code&gt; redirection error.&lt;br&gt;
This ensures that the client can always find the correct node handling the requested keys.&lt;/p&gt;
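&lt;p&gt;Cluster-aware clients handle this redirection transparently by parsing the error and retrying against the indicated node. A tiny, hypothetical Python sketch of parsing such a reply:&lt;/p&gt;

```python
def parse_moved(error: str) -> tuple[int, str, int]:
    """Parse a MOVED redirection error such as 'MOVED 3999 localhost:30002'.

    Returns (slot, host, port) so the client can retry against the right node.
    """
    kind, slot, address = error.split()
    if kind != "MOVED":
        raise ValueError("not a MOVED redirection: " + error)
    host, _, port = address.rpartition(":")
    return int(slot), host, int(port)
```

&lt;p&gt;For example, &lt;code&gt;parse_moved("MOVED 3999 localhost:30002")&lt;/code&gt; returns &lt;code&gt;(3999, "localhost", 30002)&lt;/code&gt;, telling the client which node owns slot 3999.&lt;/p&gt;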

&lt;p&gt;Dragonfly provides only the server, not a control plane to manage cluster deployments.&lt;br&gt;
Node health monitoring, automatic failovers, and slot redistribution are outside the scope of the Dragonfly backend&lt;br&gt;
and will be provided as part of the &lt;a href="https://dragonflydb.cloud/" rel="noopener noreferrer"&gt;Dragonfly Cloud&lt;/a&gt; service.&lt;/p&gt;

&lt;p&gt;Dragonfly Cluster offers seamless migration for existing Redis Cluster clients.&lt;br&gt;
It fully adheres to Redis Cluster's client-facing behavior, ensuring zero code changes for applications.&lt;br&gt;
However, Dragonfly takes a fundamentally different approach to cluster management.&lt;/p&gt;

&lt;p&gt;Unlike Redis Cluster's distributed consensus model, Dragonfly adopts a centralized management strategy.&lt;br&gt;
Nodes operate independently, without direct communication or shared state.&lt;br&gt;
This design choice provides a single source of truth and enhances simplicity, reliability, and performance.&lt;/p&gt;


&lt;h2&gt;
  
  
  Cluster Modes
&lt;/h2&gt;

&lt;p&gt;Dragonfly has two cluster modes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Emulated Cluster Mode&lt;/strong&gt; (which can be enabled by &lt;code&gt;--cluster_mode=emulated&lt;/code&gt;) is fully compatible with the stand-alone mode,&lt;br&gt;
supporting &lt;a href="https://www.dragonflydb.io/docs/command-reference/server-management/select" rel="noopener noreferrer"&gt;&lt;code&gt;SELECT&lt;/code&gt;&lt;/a&gt; and multi-key operations while also providing cluster commands like &lt;code&gt;CLUSTER SHARDS&lt;/code&gt;.&lt;br&gt;
It functions as a single-node Dragonfly instance and does not include horizontal scaling, resharding, or certain advanced cluster features.&lt;br&gt;
This mode is ideal for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Development &amp;amp; Testing Environments:&lt;/strong&gt; Provide a simplified setup for rapid iteration.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Migration Phases:&lt;/strong&gt; Serve as an interim solution when transitioning from a stand-alone to a clustered setup.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Resource-Constrained Scenarios:&lt;/strong&gt; A Dragonfly instance in emulated cluster mode can optimize resource utilization by acting as a replica for multiple shards, allowing a single node to replicate several cluster nodes efficiently.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Multi-Node Cluster Mode&lt;/strong&gt; (which can be enabled by &lt;code&gt;--cluster_mode=yes&lt;/code&gt;) is the Dragonfly Cluster we are talking about in this blog post.&lt;br&gt;
It has certain limitations compared to stand-alone or emulated modes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The &lt;a href="https://www.dragonflydb.io/docs/command-reference/server-management/select" rel="noopener noreferrer"&gt;&lt;code&gt;SELECT&lt;/code&gt;&lt;/a&gt; command is not permitted.&lt;/li&gt;
&lt;li&gt;All keys in a multi-key operation (i.e., multi-key commands, transactions, and Lua scripts) must belong to the same slot. Otherwise, a &lt;code&gt;CROSSSLOT&lt;/code&gt; error is returned.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Dragonfly supports some &lt;code&gt;CLUSTER&lt;/code&gt; commands for compatibility with Redis Cluster clients. These commands primarily provide informational data about the cluster setup:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;CLUSTER HELP&lt;/code&gt;: Lists available CLUSTER commands.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;CLUSTER MYID&lt;/code&gt;: Returns the node ID.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;CLUSTER SHARDS&lt;/code&gt;: Displays information about cluster shards.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;CLUSTER SLOTS&lt;/code&gt;: Lists all slots and their associated nodes.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;CLUSTER NODES&lt;/code&gt;: Shows information about all nodes in the cluster.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;CLUSTER INFO&lt;/code&gt;: Provides general information about the cluster.&lt;/li&gt;
&lt;/ul&gt;


&lt;h2&gt;
  
  
  Dragonfly Cluster Management
&lt;/h2&gt;
&lt;h3&gt;
  
  
  &lt;code&gt;DFLYCLUSTER&lt;/code&gt; Commands
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;DFLYCLUSTER&lt;/code&gt; commands are specific to Dragonfly and offer more advanced cluster management capabilities:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;DFLYCLUSTER CONFIG&lt;/code&gt;: Manages node roles, slot assignments, and migration processes.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;DFLYCLUSTER GETSLOTINFO&lt;/code&gt;: Provides in-depth statistics about slot utilization, including key count, memory usage, and read/write operations.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;DFLYCLUSTER FLUSHSLOTS&lt;/code&gt;: Efficiently clears data from specific slots.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;DFLYCLUSTER SLOT-MIGRATION-STATUS&lt;/code&gt;: Monitors the progress of slot migrations, indicating the current state and completion status.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  Cluster Creation
&lt;/h3&gt;

&lt;p&gt;To build a Dragonfly cluster, we start by launching two separate Dragonfly instances in cluster mode.&lt;br&gt;
Each instance will have a unique ID.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$&amp;gt;&lt;/span&gt; ./dragonfly &lt;span class="nt"&gt;--cluster_mode&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;yes&lt;/span&gt; &lt;span class="nt"&gt;--admin_port&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;31001 &lt;span class="nt"&gt;--port&lt;/span&gt; 30001
&lt;span class="nv"&gt;$&amp;gt;&lt;/span&gt; ./dragonfly &lt;span class="nt"&gt;--cluster_mode&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;yes&lt;/span&gt; &lt;span class="nt"&gt;--admin_port&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;31002 &lt;span class="nt"&gt;--port&lt;/span&gt; 30002
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once the instances are running, we need to retrieve their unique IDs by using the following commands:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$&amp;gt;&lt;/span&gt; redis-cli &lt;span class="nt"&gt;-p&lt;/span&gt; 31001 CLUSTER MYID
&lt;span class="nv"&gt;$&amp;gt;&lt;/span&gt; redis-cli &lt;span class="nt"&gt;-p&lt;/span&gt; 31002 CLUSTER MYID
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In my case, the two unique IDs have prefixes &lt;code&gt;97486c...&lt;/code&gt; and &lt;code&gt;728cf2...&lt;/code&gt; respectively.&lt;br&gt;
Now we can create a cluster config in JSON format (as a string) by plugging in the unique IDs and IP addresses of the two nodes above.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"slot_ranges"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"start"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"end"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;999&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"master"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"97486c9d7e0507e1edb2dfba4655224d5b61c5e2"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"ip"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"localhost"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"port"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;30001&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"replicas"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"slot_ranges"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"start"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"end"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;16383&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"master"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"728cf25ecd4d1230805754ff98939321d72d23ef"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"ip"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"localhost"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"port"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;30002&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"replicas"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Using the &lt;code&gt;DFLYCLUSTER CONFIG&lt;/code&gt; command, we send this JSON string to both nodes, and the Dragonfly Cluster is created.&lt;/p&gt;
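&lt;p&gt;As a sketch of what this looks like programmatically: the snippet below builds the same configuration in Python and (hypothetically, assuming the redis-py client, the example node IDs, and the admin ports used in this walkthrough) pushes it to both nodes.&lt;/p&gt;

```python
import json

def make_shard(node_id: str, port: int, start: int, end: int) -> dict:
    # One shard entry: a master covering slots [start, end], no replicas yet.
    return {
        "slot_ranges": [{"start": start, "end": end}],
        "master": {"id": node_id, "ip": "localhost", "port": port},
        "replicas": [],
    }

# The same config string shown above, built programmatically.
config = json.dumps([
    make_shard("97486c9d7e0507e1edb2dfba4655224d5b61c5e2", 30001, 0, 999),
    make_shard("728cf25ecd4d1230805754ff98939321d72d23ef", 30002, 1000, 16383),
])

if __name__ == "__main__":
    import redis  # requires redis-py and the two running instances above
    for admin_port in (31001, 31002):
        # Every node in the cluster receives the same configuration.
        redis.Redis(port=admin_port).execute_command(
            "DFLYCLUSTER", "CONFIG", config
        )
```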

&lt;h3&gt;
  
  
  Slot Migration
&lt;/h3&gt;

&lt;p&gt;Slot migration is a critical operation that can potentially lead to data loss if not executed carefully.&lt;br&gt;
It's essential for adjusting cluster configuration to meet changing demands.&lt;br&gt;
Dragonfly supports concurrent slot migrations, but only one migration can be in progress between any two specific nodes at a given time.&lt;br&gt;
This means multiple migrations can be initiated simultaneously across different node pairs within a cluster.&lt;br&gt;
To initiate and complete a slot migration, use the &lt;code&gt;DFLYCLUSTER CONFIG&lt;/code&gt; command, specifying an additional &lt;code&gt;migrations&lt;/code&gt; field in the JSON configuration.&lt;br&gt;
In the example below, I decided to move slots &lt;code&gt;[1000, 8000]&lt;/code&gt; to another node, and here's how the JSON configuration string looks:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"slot_ranges"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"start"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"end"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;999&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"master"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"97486c9d7e0507e1edb2dfba4655224d5b61c5e2"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"ip"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"localhost"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"port"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;30001&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"replicas"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"slot_ranges"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"start"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"end"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;16383&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"master"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"728cf25ecd4d1230805754ff98939321d72d23ef"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"ip"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"localhost"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"port"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;30002&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"replicas"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"migrations"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"slot_ranges"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"start"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"end"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;8000&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"node_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"97486c9d7e0507e1edb2dfba4655224d5b61c5e2"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"ip"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"localhost"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"port"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;31001&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To maintain cluster consistency during slot migrations, the &lt;code&gt;DFLYCLUSTER CONFIG&lt;/code&gt; command is propagated to all nodes, even those not directly involved.&lt;br&gt;
This ensures that all nodes have an up-to-date view of the cluster configuration, preventing inconsistencies that might arise from concurrent migration processes or failures.&lt;br&gt;
To monitor the progress of a slot migration, use the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$&amp;gt;&lt;/span&gt; redis-cli &lt;span class="nt"&gt;-p&lt;/span&gt; 31002 DFLYCLUSTER SLOT-MIGRATION-STATUS
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once the migration status shows &lt;code&gt;FINISHED&lt;/code&gt;, the new cluster configuration (with updated &lt;code&gt;slot_ranges&lt;/code&gt;) can be applied to all nodes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"slot_ranges"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"start"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"end"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;8000&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"master"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"97486c9d7e0507e1edb2dfba4655224d5b61c5e2"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"ip"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"localhost"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"port"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;30001&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"replicas"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"slot_ranges"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"start"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;8001&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"end"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;16383&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"master"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"728cf25ecd4d1230805754ff98939321d72d23ef"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"ip"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"localhost"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"port"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;30002&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"replicas"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Upon applying the new cluster configuration, data from migrated slots is permanently erased from the source node.&lt;br&gt;
Migrations can be canceled by removing the &lt;code&gt;migrations&lt;/code&gt; field from the configuration while preserving slot assignments.&lt;br&gt;
If the updated configuration isn't applied promptly after migration completion, the cluster enters a transitional state where nodes have inconsistent slot information.&lt;br&gt;
Clients may initially be redirected to the source node for migrated slots, but subsequent requests to the source node will be correctly routed to the target node.&lt;br&gt;
Although this temporary inconsistency exists, it doesn't compromise data integrity.&lt;/p&gt;
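To make the transitional behavior concrete, here is a toy Python model (names and structure are purely illustrative, not Dragonfly's actual code) of how a request still lands on the right node even when one node holds a stale slot view:

```python
# Toy model of cluster routing during the transitional window after a
# slot migration finishes but before the new config reaches every node.
# All names here are illustrative; this is not Dragonfly's actual code.

def route(request_slot, node_view, latest_view):
    """Return the node that ultimately serves a request for request_slot.

    node_view:   the (possibly stale) slot->node map held by the node the
                 client contacted first.
    latest_view: the up-to-date slot->node map known to the migration's
                 source node after it handed its slots over.
    """
    first_hop = node_view[request_slot]
    # A stale view may send the client to the old owner; that node then
    # redirects to the real owner, so the request still lands correctly.
    return latest_view.get(request_slot, first_hop)

# Slot 5000 migrated from "node-a" to "node-b" (made-up names).
stale_view = {5000: "node-a"}           # a node that missed the update
latest_view = {5000: "node-b"}          # the post-migration truth

# Temporary inconsistency: the first hop differs, the final owner does not.
assert stale_view[5000] != latest_view[5000]
assert route(5000, stale_view, latest_view) == "node-b"
```

The temporary extra hop costs one redirect, but every request is ultimately served by the correct owner, which is why the inconsistency does not compromise data integrity.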

&lt;h3&gt;
  
  
  Replicas
&lt;/h3&gt;

&lt;p&gt;Dragonfly replication is configured using the standard &lt;a href="https://www.dragonflydb.io/docs/command-reference/server-management/replicaof" rel="noopener noreferrer"&gt;&lt;code&gt;REPLICAOF&lt;/code&gt;&lt;/a&gt; command, identical to non-clustered setups.&lt;br&gt;
Despite replication details being included in the cluster configuration, replicas function independently, copying data directly from the master node.&lt;br&gt;
This means replicas replicate all data from the master node, regardless of slot assignment.&lt;/p&gt;




&lt;h2&gt;
  
  
  Slot Migration Process Under the Hood
&lt;/h2&gt;

&lt;p&gt;Dragonfly utilizes a set of internal commands, prefixed with &lt;code&gt;DFLYMIGRATE&lt;/code&gt;, to manage the slot migration process.&lt;br&gt;
The slot migration process involves several carefully coordinated steps to ensure data integrity and seamless transitions between nodes.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Initiating Migration:&lt;/strong&gt;
The process begins with the &lt;code&gt;DFLYCLUSTER CONFIG&lt;/code&gt; command sent to the source and target nodes to configure migration parameters.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Preparing the Target Node:&lt;/strong&gt;
The source node sends &lt;code&gt;DFLYMIGRATE INIT [SOURCE_NODE_ID, SHARDS_NUM, SLOT_RANGES]&lt;/code&gt; to the target node.
The target node responds &lt;code&gt;OK&lt;/code&gt;, indicating it is ready to receive data.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Setting Up Data Transfer:&lt;/strong&gt;
For each storage thread-shard (a data segment handled by a single thread), the source node sends &lt;code&gt;DFLYMIGRATE FLOW [SOURCE_NODE_ID, FLOW_ID]&lt;/code&gt;.
Each &lt;code&gt;DFLYMIGRATE FLOW&lt;/code&gt; command sets up a connection for data transfer, confirmed by an &lt;code&gt;OK&lt;/code&gt; response.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Transferring Data:&lt;/strong&gt;
Using the established connections from the &lt;code&gt;DFLYMIGRATE FLOW&lt;/code&gt; commands, the source node serializes and transfers data to the target node.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Finalizing Migration:&lt;/strong&gt;
After the data transfer, the migrated slots on the source node are blocked to prevent further data changes.
The source node then sends a finalization request for each FLOW connection to conclude the data transfer.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Completing the Process:&lt;/strong&gt;
The source node issues &lt;code&gt;DFLYMIGRATE ACK [SOURCE_NODE_ID, ATTEMPT_ID]&lt;/code&gt; to the target node to finalize the entire migration.
The target node responds with &lt;code&gt;ATTEMPT_ID&lt;/code&gt;, completing the migration.
The &lt;code&gt;ATTEMPT_ID&lt;/code&gt; is used to handle errors that may arise during the finalization process.&lt;/li&gt;
&lt;/ol&gt;
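The handshake above can be sketched as a toy message exchange. This is a simplified Python model of the command sequence only (not Dragonfly's C++ implementation); the node ID, shard count, slot range, and attempt ID values are made up for illustration:

```python
# Simplified model of the DFLYMIGRATE handshake between a source and a
# target node. This mirrors only the message sequence described above;
# it is not Dragonfly's actual implementation.

class TargetNode:
    def __init__(self):
        self.flows = {}       # flow_id -> list of received payloads
        self.ready = False
        self.finalized = False

    def handle(self, command, *args):
        if command == "DFLYMIGRATE INIT":
            source_id, shards_num, slot_ranges = args
            self.flows = {flow_id: [] for flow_id in range(shards_num)}
            self.ready = True
            return "OK"
        if command == "DFLYMIGRATE FLOW":
            source_id, flow_id = args
            return "OK" if flow_id in self.flows else "ERR"
        if command == "DATA":
            flow_id, payload = args
            self.flows[flow_id].append(payload)
            return "OK"
        if command == "DFLYMIGRATE ACK":
            source_id, attempt_id = args
            self.finalized = True
            return attempt_id    # echoed back to confirm completion
        return "ERR"

target = TargetNode()
# Step 2: prepare the target (one shard, slots 1000-8000; illustrative values).
assert target.handle("DFLYMIGRATE INIT", "97486c9d", 1, [(1000, 8000)]) == "OK"
# Step 3: set up one data-transfer connection per thread-shard.
assert target.handle("DFLYMIGRATE FLOW", "97486c9d", 0) == "OK"
# Step 4: transfer serialized data over the established flow.
target.handle("DATA", 0, "snapshot-chunk")
# Step 6: finalize; the target echoes the attempt ID.
assert target.handle("DFLYMIGRATE ACK", "97486c9d", 1) == 1
assert target.finalized
```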

&lt;p&gt;While most steps in the migration process are straightforward, step 4 above requires a more detailed explanation due to its complexity.&lt;br&gt;
In Dragonfly, there are two sources of data that need to be sent to the target node: the &lt;strong&gt;snapshot&lt;/strong&gt; and the &lt;strong&gt;journal&lt;/strong&gt;.&lt;br&gt;
We will dive deeper into these two sources of data below.&lt;/p&gt;

&lt;h3&gt;
  
  
  Snapshot Creation
&lt;/h3&gt;

&lt;p&gt;To create a snapshot, Dragonfly iterates through each storage shard, serializing the data.&lt;br&gt;
In the absence of write requests, this is a linear process where each bucket is serialized one by one.&lt;br&gt;
Periodic pauses are incorporated to allow the system to process new requests, ensuring minimal disruption.&lt;/p&gt;

&lt;p&gt;The process becomes more complex when write requests occur during serialization.&lt;br&gt;
Unlike Redis, which uses a fork mechanism to prevent data changes during serialization,&lt;br&gt;
&lt;a href="https://www.dragonflydb.io/blog/balanced-vs-unbalanced" rel="noopener noreferrer"&gt;Dragonfly employs a more sophisticated mechanism&lt;/a&gt;, incorporating versioning and pre-update hooks, to create snapshots without spiking the memory usage or causing latency issues.&lt;/p&gt;

&lt;h3&gt;
  
  
  Journal Serialization
&lt;/h3&gt;

&lt;p&gt;While handling the snapshot, Dragonfly also manages a journal, which logs all recent write operations.&lt;br&gt;
These journal entries are serialized and sent to the target node along with the snapshot data.&lt;/p&gt;

&lt;p&gt;Let's look at a small example to illustrate the process:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;There are several data entries: &lt;strong&gt;A&lt;/strong&gt;, &lt;strong&gt;B&lt;/strong&gt;, &lt;strong&gt;C&lt;/strong&gt;, and &lt;strong&gt;D&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Data entries &lt;strong&gt;A&lt;/strong&gt; and &lt;strong&gt;B&lt;/strong&gt; are serialized and sent to the target node.&lt;/li&gt;
&lt;li&gt;A new &lt;code&gt;MSET&lt;/code&gt; command is issued by the client, updating &lt;strong&gt;B&lt;/strong&gt; and &lt;strong&gt;D&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Because &lt;strong&gt;B&lt;/strong&gt; has already been serialized, we do nothing with it for now.&lt;/li&gt;
&lt;li&gt;Data entry &lt;strong&gt;D&lt;/strong&gt; is serialized and sent to the target node.&lt;/li&gt;
&lt;li&gt;The journal gets the update about &lt;strong&gt;B&lt;/strong&gt; and &lt;strong&gt;D&lt;/strong&gt;, serializes it, and sends it to the target node.&lt;/li&gt;
&lt;li&gt;Finally, data entry &lt;strong&gt;C&lt;/strong&gt; is serialized and sent to the target node.&lt;/li&gt;
&lt;/ul&gt;
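The sequence above can be replayed in a few lines of Python (a toy model, not Dragonfly's code): snapshot chunks carry the value an entry had when it was serialized, journal entries carry later writes, and applying everything in arrival order leaves the target with the source's final state:

```python
# Toy replay of the A/B/C/D example from the text.
source = {"A": 1, "B": 2, "C": 3, "D": 4}
sent = []

sent.append(("snapshot", "A", source["A"]))   # A serialized and sent
sent.append(("snapshot", "B", source["B"]))   # B serialized and sent
source["B"], source["D"] = 20, 40             # client MSET updates B and D
sent.append(("snapshot", "D", source["D"]))   # D serialized with new value
sent.append(("journal", "B", 20))             # journal reports the MSET...
sent.append(("journal", "D", 40))             # ...for both keys
sent.append(("snapshot", "C", source["C"]))   # finally C is serialized

target = {}
for _, key, value in sent:                    # apply in arrival order
    target[key] = value

# The target converges to a consistent copy, including the MSET.
assert target == source
```

Note that entry D arrives twice (once via the snapshot, once via the journal) with the same value; re-applying the journal entry is harmless, which is what makes the combined stream safe to apply in order.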

&lt;p&gt;&lt;a href="/assets/blog/a-preview-of-dragonfly-cluster/journal-serialization.png" class="article-body-image-wrapper"&gt;&lt;img src="/assets/blog/a-preview-of-dragonfly-cluster/journal-serialization.png"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;By following this process, Dragonfly ensures that the target node receives a consistent version of the source node's data,&lt;br&gt;
including all recent write operations during the slot migration process.&lt;/p&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Dragonfly Cluster is a powerful addition to the Dragonfly ecosystem, offering horizontal scalability for even the most demanding workloads.&lt;br&gt;
Modern servers can come equipped with over a hundred cores and several hundred gigabytes of memory.&lt;br&gt;
While Dragonfly Cluster offers significant scalability advancements, vertical scaling should be prioritized if feasible.&lt;br&gt;
It is therefore advisable to evaluate that potential before implementing a cluster.&lt;br&gt;
If you're uncertain about future vertical scaling needs, you can start with an emulated cluster and switch to a real cluster as your requirements grow.&lt;/p&gt;

&lt;p&gt;In the meantime, if you are curious to see how Dragonfly can scale with your needs and workloads, the easiest way to get started is by using the cloud service backed by the Dragonfly core team.&lt;br&gt;
Try &lt;a href="https://dragonflydb.cloud/" rel="noopener noreferrer"&gt;Dragonfly Cloud&lt;/a&gt; today and experience the power of seamless scaling firsthand!&lt;/p&gt;







&lt;ol&gt;

&lt;li id="fn1"&gt;
&lt;p&gt;At the time of writing, Dragonfly Cluster is not officially released yet.&lt;br&gt;
However, many features and cluster-related commands described in this article are already available in the Dragonfly main branch.&lt;br&gt;
We are actively testing and improving this amazing feature. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;/ol&gt;

</description>
      <category>engineering</category>
      <category>analytics</category>
      <category>dragonfly</category>
    </item>
    <item>
      <title>Dragonfly's New Sorted Set Implementation</title>
      <dc:creator>Dragonfly</dc:creator>
      <pubDate>Fri, 23 Aug 2024 14:03:07 +0000</pubDate>
      <link>https://dev.to/dragonflydbio/dragonflys-new-sorted-set-implementation-4dnk</link>
      <guid>https://dev.to/dragonflydbio/dragonflys-new-sorted-set-implementation-4dnk</guid>
      <description>&lt;h2&gt;
  
  
  Background
&lt;/h2&gt;

&lt;p&gt;Redis offers a plethora of data types to cater to various use cases. Among these, the &lt;a href="https://www.dragonflydb.io/docs/category/sorted-sets" rel="noopener noreferrer"&gt;sorted set&lt;/a&gt; stands out as a unique and powerful data type. Unlike regular sets in Redis, which store unordered collections of strings, sorted sets maintain their elements in ascending order based on a score or lexicographic order. This inherent ordering capability, combined with the flexibility of non-repeating members, makes sorted sets an invaluable tool for tasks like leaderboards, time-series data, and priority queues.&lt;/p&gt;

&lt;p&gt;In the process of working closely with our community and customers to constantly improve Dragonfly, we identified some inefficiencies with the Redis implementation of sorted sets.&lt;/p&gt;

&lt;p&gt;We decided the best way to address this was to rebuild sorted sets from scratch. In this article, I will explain how we went about doing this.&lt;/p&gt;

&lt;h3&gt;
  
  
  TL;DR
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;We built a new sorted set implementation based on a B+ tree that significantly reduces memory usage and improves performance; see the benchmark results below.&lt;/li&gt;
&lt;li&gt;Starting with Dragonfly v1.9, this was an experimental feature.&lt;/li&gt;
&lt;li&gt;Starting with Dragonfly v1.11, the feature is stable and enabled by default.&lt;/li&gt;
&lt;li&gt;In Dragonfly v1.15, the original sorted set implementation from Redis was removed.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  When Tax is Bigger Than Payment
&lt;/h2&gt;

&lt;p&gt;Around eight months ago, when I was inspecting the memory usage profile for one of our cloud customers,&lt;br&gt;
I noticed their memory usage seemed too large for the number of entries they stored.&lt;br&gt;
On further inspection, I found that this particular customer used sorted sets; specifically, they had many sorted sets with thousands of members in each one.&lt;/p&gt;

&lt;p&gt;When we first launched Dragonfly, we reused most of the existing Redis data structures and instead focused on design changes first: multi-threading, transactional support, and replication.&lt;br&gt;
Once we achieved stability with our core features, we decided to dive in and analyze the Redis sorted set implementation.&lt;/p&gt;

&lt;p&gt;Underneath, Redis utilizes a data structure called &lt;a href="https://en.wikipedia.org/wiki/Skip_list" rel="noopener noreferrer"&gt;skiplist&lt;/a&gt; to store entries in a sorted set when it surpasses 128 elements.&lt;sup id="fnref1"&gt;1&lt;/sup&gt;&lt;br&gt;
During my analysis of Redis's skiplist implementation, I observed that, on average, it requires &lt;a href="https://github.com/dragonflydb/dragonfly/blob/main/src/core/bptree_set_test.cc#L294" rel="noopener noreferrer"&gt;37 bytes per entry&lt;/a&gt;&lt;br&gt;
in addition to the essential 16 bytes for storing the entry itself.&lt;br&gt;
The 16 bytes are essential because each entry in a sorted set consists of:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The &lt;code&gt;member&lt;/code&gt; is a string, which requires an 8-byte string pointer.&lt;sup id="fnref2"&gt;2&lt;/sup&gt;
&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;score&lt;/code&gt; is a double-precision floating-point number, which also requires 8 bytes.
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Using the ZADD command to add a member with its score to a sorted set.&lt;/span&gt;
ZADD key score member
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;This can also be observed in the simplified Redis skiplist node structure below.&lt;br&gt;
We will come back to this definition later in the blog post for more details.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Assuming 64-bit systems, the pointer size is 8 bytes.&lt;/span&gt;

&lt;span class="cm"&gt;/* ZSETs use a specialized version of Skiplists */&lt;/span&gt;
&lt;span class="k"&gt;typedef&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;zskiplistNode&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;sds&lt;/span&gt; &lt;span class="n"&gt;ele&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;      &lt;span class="c1"&gt;// member (or element), pointer to an SDS string, 8 bytes&lt;/span&gt;
    &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="n"&gt;score&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// score, double-precision floating-point number, 8 bytes&lt;/span&gt;
    &lt;span class="c1"&gt;// ...        // other fields, additional metadata&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="n"&gt;zskiplistNode&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This 16-byte size, as needed to store and point to the actual data, is a theoretical lower bound for a single entry in a sorted set.&lt;br&gt;
On top of that, additional metadata is necessary in the data structure to maintain the sorted order of these entries.&lt;br&gt;
I will refer to the sorted set entry as a &lt;code&gt;(member, score)&lt;/code&gt; pair to emphasize the 16-byte memory usage when necessary.&lt;/p&gt;

&lt;p&gt;For instance, for an entry with a field length of 16 characters, Redis stores 32 bytes of useful data, containing 16 characters of string and the 16-byte &lt;code&gt;(member, score)&lt;/code&gt; pair,&lt;br&gt;
plus an additional 37 bytes for skiplist metadata. This results in more than a 100% tax!&lt;br&gt;
This is also a fairly common use case, since sorted sets tend to store short values like IDs rather than large string blobs.&lt;/p&gt;
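The "tax" figure follows directly from the numbers above; a quick sanity check in Python, using only the byte counts cited in the text:

```python
# Memory "tax" for a sorted set entry with a 16-character member in the
# Redis skiplist implementation, using the figures from the text.
member_chars = 16                    # useful string payload
pair_bytes = 8 + 8                   # (member pointer, score) pair
useful = member_chars + pair_bytes   # 32 bytes of useful data
skiplist_overhead = 37               # average metadata bytes per entry

tax = skiplist_overhead / useful
assert useful == 32
assert tax > 1.0                     # more than 100% overhead
```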

&lt;p&gt;Of course, such overhead isn't problematic if justified and comparable to other implementations' overhead per entry.&lt;br&gt;
Frankly, I wasn't sure what to expect, as I hadn't kept up with the latest developments in this area.&lt;br&gt;
However, a Google search led me to a C++ project named &lt;a href="https://code.google.com/archive/p/cpp-btree/" rel="noopener noreferrer"&gt;cpp-btree&lt;/a&gt;.&lt;br&gt;
My tests showed that this implementation could achieve as little as 2 bytes of overhead per entry, which was promising!&lt;br&gt;
Their design uses a classic &lt;a href="https://en.wikipedia.org/wiki/B-tree" rel="noopener noreferrer"&gt;B-tree&lt;/a&gt;,&lt;br&gt;
but they attained remarkable memory efficiency through 'bucketing'—a technique commonly used in advanced hash tables.&lt;br&gt;
This technique involves grouping multiple entries in a single tree node, significantly reducing the metadata overhead per entry.&lt;br&gt;
This project has since been integrated into the &lt;a href="https://github.com/abseil/abseil-cpp/blob/master/absl/container/btree_set.h" rel="noopener noreferrer"&gt;Abseil C++ library&lt;/a&gt;, which Dragonfly also utilizes.&lt;/p&gt;


&lt;h2&gt;
  
  
  Skiplist vs. B-tree
&lt;/h2&gt;

&lt;p&gt;To understand the differences between skiplist and B-tree approaches, we first need to understand the skiplist design.&lt;br&gt;
A skiplist consists of multiple layers of linked lists.&lt;br&gt;
The bottom layer is a standard linked list containing all items.&lt;br&gt;
The layer above it is sparser, containing approximately half the items.&lt;br&gt;
Each successive layer contains half the number of items as the layer immediately below it.&lt;br&gt;
The additional layers above the bottom one act as express lanes to speed up the lookup operations and to provide &lt;code&gt;O(log N)&lt;/code&gt; complexity on average.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3vv0uhsscy43v3r5btvr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3vv0uhsscy43v3r5btvr.png" alt="Image description" width="800" height="313"&gt;&lt;/a&gt;&lt;br&gt;
The complete Redis skiplist node looks as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Assuming 64-bit systems, the pointer size is 8 bytes.&lt;/span&gt;

&lt;span class="cm"&gt;/* ZSETs use a specialized version of Skiplists */&lt;/span&gt;
&lt;span class="k"&gt;typedef&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;zskiplistNode&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;sds&lt;/span&gt; &lt;span class="n"&gt;ele&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;      &lt;span class="c1"&gt;// member (or element), pointer to an SDS string, 8 bytes&lt;/span&gt;
    &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="n"&gt;score&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// score, double-precision floating-point number, 8 bytes&lt;/span&gt;
    &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;zskiplistNode&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;backward&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;    &lt;span class="c1"&gt;// backward pointer, 8 bytes&lt;/span&gt;
    &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;zskiplistLevel&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;zskiplistNode&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;forward&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// forward pointer, 8 bytes&lt;/span&gt;
        &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="n"&gt;span&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;            &lt;span class="c1"&gt;// typically 8 bytes&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="n"&gt;level&lt;/span&gt;&lt;span class="p"&gt;[];&lt;/span&gt;                         &lt;span class="c1"&gt;// variable size, but at least 1 level&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="n"&gt;zskiplistNode&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The allocation of the &lt;code&gt;level&lt;/code&gt; array is determined by the node's position in the structure, as depicted in the diagram.&lt;br&gt;
A node always requires at least one level, but it may have more to be part of those sparser express lanes.&lt;br&gt;
Essentially, a node comprises a 16-byte &lt;code&gt;(member, score)&lt;/code&gt; pair, a backward pointer for the linked list (8 bytes),&lt;br&gt;
and a &lt;code&gt;level&lt;/code&gt; array (16 bytes x the number of levels it reaches).&lt;br&gt;
To sum up, a node will occupy:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A minimum of 40 bytes (i.e., nodes &lt;code&gt;2, 5, 7, 8, 10&lt;/code&gt; in the skiplist diagram)&lt;/li&gt;
&lt;li&gt;56 bytes for every second node (i.e., nodes &lt;code&gt;3, 9&lt;/code&gt; in the skiplist diagram)&lt;/li&gt;
&lt;li&gt;72 bytes for every fourth node (i.e., nodes &lt;code&gt;4, 6&lt;/code&gt; in the skiplist diagram)&lt;/li&gt;
&lt;li&gt;And so on…&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In Dragonfly code specifically, the allocator adjusts allocation sizes to multiples of 16 bytes, so the minimum size of a skiplist node in Dragonfly is 48 bytes.&lt;/p&gt;

&lt;p&gt;So where does the average of 37 bytes of overhead come from?&lt;br&gt;
A 37-byte overhead means that the average total size of an entry is 37 bytes + 16 bytes for the &lt;code&gt;(member, score)&lt;/code&gt; pair, or 53 bytes.&lt;br&gt;
Based on the knowledge we have about Redis skiplist nodes,&lt;br&gt;
we can compute the expected weighted average of 4 entries in a skiplist (i.e., nodes &lt;code&gt;2, 3, 4, 5&lt;/code&gt; in the skiplist diagram):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;(48 + 56 + 72 + 48) / 4  = 56 bytes per entry
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This computation is close enough to the empirical evidence we got.&lt;/p&gt;
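The same figure can also be derived from the level distribution itself: since each layer holds half the nodes of the one below, a node has &lt;code&gt;k&lt;/code&gt; levels with probability &lt;code&gt;1/2^k&lt;/code&gt;, so the expected level count is 2. A quick Python check of both calculations, using the sizes given in the text:

```python
# Two ways to arrive at ~56 bytes per skiplist node on average.

# 1. Weighted average of the four node sizes cited above (nodes 2-5),
#    with the minimum rounded up to 48 bytes by Dragonfly's allocator.
sizes = [48, 56, 72, 48]
assert sum(sizes) / len(sizes) == 56

# 2. Expectation over the level distribution: a node has k levels with
#    probability 1/2**k, so the expected level count is sum(k/2**k) = 2.
expected_levels = sum(k / 2**k for k in range(1, 64))
base = 16 + 8                        # (member, score) pair + backward pointer
expected_size = base + 16 * expected_levels
assert abs(expected_size - 56) < 1e-9
```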

&lt;p&gt;In contrast, a B-tree holds multiple elements within each node.&lt;br&gt;
The cpp-btree design utilizes a 256-byte node array capable of containing up to 15 &lt;code&gt;(member, score)&lt;/code&gt; pairs.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frx4l1dys8h4fnejfutbb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frx4l1dys8h4fnejfutbb.png" alt="Image description" width="800" height="313"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This implies that the branching factor for such a tree ranges from 7 to 15.&lt;br&gt;
For example, with 1000 entries, at least 67 leaf nodes are required, plus 5 to 10 inner nodes, which amounts to:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;77 nodes x 256 bytes/node ÷ 1000 entries = 19.2 bytes per entry
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This translates to an average overhead of 2 to 3 bytes per entry, depending on the load of the inner nodes.&lt;/p&gt;
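A quick check of the B-tree arithmetic (node counts as given above; the leaf count is simply ⌈1000 / 15⌉):

```python
import math

entries = 1000
node_bytes = 256                     # cpp-btree node array size
max_pairs_per_leaf = 15

leaves = math.ceil(entries / max_pairs_per_leaf)
assert leaves == 67

# With 5 to 10 inner nodes on top of the leaves, per-entry overhead
# (beyond the 16-byte pair) stays in the low single digits of bytes,
# depending on how full the nodes are.
for inner in (5, 10):
    total_bytes = (leaves + inner) * node_bytes
    per_entry = total_bytes / entries
    overhead = per_entry - 16
    assert 2.0 <= overhead <= 4.0
```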

&lt;h2&gt;
  
  
  I Know What You Did Last Summer
&lt;/h2&gt;

&lt;p&gt;Unfortunately, my excitement was a bit premature because the Redis sorted set API requires custom functionality around its ranking API.&lt;br&gt;
This is not something standard B+ trees provide out of the box, and cpp-btree is not an exception.&lt;br&gt;
Without support for the ranking API, I would not be able to implement &lt;a href="https://www.dragonflydb.io/docs/command-reference/sorted-sets/zrank" rel="noopener noreferrer"&gt;&lt;code&gt;ZRANK&lt;/code&gt;&lt;/a&gt; and similar commands that need to compute the rankings of the elements quickly.&lt;/p&gt;

&lt;p&gt;There was no easy way to add the ranking API to cpp-btree without intrusive changes, so I decided to implement the &lt;a href="https://github.com/dragonflydb/dragonfly/pull/1596/" rel="noopener noreferrer"&gt;Dragonfly B+ tree as a side project&lt;/a&gt;.&lt;br&gt;
It went faster than anticipated, since we did not need to implement a fully generic tree. Instead, we only needed an implementation tuned to Dragonfly use-cases.&lt;br&gt;
In the end, &lt;strong&gt;we achieved the expected 2-3 bytes of overhead on average per entry&lt;/strong&gt;, compared to 37 bytes of overhead with the original skiplist implementation.&lt;br&gt;
Our implementation is also faster, which becomes significant when running queries on large sorted sets with tens of thousands of entries or more.&lt;/p&gt;




&lt;h2&gt;
  
  
  Benchmark Results
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3uq52n5i8bcqzzkirsac.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3uq52n5i8bcqzzkirsac.jpeg" alt="Image description" width="640" height="479"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We used the &lt;a href="https://github.com/redis-performance/redis-zbench-go/" rel="noopener noreferrer"&gt;&lt;code&gt;redis-zbench-go&lt;/code&gt;&lt;/a&gt; tool to benchmark sorted sets.&lt;sup id="fnref3"&gt;3&lt;/sup&gt;&lt;br&gt;
Redis and Dragonfly both have sorted set implementations that employ listpacks for sets with lengths up to 128.&lt;br&gt;
However, when it comes to longer sets, Redis switches to a skiplist implementation, while Dragonfly utilizes its aforementioned B+ tree implementation.&lt;br&gt;
To comprehensively evaluate performance across these configurations, we devised two distinct loadtest profiles: &lt;code&gt;ZADD (10-128)&lt;/code&gt; and &lt;code&gt;ZADD (129-200)&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;ZADD (10-128)&lt;/code&gt; profile sends 1 million &lt;a href="https://www.dragonflydb.io/docs/command-reference/sorted-sets/zadd" rel="noopener noreferrer"&gt;&lt;code&gt;ZADD&lt;/code&gt;&lt;/a&gt; commands, each containing 10 to 128 elements.&lt;br&gt;
On the other hand, for the &lt;code&gt;ZADD (129-200)&lt;/code&gt; profile, we dispatched 800k commands, each containing 129 to 200 elements, to the servers under test.&lt;br&gt;
For both profiles, we first ran them on Redis v7.&lt;br&gt;
Then, we ran them on a Dragonfly instance using only one thread, namely &lt;code&gt;Dragonfly-1&lt;/code&gt;, to show how it compares with Redis.&lt;br&gt;
Finally, both profiles were run on Dragonfly with eight threads, namely &lt;code&gt;Dragonfly-8&lt;/code&gt;, demonstrating its vertical scalability when more CPUs are available.&lt;/p&gt;

&lt;p&gt;Please note that in both Redis and Dragonfly, the CPU cost of processing a request is dominated by work outside the sorted set implementation itself.&lt;br&gt;
Hence, the impact of CPU optimizations on overall throughput is limited in these tests.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi2kphz71f9ondrx62v1f.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi2kphz71f9ondrx62v1f.png" alt="Image description" width="800" height="507"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As you can see, Dragonfly in single-threaded mode sustains slightly higher throughput than Redis,&lt;br&gt;
but more importantly, &lt;strong&gt;it scales vertically and efficiently, reaching 4-5x throughput on 8 threads.&lt;/strong&gt;&lt;br&gt;
Please note that the initial motivation for building better sorted sets was memory efficiency; higher QPS is just an added bonus.&lt;br&gt;
The next graph demonstrates the memory usage of all servers after all the commands were sent.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F22wt0zgy6l7xj1z19o58.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F22wt0zgy6l7xj1z19o58.png" alt="Image description" width="800" height="507"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As you can see, &lt;code&gt;ZADD (10-128)&lt;/code&gt; does not show any difference in memory usage.&lt;br&gt;
This is expected, given that Dragonfly uses the same listpack data structure as Redis for small sets.&lt;br&gt;
However, with large sorted sets, Dragonfly is much more efficient in terms of memory usage.&lt;br&gt;
&lt;strong&gt;One can observe up to 40% memory reduction when using Dragonfly.&lt;/strong&gt;&lt;br&gt;
There is not much difference between using multiple threads and a single thread in this case.&lt;/p&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;In this blog post, we've journeyed through the innovations made by Dragonfly in enhancing the efficiency of sorted sets,&lt;br&gt;
a fundamental data structure that can support various use cases such as leaderboards, time-series data, and priority queues.&lt;/p&gt;

&lt;p&gt;Dragonfly initially concentrated on pivotal architectural design choices, adopting a multi-threaded,&lt;br&gt;
shared-nothing model and leveraging Redis's existing data structures to establish a solid foundation.&lt;br&gt;
The innovation never stopped, and we've evolved beyond these beginnings with the introduction of a new sorted set implementation based on a B+ tree.&lt;br&gt;
This advancement significantly enhances memory efficiency and represents Dragonfly's ongoing commitment to being the most advanced in-memory data store.&lt;/p&gt;

&lt;p&gt;Don't just take our word for it: try it out by &lt;a href="https://www.dragonflydb.io/docs/getting-started" rel="noopener noreferrer"&gt;deploying a Dragonfly instance&lt;/a&gt;&lt;br&gt;
to see for yourself how Dragonfly is setting new standards in data structure optimization and performance enhancement.&lt;/p&gt;






&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;```shell
# Redis, ZADD (10-128)
redis-zbench-go -mode load -r 1000000 -p 6379 -key-elements-min=10 -key-elements-max=128 # Load
redis-zbench-go -mode query -r 1000000 -p 6379 -n 10000000 -query zrange-byscore         # Query

# Redis, ZADD (129-200)
redis-zbench-go -mode load -r 800000 -p 6379 -key-elements-min=129 -key-elements-max=200 # Load
redis-zbench-go -mode query -r 1000000 -p 6379 -n 10000000 -query zrange-byscore         # Query

# Dragonfly, 1 Thread, ZADD (10-128)
redis-zbench-go -mode load -r 1000000 -p 6379 -key-elements-min=10 -key-elements-max=128 # Load
redis-zbench-go -mode query -r 1000000 -p 6379 -n 10000000 -query zrange-byscore         # Query

# Dragonfly, 1 Thread, ZADD (129-200)
redis-zbench-go -mode load -r 800000 -p 6379 -key-elements-min=129 -key-elements-max=200 # Load
redis-zbench-go -mode query -r 1000000 -p 6379 -n 10000000 -query zrange-byscore         # Query

# Dragonfly, 8 Threads, ZADD (10-128)
redis-zbench-go -mode load -r 1000000 -p 6379 -key-elements-min=10 -key-elements-max=128 -c 160 # Load
redis-zbench-go -mode query -r 1000000 -p 6379 -n 10000000 -query zrange-byscore -c 160         # Query

# Dragonfly, 8 Threads, ZADD (129-200)
redis-zbench-go -mode load -r 800000 -p 6379 -key-elements-min=129 -key-elements-max=200 -c 160 # Load
redis-zbench-go -mode query -r 1000000 -p 6379 -n 10000000 -query zrange-byscore -c 200         # Query
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;ol&gt;

&lt;li id="fn1"&gt;
&lt;p&gt;For small collections (hashes, lists, sorted sets), Redis uses a very memory-efficient encoding called &lt;a href="https://github.com/redis/redis/blob/unstable/src/listpack.c" rel="noopener noreferrer"&gt;listpack&lt;/a&gt;&lt;br&gt;
that just stores all the elements of a collection in a single blob, serialized linearly one after another.&lt;br&gt;
Listpack is indeed very memory efficient, but it has terrible &lt;code&gt;O(N)&lt;/code&gt; access complexity, making it suitable only for collections with a small number of elements.&lt;br&gt;
Before &lt;a href="https://github.com/redis/redis/releases/tag/7.0-rc1" rel="noopener noreferrer"&gt;Redis 7.0&lt;/a&gt;, another encoding called&lt;br&gt;
&lt;a href="https://github.com/redis/redis/blob/unstable/src/ziplist.c" rel="noopener noreferrer"&gt;ziplist&lt;/a&gt; was used for small hashes, lists, and sorted sets. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn2"&gt;
&lt;p&gt;The string pointer in this context is a pointer to the Redis Simple Dynamic String (SDS) data structure,&lt;br&gt;
which is a much more efficient and flexible alternative to the standard C string. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn3"&gt;
&lt;p&gt;Commands for benchmarking using &lt;a href="https://github.com/redis-performance/redis-zbench-go/" rel="noopener noreferrer"&gt;&lt;code&gt;redis-zbench-go&lt;/code&gt;&lt;/a&gt;: ↩&lt;/p&gt;
&lt;/li&gt;

&lt;/ol&gt;

</description>
    </item>
    <item>
      <title>Preview of Dragonfly Cluster</title>
      <dc:creator>Dragonfly</dc:creator>
      <pubDate>Fri, 23 Aug 2024 13:43:15 +0000</pubDate>
      <link>https://dev.to/dragonflydbio/preview-of-dragonfly-cluster-46g8</link>
      <guid>https://dev.to/dragonflydbio/preview-of-dragonfly-cluster-46g8</guid>
      <description>&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; At the time of writing, Dragonfly Cluster is not officially released yet.&lt;br&gt;
However, many features and cluster-related commands described in this article are already available in the Dragonfly main branch.&lt;br&gt;
We are actively testing and improving this amazing feature.&lt;/p&gt;


&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Dragonfly excels at high performance and vertical scaling, making it a top choice for demanding modern data workloads.&lt;br&gt;
Soon, Dragonfly Cluster will offer horizontal scalability as well, expanding its capabilities even further.&lt;br&gt;
In this article, I want to show you how to run Dragonfly Cluster and provide a concise overview of the Dragonfly Cluster internal processes.&lt;/p&gt;


&lt;h2&gt;
  
  
  Dragonfly Cluster Overview
&lt;/h2&gt;

&lt;p&gt;Like &lt;a href="https://redis.io/docs/latest/operate/oss_and_stack/management/scaling/" rel="noopener noreferrer"&gt;Redis Cluster&lt;/a&gt;, Dragonfly Cluster achieves horizontal scalability through sharding.&lt;br&gt;
A cluster comprises one or more shards, each consisting of a master node (primary) and zero or more replicas.&lt;br&gt;
Data is distributed across shards using a slot-based approach.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The hashing space in the cluster is divided into 16384 slots.&lt;/li&gt;
&lt;li&gt;Each key is hashed into a slot. The hash slot is computed from the key name using &lt;a href="https://en.wikipedia.org/wiki/Cyclic_redundancy_check" rel="noopener noreferrer"&gt;the CRC16 algorithm&lt;/a&gt; and then taking the result modulo 16384.&lt;/li&gt;
&lt;li&gt;Each node in a cluster is responsible for a subset of these slots.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxlhi60v4jkmxjfke3swk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxlhi60v4jkmxjfke3swk.png" alt="Image description" width="800" height="474"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Dragonfly Cluster supports dynamic rebalancing without downtime.&lt;br&gt;
Hash slots can be seamlessly migrated from one node to another, allowing for the addition, removal, or resizing of nodes without interrupting operations.&lt;/p&gt;

&lt;p&gt;Dragonfly Cluster supports multi-key operations as long as all involved keys (in a multi-key command, in a transaction, or in a Lua script) reside within the same hash slot.&lt;br&gt;
To ensure this, Dragonfly employs hash tags.&lt;br&gt;
With hash tags, the system calculates the hash slot based solely on the content within curly braces &lt;code&gt;{}&lt;/code&gt; of a key.&lt;br&gt;
This mechanism allows users to explicitly control key distribution across shards.&lt;/p&gt;
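&lt;p&gt;The slot computation, including the hash tag rule, can be sketched in a few lines of Python, following the Redis Cluster specification that Dragonfly stays compatible with. The CRC16 variant is CRC16/XMODEM (polynomial 0x1021, initial value 0); the &lt;code&gt;crc16&lt;/code&gt; and &lt;code&gt;key_slot&lt;/code&gt; helper names below are our own:&lt;/p&gt;

```python
def crc16(data: bytes) -> int:
    """CRC16/XMODEM (poly 0x1021, init 0x0000), as used for cluster key slots."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc

def key_slot(key: str) -> int:
    """Hash slot for a key: CRC16 of the key (or of its non-empty hash tag) mod 16384."""
    start = key.find("{")
    if start != -1:
        end = key.find("}", start + 1)
        if end > start + 1:  # only a non-empty {...} section counts
            key = key[start + 1:end]
    return crc16(key.encode()) % 16384
```

&lt;p&gt;For example, &lt;code&gt;key_slot("{user1000}.following")&lt;/code&gt; and &lt;code&gt;key_slot("{user1000}.followers")&lt;/code&gt; land in the same slot, so they can be used together in multi-key operations.&lt;/p&gt;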

&lt;p&gt;If a client requests keys from a node, but those keys belong to a hash slot managed by a different node, the client receives a &lt;code&gt;-MOVED&lt;/code&gt; redirection error.&lt;br&gt;
This ensures that the client can always find the correct node handling the requested keys.&lt;/p&gt;
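&lt;p&gt;Cluster-aware clients handle this transparently by parsing the redirection and retrying against the indicated node. Here is a minimal sketch of the parsing step (the payload format &lt;code&gt;MOVED &amp;lt;slot&amp;gt; &amp;lt;host&amp;gt;:&amp;lt;port&amp;gt;&lt;/code&gt; comes from the Redis Cluster specification; the &lt;code&gt;parse_moved&lt;/code&gt; helper is our own illustration):&lt;/p&gt;

```python
def parse_moved(error: str) -> tuple[int, str, int]:
    """Parse a '-MOVED' redirection payload into (slot, host, port).

    Expected format: 'MOVED <slot> <host>:<port>'.
    """
    kind, slot, endpoint = error.split()
    if kind != "MOVED":
        raise ValueError(f"not a MOVED redirection: {error!r}")
    host, _, port = endpoint.rpartition(":")  # rpartition tolerates ':' in the host part
    return int(slot), host, int(port)
```

&lt;p&gt;A client would then refresh its slot map and resend the command to the returned host and port.&lt;/p&gt;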

&lt;p&gt;Dragonfly provides only the server itself, not a control plane to manage cluster deployments.&lt;br&gt;
Node health monitoring, automatic failovers, and slot redistribution are out of the scope of Dragonfly backend functionality&lt;br&gt;
and will be provided as part of the &lt;a href="https://dragonflydb.cloud/" rel="noopener noreferrer"&gt;Dragonfly Cloud&lt;/a&gt; service.&lt;/p&gt;

&lt;p&gt;Dragonfly Cluster offers seamless migration for existing Redis Cluster clients.&lt;br&gt;
It fully adheres to Redis Cluster's client-facing behavior, ensuring zero code changes for applications.&lt;br&gt;
However, Dragonfly takes a fundamentally different approach to cluster management.&lt;/p&gt;

&lt;p&gt;Unlike Redis Cluster's distributed consensus model, Dragonfly adopts a centralized management strategy.&lt;br&gt;
Nodes operate independently, without direct communication or shared state.&lt;br&gt;
This design choice provides a single source of truth and enhances simplicity, reliability, and performance.&lt;/p&gt;


&lt;h2&gt;
  
  
  Cluster Modes
&lt;/h2&gt;

&lt;p&gt;Dragonfly has two cluster modes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Emulated Cluster Mode&lt;/strong&gt; (which can be enabled by &lt;code&gt;--cluster_mode=emulated&lt;/code&gt;) is fully compatible with the stand-alone mode,&lt;br&gt;
supporting &lt;a href="https://www.dragonflydb.io/docs/command-reference/server-management/select" rel="noopener noreferrer"&gt;&lt;code&gt;SELECT&lt;/code&gt;&lt;/a&gt; and multi-key operations while also providing cluster commands like &lt;code&gt;CLUSTER SHARDS&lt;/code&gt;.&lt;br&gt;
It functions as a single-node Dragonfly instance and does not include horizontal scaling, resharding, or certain advanced cluster features.&lt;br&gt;
This mode is ideal for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Development &amp;amp; Testing Environments:&lt;/strong&gt; Provide a simplified setup for rapid iteration.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Migration Phases:&lt;/strong&gt; Serve as an interim solution when transitioning from a stand-alone to a clustered setup.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Resource-Constrained Scenarios:&lt;/strong&gt; A Dragonfly instance in emulated cluster mode can optimize resource utilization by acting as a replica for multiple shards, allowing a single node to replicate several cluster nodes efficiently.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Multi-Node Cluster Mode&lt;/strong&gt; (which can be enabled by &lt;code&gt;--cluster_mode=yes&lt;/code&gt;) is the Dragonfly Cluster we are talking about in this blog post.&lt;br&gt;
It has certain limitations compared to stand-alone or emulated modes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The &lt;a href="https://www.dragonflydb.io/docs/command-reference/server-management/select" rel="noopener noreferrer"&gt;&lt;code&gt;SELECT&lt;/code&gt;&lt;/a&gt; command is not permitted.&lt;/li&gt;
&lt;li&gt;All keys in a multi-key operation (i.e., multi-key commands, transactions, and Lua scripts) must belong to the same slot. Otherwise, a &lt;code&gt;CROSSSLOT&lt;/code&gt; error is returned.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Dragonfly supports some &lt;code&gt;CLUSTER&lt;/code&gt; commands for compatibility with Redis Cluster clients. These commands primarily provide informational data about the cluster setup:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;CLUSTER HELP&lt;/code&gt;: Lists available CLUSTER commands.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;CLUSTER MYID&lt;/code&gt;: Returns the node ID.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;CLUSTER SHARDS&lt;/code&gt;: Displays information about cluster shards.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;CLUSTER SLOTS&lt;/code&gt;: Lists all slots and their associated nodes.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;CLUSTER NODES&lt;/code&gt;: Shows information about all nodes in the cluster.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;CLUSTER INFO&lt;/code&gt;: Provides general information about the cluster.&lt;/li&gt;
&lt;/ul&gt;


&lt;h2&gt;
  
  
  Dragonfly Cluster Management
&lt;/h2&gt;
&lt;h3&gt;
  
  
  &lt;code&gt;DFLYCLUSTER&lt;/code&gt; Commands
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;DFLYCLUSTER&lt;/code&gt; commands are specific to Dragonfly and offer more advanced cluster management capabilities:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;DFLYCLUSTER CONFIG&lt;/code&gt;: Manages node roles, slot assignments, and migration processes.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;DFLYCLUSTER GETSLOTINFO&lt;/code&gt;: Provides in-depth statistics about slot utilization, including key count, memory usage, and read/write operations.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;DFLYCLUSTER FLUSHSLOTS&lt;/code&gt;: Efficiently clears data from specific slots.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;DFLYCLUSTER SLOT-MIGRATION-STATUS&lt;/code&gt;: Monitors the progress of slot migrations, indicating the current state and completion status.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  Cluster Creation
&lt;/h3&gt;

&lt;p&gt;To begin building a Dragonfly cluster, we'll start by launching two separate Dragonfly instances in cluster mode.&lt;br&gt;
Each instance will have a unique ID.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$&amp;gt;&lt;/span&gt; ./dragonfly &lt;span class="nt"&gt;--cluster_mode&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;yes&lt;/span&gt; &lt;span class="nt"&gt;--admin_port&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;31001 &lt;span class="nt"&gt;--port&lt;/span&gt; 30001
&lt;span class="nv"&gt;$&amp;gt;&lt;/span&gt; ./dragonfly &lt;span class="nt"&gt;--cluster_mode&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;yes&lt;/span&gt; &lt;span class="nt"&gt;--admin_port&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;31002 &lt;span class="nt"&gt;--port&lt;/span&gt; 30002
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once the instances are running, we need to retrieve their unique IDs by using the following commands:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$&amp;gt;&lt;/span&gt; redis-cli &lt;span class="nt"&gt;-p&lt;/span&gt; 31001 CLUSTER MYID
&lt;span class="nv"&gt;$&amp;gt;&lt;/span&gt; redis-cli &lt;span class="nt"&gt;-p&lt;/span&gt; 31002 CLUSTER MYID
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In my case, the two unique IDs have prefixes &lt;code&gt;97486c...&lt;/code&gt; and &lt;code&gt;728cf2...&lt;/code&gt; respectively.&lt;br&gt;
Now we can create a cluster config in JSON format by plugging in the unique IDs and IP addresses of the two nodes above.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"slot_ranges"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"start"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"end"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;999&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"master"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"97486c9d7e0507e1edb2dfba4655224d5b61c5e2"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"ip"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"localhost"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"port"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;30001&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"replicas"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"slot_ranges"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"start"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"end"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;16383&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"master"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"728cf25ecd4d1230805754ff98939321d72d23ef"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"ip"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"localhost"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"port"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;30002&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"replicas"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Using the &lt;code&gt;DFLYCLUSTER CONFIG&lt;/code&gt; command, we send this configuration to both of our nodes, and the Dragonfly Cluster is created.&lt;/p&gt;

&lt;h3&gt;
  
  
  Slot Migration
&lt;/h3&gt;

&lt;p&gt;Slot migration is a critical operation that can potentially lead to data loss if not executed carefully.&lt;br&gt;
It's essential for adjusting cluster configuration to meet changing demands.&lt;br&gt;
Dragonfly supports concurrent slot migrations, but only one migration can be in progress between any two specific nodes at a given time.&lt;br&gt;
This means multiple migrations can be initiated simultaneously across different node pairs within a cluster.&lt;br&gt;
To initiate and complete a slot migration, use the &lt;code&gt;DFLYCLUSTER CONFIG&lt;/code&gt; command, specifying an additional &lt;code&gt;migrations&lt;/code&gt; field in the JSON configuration.&lt;br&gt;
In the example below, I decided to move slots &lt;code&gt;[1000, 8000]&lt;/code&gt; to another node, and here's how the JSON configuration file looks:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"slot_ranges"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"start"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"end"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;999&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"master"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"97486c9d7e0507e1edb2dfba4655224d5b61c5e2"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"ip"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"localhost"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"port"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;30001&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"replicas"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"slot_ranges"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"start"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"end"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;16383&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"master"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"728cf25ecd4d1230805754ff98939321d72d23ef"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"ip"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"localhost"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"port"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;30002&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"replicas"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"migrations"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"slot_ranges"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"start"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"end"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;8000&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"node_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"97486c9d7e0507e1edb2dfba4655224d5b61c5e2"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"ip"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"localhost"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"port"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;31001&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To maintain cluster consistency during slot migrations, the &lt;code&gt;DFLYCLUSTER CONFIG&lt;/code&gt; command is propagated to all nodes, even those not directly involved.&lt;br&gt;
This ensures that all nodes have an up-to-date view of the cluster configuration, preventing inconsistencies that might arise from concurrent migration processes or failures.&lt;br&gt;
To monitor the progress of a slot migration, use the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$&amp;gt;&lt;/span&gt; redis-cli &lt;span class="nt"&gt;-p&lt;/span&gt; 31002 DFLYCLUSTER SLOT-MIGRATION-STATUS
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once the migration status shows &lt;code&gt;FINISHED&lt;/code&gt;, the new cluster configuration (with updated &lt;code&gt;slot_ranges&lt;/code&gt;) can be applied to all nodes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"slot_ranges"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"start"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"end"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;8000&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"master"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"97486c9d7e0507e1edb2dfba4655224d5b61c5e2"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"ip"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"localhost"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"port"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;30001&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"replicas"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"slot_ranges"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"start"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;8001&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"end"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;16383&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"master"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"728cf25ecd4d1230805754ff98939321d72d23ef"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"ip"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"localhost"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"port"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;30002&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"replicas"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Upon applying the new cluster configuration, data from migrated slots is permanently erased from the source node.&lt;br&gt;
Migrations can be canceled by removing the migration field from the configuration while preserving slot assignments.&lt;br&gt;
If the updated configuration isn't applied promptly after migration completion, the cluster enters a transitional state where nodes have inconsistent slot information.&lt;br&gt;
Clients may initially be redirected to the source node for migrated slots, but subsequent requests to the source node will be correctly routed to the target node.&lt;br&gt;
Although this temporary inconsistency exists, it doesn't compromise data integrity.&lt;/p&gt;
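&lt;p&gt;The routing implied by a configuration like the one above can be sketched with a small hypothetical helper (illustrative Python, not Dragonfly code): given a slot number, find the master whose &lt;code&gt;slot_ranges&lt;/code&gt; cover it.&lt;/p&gt;

```python
# Illustrative routing helper (not Dragonfly code): resolve which master
# owns a given slot, using the cluster configuration shape shown above.
CLUSTER_CONFIG = [
    {"slot_ranges": [{"start": 0, "end": 8000}],
     "master": {"id": "97486c9d7e0507e1edb2dfba4655224d5b61c5e2",
                "ip": "localhost", "port": 30001},
     "replicas": []},
    {"slot_ranges": [{"start": 8001, "end": 16383}],
     "master": {"id": "728cf25ecd4d1230805754ff98939321d72d23ef",
                "ip": "localhost", "port": 30002},
     "replicas": []},
]

def owner_of_slot(config, slot):
    """Return (ip, port) of the master whose slot_ranges cover `slot`."""
    for node in config:
        if any(r["start"] <= slot <= r["end"] for r in node["slot_ranges"]):
            return node["master"]["ip"], node["master"]["port"]
    raise ValueError(f"slot {slot} is not covered by any node")
```

&lt;p&gt;During the transitional state described above, the source and target nodes may momentarily resolve a migrated slot to different owners, which is why clients can see one extra redirect.&lt;/p&gt;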

&lt;h3&gt;
  
  
  Replicas
&lt;/h3&gt;

&lt;p&gt;Dragonfly replication is configured using the standard &lt;a href="https://www.dragonflydb.io/docs/command-reference/server-management/replicaof" rel="noopener noreferrer"&gt;&lt;code&gt;REPLICAOF&lt;/code&gt;&lt;/a&gt; command, identical to non-clustered setups.&lt;br&gt;
Despite replication details being included in the cluster configuration, replicas function independently, copying data directly from the master node.&lt;br&gt;
This means replicas copy all data from the master node, regardless of slot assignment.&lt;/p&gt;




&lt;h2&gt;
  
  
  Slot Migration Process Under the Hood
&lt;/h2&gt;

&lt;p&gt;Dragonfly utilizes a set of internal commands, prefixed with &lt;code&gt;DFLYMIGRATE&lt;/code&gt;, to manage the slot migration process.&lt;br&gt;
The slot migration process involves several carefully coordinated steps to ensure data integrity and seamless transitions between nodes.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Initiating Migration:&lt;/strong&gt;
The process begins with the &lt;code&gt;DFLYCLUSTER CONFIG&lt;/code&gt; command sent to the source and target nodes to configure migration parameters.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Preparing the Target Node:&lt;/strong&gt;
The source node sends &lt;code&gt;DFLYMIGRATE INIT [SOURCE_NODE_ID, SHARDS_NUM, SLOT_RANGES]&lt;/code&gt; to the target node.
The target node responds &lt;code&gt;OK&lt;/code&gt;, indicating it is ready to receive data.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Setting Up Data Transfer:&lt;/strong&gt;
For each storage thread-shard (a data segment handled by a single thread), the source node sends &lt;code&gt;DFLYMIGRATE FLOW [SOURCE_NODE_ID, FLOW_ID]&lt;/code&gt;.
Each &lt;code&gt;DFLYMIGRATE FLOW&lt;/code&gt; command sets up a connection for data transfer, confirmed by an &lt;code&gt;OK&lt;/code&gt; response.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Transferring Data:&lt;/strong&gt;
Using the established connections from the &lt;code&gt;DFLYMIGRATE FLOW&lt;/code&gt; commands, the source node serializes and transfers data to the target node.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Finalizing Migration:&lt;/strong&gt;
After the data transfer, the migrated slots on the source node are blocked to prevent further data changes.
The source node then sends a finalization request for each FLOW connection to conclude the data transfer.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Completing the Process:&lt;/strong&gt;
The source node issues &lt;code&gt;DFLYMIGRATE ACK [SOURCE_NODE_ID, ATTEMPT_ID]&lt;/code&gt; to the target node to finalize the entire migration.
The target node responds with &lt;code&gt;ATTEMPT_ID&lt;/code&gt;, completing the migration.
The &lt;code&gt;ATTEMPT_ID&lt;/code&gt; is used to handle errors that may arise during the finalization process.&lt;/li&gt;
&lt;/ol&gt;
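&lt;p&gt;As a rough model of the sequence above (a simplified simulation, not Dragonfly's actual protocol implementation), the control messages the source node issues can be sketched as:&lt;/p&gt;

```python
# Illustrative model of the DFLYMIGRATE message sequence described above
# (a simplified simulation, not Dragonfly's actual protocol implementation).
def migration_messages(source_id, shards_num, slot_ranges, attempt_id):
    """Return the ordered control messages the source node sends to the target."""
    msgs = [f"DFLYMIGRATE INIT {source_id} {shards_num} {slot_ranges}"]  # step 2
    for flow_id in range(shards_num):                                    # step 3
        msgs.append(f"DFLYMIGRATE FLOW {source_id} {flow_id}")
    # Steps 4-5 (snapshot/journal transfer and per-flow finalization)
    # happen on the connections established by the FLOW commands.
    msgs.append(f"DFLYMIGRATE ACK {source_id} {attempt_id}")             # step 6
    return msgs
```

&lt;p&gt;One &lt;code&gt;FLOW&lt;/code&gt; connection per thread-shard lets each source thread stream its own shard's data independently, in keeping with the shared-nothing design.&lt;/p&gt;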

&lt;p&gt;While most steps in the migration process are straightforward, step 4 above requires a more detailed explanation due to its complexity.&lt;br&gt;
In Dragonfly, there are two sources of data that need to be sent to the target node: the &lt;strong&gt;snapshot&lt;/strong&gt; and the &lt;strong&gt;journal&lt;/strong&gt;.&lt;br&gt;
We will dive deeper into these two sources of data below.&lt;/p&gt;

&lt;h3&gt;
  
  
  Snapshot Creation
&lt;/h3&gt;

&lt;p&gt;To create a snapshot, Dragonfly iterates through each storage shard, serializing the data.&lt;br&gt;
In the absence of write requests, this is a linear process where each bucket is serialized one by one.&lt;br&gt;
Periodic pauses are incorporated to allow the system to process new requests, ensuring minimal disruption.&lt;/p&gt;

&lt;p&gt;The process becomes more complex when write requests occur during serialization.&lt;br&gt;
Unlike Redis, which uses a fork mechanism to prevent data changes during serialization,&lt;br&gt;
&lt;a href="https://www.dragonflydb.io/blog/balanced-vs-unbalanced" rel="noopener noreferrer"&gt;Dragonfly employs a more sophisticated mechanism&lt;/a&gt;, incorporating versioning and pre-update hooks, to create snapshots without spiking the memory usage or causing latency issues.&lt;/p&gt;

&lt;h3&gt;
  
  
  Journal Serialization
&lt;/h3&gt;

&lt;p&gt;While handling the snapshot, Dragonfly also manages a journal, which logs all recent write operations.&lt;br&gt;
These journal entries are serialized and sent to the target node along with the snapshot data.&lt;/p&gt;

&lt;p&gt;Let's look at a small example to illustrate the process:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;There are several data entries: &lt;strong&gt;A&lt;/strong&gt;, &lt;strong&gt;B&lt;/strong&gt;, &lt;strong&gt;C&lt;/strong&gt;, and &lt;strong&gt;D&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Data entries &lt;strong&gt;A&lt;/strong&gt; and &lt;strong&gt;B&lt;/strong&gt; are serialized and sent to the target node.&lt;/li&gt;
&lt;li&gt;A new &lt;code&gt;MSET&lt;/code&gt; command is issued by the client, updating &lt;strong&gt;B&lt;/strong&gt; and &lt;strong&gt;D&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Because &lt;strong&gt;B&lt;/strong&gt; has already been serialized, we do nothing with it for now.&lt;/li&gt;
&lt;li&gt;Data entry &lt;strong&gt;D&lt;/strong&gt; is serialized and sent to the target node.&lt;/li&gt;
&lt;li&gt;The journal gets the update about &lt;strong&gt;B&lt;/strong&gt; and &lt;strong&gt;D&lt;/strong&gt;, serializes it, and sends it to the target node.&lt;/li&gt;
&lt;li&gt;Finally, data entry &lt;strong&gt;C&lt;/strong&gt; is serialized and sent to the target node.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffaof6plcy4ff35img28d.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffaof6plcy4ff35img28d.png" alt="Image description" width="800" height="474"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;By following this process, Dragonfly ensures that the target node receives a consistent version of the source node's data,&lt;br&gt;
including all recent write operations during the slot migration process.&lt;/p&gt;
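&lt;p&gt;The walk-through above can be simulated in a few lines (illustrative only: Dragonfly's real snapshot mechanism operates on hash-table buckets with versioning and pre-update hooks, not on individual keys):&lt;/p&gt;

```python
# Simplified simulation of the snapshot + journal interplay from the
# example above (illustrative only, not Dragonfly's implementation).
def migrate(entries, writes_at):
    """Serialize `entries` in order; `writes_at[key]` is a list of write
    batches (e.g. one MSET each) that arrive just before `key` is reached."""
    sent, serialized = [], set()
    for key in list(entries):
        for batch in writes_at.get(key, []):
            for k in batch:                       # pre-update: snapshot the old
                if k not in serialized:           # values of entries not yet sent
                    sent.append(("snapshot", k, entries[k]))
                    serialized.add(k)
            for k, v in batch.items():            # the updates themselves travel
                entries[k] = v                    # through the journal
                sent.append(("journal", k, v))
        if key not in serialized:
            sent.append(("snapshot", key, entries[key]))
            serialized.add(key)
    return sent

# A and B are sent, then an MSET updates B and D before C is reached:
stream = migrate({"A": 1, "B": 2, "C": 3, "D": 4},
                 {"C": [{"B": 20, "D": 40}]})
```

&lt;p&gt;Replaying the stream on the target yields the final values of all four entries, matching the consistency guarantee described above.&lt;/p&gt;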




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Dragonfly Cluster is a powerful addition to the Dragonfly ecosystem, offering horizontal scalability for even the most demanding workloads.&lt;br&gt;
That said, modern servers can come equipped with over a hundred cores and several hundred gigabytes of memory,&lt;br&gt;
and vertical scaling should be prioritized where feasible.&lt;br&gt;
It is therefore advisable to evaluate the potential for vertical scaling before implementing a cluster.&lt;br&gt;
If you're uncertain whether vertical scaling will suffice in the future, you can start with an emulated cluster and switch to a real cluster as your requirements grow.&lt;/p&gt;

&lt;p&gt;In the meantime, if you are curious to see how Dragonfly can scale with your needs and workloads, the easiest way to get started is by using the cloud service backed by the Dragonfly core team.&lt;br&gt;
Try &lt;a href="https://dragonflydb.cloud/" rel="noopener noreferrer"&gt;Dragonfly Cloud&lt;/a&gt; today and experience the power of seamless scaling firsthand!&lt;/p&gt;

</description>
    </item>
    <item>
      <title>2024 New Year, New Number: New Benchmarks Show Dragonfly Achieves 6.43 Million Ops/Sec on an AWS Graviton3E Instance</title>
      <dc:creator>Dragonfly</dc:creator>
      <pubDate>Thu, 25 Jan 2024 17:00:00 +0000</pubDate>
      <link>https://dev.to/dragonflydbio/2024-new-year-new-number-new-benchmarks-show-dragonfly-achieves-643-million-opssec-on-an-aws-graviton3e-instance-ehm</link>
      <guid>https://dev.to/dragonflydbio/2024-new-year-new-number-new-benchmarks-show-dragonfly-achieves-643-million-opssec-on-an-aws-graviton3e-instance-ehm</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Last year, we published &lt;a href="https://www.dragonflydb.io/blog/scaling-performance-redis-vs-dragonfly" rel="noopener noreferrer"&gt;benchmark results&lt;/a&gt; showing that Dragonfly can achieve 4 million ops/sec on an AWS &lt;a href="https://instances.vantage.sh/aws/ec2/c6gn.16xlarge" rel="noopener noreferrer"&gt;&lt;code&gt;c6gn.16xlarge&lt;/code&gt;&lt;/a&gt; instance.&lt;br&gt;
To put that into perspective, it's like every person in Los Angeles (as of 2020, roughly 3.9 million residents) asking Dragonfly a question, and Dragonfly answers instantly, all within that single second.&lt;/p&gt;

&lt;p&gt;However, we have exciting news from our latest benchmarks:&lt;br&gt;
&lt;strong&gt;by operating on a new AWS &lt;a href="https://instances.vantage.sh/aws/ec2/c7gn.16xlarge" rel="noopener noreferrer"&gt;&lt;code&gt;c7gn.16xlarge&lt;/code&gt;&lt;/a&gt; single instance, we've now achieved 6.43 million ops/second with Dragonfly.&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Number of Dragonfly Threads&lt;/th&gt;
&lt;th&gt;Max Ops/Second (P99.9 &amp;lt; 10ms)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;302,603&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;329,755&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;744,708&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;1,370,980&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;16&lt;/td&gt;
&lt;td&gt;2,749,539&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;32&lt;/td&gt;
&lt;td&gt;4,263,950&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;64&lt;/td&gt;
&lt;td&gt;6,432,982&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fctut3dbrdtw9xoh4w51e.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fctut3dbrdtw9xoh4w51e.png" alt="Benchmark" width="800" height="416"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Beyond the Number - Dragonfly Advances with Hardware
&lt;/h2&gt;

&lt;p&gt;This latest benchmark achievement goes beyond just the number.&lt;br&gt;
It's about how Dragonfly leverages hardware advancements to boost performance.&lt;br&gt;
Normally, a 60.75% increase in throughput — going from 4 million to 6.43 million operations per second — might suggest major code optimizations or architectural changes.&lt;br&gt;
But that's not the case for Dragonfly.&lt;/p&gt;

&lt;p&gt;First, let's take a look at the hardware advancements.&lt;br&gt;
At the beginning of 2023, we used the &lt;a href="https://instances.vantage.sh/aws/ec2/c6gn.16xlarge" rel="noopener noreferrer"&gt;&lt;code&gt;c6gn.16xlarge&lt;/code&gt;&lt;/a&gt; instance for throughput benchmarking.&lt;br&gt;
Around June 2023, AWS announced the &lt;a href="https://instances.vantage.sh/aws/ec2/c7gn.16xlarge" rel="noopener noreferrer"&gt;&lt;code&gt;c7gn&lt;/code&gt;&lt;/a&gt; series,&lt;br&gt;
which are powered by ARM-based AWS Graviton3E processors that deliver up to 25% better performance than Graviton2-based &lt;code&gt;c6gn&lt;/code&gt; instances.&lt;br&gt;
They are ideal for a large variety of compute-intensive workloads, including in-memory data stores like Dragonfly.&lt;br&gt;
Network-wise, the &lt;code&gt;c7gn&lt;/code&gt; instances deliver up to 200Gbps of network bandwidth and up to 2x higher packet processing performance than the previous-generation &lt;code&gt;c6gn&lt;/code&gt; instances.&lt;br&gt;
Below is a quick comparison of the two instances.&lt;br&gt;
It is notable that the price of the &lt;code&gt;c7gn&lt;/code&gt; instance is ~44% higher than the &lt;code&gt;c6gn&lt;/code&gt; instance,&lt;br&gt;
but the Dragonfly throughput gain is ~60.75%, which demonstrates how cost-efficient Dragonfly is.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;&lt;code&gt;c6gn.16xlarge&lt;/code&gt;&lt;/th&gt;
&lt;th&gt;&lt;code&gt;c7gn.16xlarge&lt;/code&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;vCPUs&lt;/td&gt;
&lt;td&gt;64&lt;/td&gt;
&lt;td&gt;64&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Memory&lt;/td&gt;
&lt;td&gt;128GiB&lt;/td&gt;
&lt;td&gt;128GiB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Physical Processor&lt;/td&gt;
&lt;td&gt;AWS Graviton2 Processor&lt;/td&gt;
&lt;td&gt;AWS Graviton3 Processor&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Clock Speed (GHz)&lt;/td&gt;
&lt;td&gt;2.5&lt;/td&gt;
&lt;td&gt;2.6&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Network Performance (Gbps)&lt;/td&gt;
&lt;td&gt;100&lt;/td&gt;
&lt;td&gt;200&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Price (us-east-1, January 2024)&lt;/td&gt;
&lt;td&gt;$2.765/hour&lt;/td&gt;
&lt;td&gt;$3.994/hour&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
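&lt;p&gt;These ratios are easy to check with back-of-the-envelope arithmetic, using only the figures quoted in this post:&lt;/p&gt;

```python
# Back-of-the-envelope check of the cost-efficiency comparison above,
# using the prices and peak throughput numbers from this post.
c6gn_price, c7gn_price = 2.765, 3.994        # $/hour, us-east-1, January 2024
c6gn_ops, c7gn_ops = 4_000_000, 6_430_000    # max ops/sec benchmarked

price_increase = c7gn_price / c6gn_price - 1   # ~44% more expensive
throughput_gain = c7gn_ops / c6gn_ops - 1      # ~60.75% more throughput

# Dollars per billion operations served at peak throughput:
cost_c6gn = c6gn_price / (c6gn_ops * 3600) * 1e9
cost_c7gn = c7gn_price / (c7gn_ops * 3600) * 1e9
```

&lt;p&gt;Because throughput grows faster than the instance price, the cost per operation is actually lower on the newer hardware.&lt;/p&gt;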

&lt;p&gt;Over the past year, we've certainly refined Dragonfly's performance.&lt;br&gt;
For instance, we enhanced the performance of the &lt;a href="https://github.com/dragonflydb/dragonfly/pull/2212" rel="noopener noreferrer"&gt;&lt;code&gt;MGET&lt;/code&gt; command&lt;/a&gt;,&lt;br&gt;
improved the overall performance and memory efficiency of the &lt;a href="https://github.com/dragonflydb/dragonfly/releases/tag/v1.9.0" rel="noopener noreferrer"&gt;&lt;code&gt;Sorted-Set&lt;/code&gt; data type&lt;/a&gt;,&lt;br&gt;
optimized multiple aspects to &lt;a href="https://www.dragonflydb.io/blog/running-bullmq-with-dragonfly-part-2-optimization" rel="noopener noreferrer"&gt;get 30x throughput with BullMQ&lt;/a&gt;, and many more.&lt;br&gt;
Yet, the core design and architecture remain unchanged.&lt;br&gt;
And this is key: &lt;strong&gt;Dragonfly's architecture is designed to scale vertically with hardware improvements.&lt;/strong&gt;&lt;br&gt;
It's not just about tweaking the codebase for better throughput; it's about the fundamental architecture that inherently capitalizes on the evolving capabilities of the hardware it runs on.&lt;br&gt;
Thus, we are confident to say that &lt;strong&gt;the existing Dragonfly codebase, robust and efficient as it is, is ready to see further performance gains as hardware advances in the future.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;One of the key design elements that allows Dragonfly to take advantage of advancements in hardware is its multi-threaded shared-nothing architecture.&lt;br&gt;
Below, we will dive deeper into how this architecture automatically unlocks performance from advancements in hardware.&lt;/p&gt;




&lt;h2&gt;
  
  
  Dragonfly Architecture Overview
&lt;/h2&gt;

&lt;p&gt;Dragonfly runs as a single process with multiple threads, and it is designed to run on multi-core servers without any special configurations or optimizations.&lt;br&gt;
Dragonfly's in-memory data store keyspace is sharded into &lt;code&gt;N&lt;/code&gt; parts, where &lt;code&gt;N&lt;/code&gt; is less than or equal to the number of CPU logical cores in the system.&lt;br&gt;
Each shard is owned and managed by a single Dragonfly thread, establishing a &lt;a href="https://en.wikipedia.org/wiki/Shared-nothing_architecture" rel="noopener noreferrer"&gt;&lt;strong&gt;shared-nothing architecture&lt;/strong&gt;&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;To put it simply, Dragonfly's thread-per-core model and shared-nothing architecture make it like an already perfectly orchestrated group of Redis processes without the overhead of cluster management.&lt;br&gt;
This is why Dragonfly can automatically achieve higher throughput performance when put on more powerful machines:&lt;br&gt;
&lt;strong&gt;more Dragonfly threads can be created if there are more CPU logical cores available,&lt;br&gt;
and when each core is more powerful, each Dragonfly thread can inherently handle more operations per second.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;In the meantime, atomicity is crucial for Dragonfly key-value operations.&lt;br&gt;
It is not possible to use mutexes or spinlocks to orchestrate threads at the rates described above, as they would immediately cause contention.&lt;br&gt;
Much like a traffic light on a busy high-speed multi-lane road that permits only one car to pass at a time, such a bottleneck would lead to considerable congestion.&lt;br&gt;
To provide atomicity guarantees for multi-key operations in the shared-nothing architecture, we incorporate recent academic research.&lt;br&gt;
Specifically, we've based our transactional framework for Dragonfly on the paper &lt;a href="https://www.cs.umd.edu/~abadi/papers/vldbj-vll.pdf" rel="noopener noreferrer"&gt;"VLL: a lock manager redesign for main memory database systems"&lt;/a&gt;.&lt;br&gt;
By adopting a shared-nothing architecture and VLL, we can achieve multi-key operations without relying on mutexes or spinlocks.&lt;/p&gt;

&lt;p&gt;Let's take a look at some examples.&lt;br&gt;
When Dragonfly receives a simple &lt;code&gt;GET&lt;/code&gt; or &lt;code&gt;SET&lt;/code&gt; command, it will first calculate the hash of the key and then find the corresponding shard.&lt;br&gt;
Then the data manipulation will be performed by the specific Dragonfly thread that owns the shard, while other threads are free to process more commands without heavy congestion.&lt;br&gt;
A more complex example would be a &lt;a href="https://www.dragonflydb.io/blog/announcing-dragonfly-search" rel="noopener noreferrer"&gt;Dragonfly Search&lt;/a&gt; command, which is a multi-key operation.&lt;br&gt;
When Dragonfly receives a search command, it will first parse the command to build a query execution tree.&lt;br&gt;
Then Dragonfly fans the query out to all threads, where each thread has a separate index covering all the indexed values in its corresponding shard.&lt;br&gt;
Multiple threads will execute the query tree efficiently in parallel, and the results will be merged and returned to the client.&lt;br&gt;
The shared-nothing architecture and VLL guarantee that the multi-key operations are atomic, consistent, and highly efficient.&lt;/p&gt;
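&lt;p&gt;The single-key routing described above can be sketched in a few lines (an illustrative sketch assuming CRC32 and a fixed shard count; Dragonfly's actual hash function and shard count differ):&lt;/p&gt;

```python
# Minimal sketch of single-key routing in a shared-nothing design
# (illustrative: Dragonfly's actual hash and shard count differ).
import zlib

N_SHARDS = 8  # typically <= the number of logical CPU cores

def shard_for_key(key: str) -> int:
    """Map a key deterministically to the thread-shard that owns it."""
    return zlib.crc32(key.encode()) % N_SHARDS
```

&lt;p&gt;Because every thread computes the same shard for the same key, a &lt;code&gt;GET&lt;/code&gt; or &lt;code&gt;SET&lt;/code&gt; is always handled by the single thread that owns that key, and no locks are needed for single-key operations.&lt;/p&gt;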

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fefz7sont0xvm96y4o8ww.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fefz7sont0xvm96y4o8ww.png" alt="Search" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  How to Repeat the Benchmark Results
&lt;/h2&gt;

&lt;p&gt;As shown in the table and chart above, Dragonfly is able to reach 6.43 million ops/sec with 64 Dragonfly threads on a single instance,&lt;br&gt;
while maintaining a &lt;strong&gt;P50 latency of 0.3ms&lt;/strong&gt;, a &lt;strong&gt;P99 latency of 1.1ms&lt;/strong&gt;, and &lt;strong&gt;a P99.9 latency of 1.5ms&lt;/strong&gt;.&lt;br&gt;
The benchmark was conducted with multiple Dragonfly thread configurations and &lt;a href="https://github.com/RedisLabs/memtier_benchmark" rel="noopener noreferrer"&gt;&lt;code&gt;memtier_benchmark&lt;/code&gt;&lt;/a&gt; thread &amp;amp; connection configurations,&lt;br&gt;
and we take the maximum ops/sec value with a P99.9 latency of less than 10ms as the final result set.&lt;br&gt;
Here are the detailed steps to repeat the benchmark results:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use two AWS &lt;a href="https://instances.vantage.sh/aws/ec2/c7gn.16xlarge" rel="noopener noreferrer"&gt;&lt;code&gt;c7gn.16xlarge&lt;/code&gt;&lt;/a&gt; instances,
one for the server (i.e., Dragonfly) and one for the client (i.e., the &lt;code&gt;memtier_benchmark&lt;/code&gt; CLI).
The &lt;code&gt;c7gn.16xlarge&lt;/code&gt; instance has 64 vCPUs, 128GiB memory, and is equipped with the AWS Graviton3 processor.&lt;/li&gt;
&lt;li&gt;Make sure the network configuration between the two instances is correct, so that the client can connect to the server.&lt;/li&gt;
&lt;li&gt;For the Dragonfly server, use the following configuration arguments:

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;--proactor_threads=64&lt;/code&gt;, which specifies the number of Dragonfly threads.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Fill the Dragonfly server with 10 million keys to simulate a data store that is in use.
&lt;/li&gt;

&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;  dragonfly&lt;span class="nv"&gt;$&amp;gt;&lt;/span&gt; DEBUG POPULATE 10000000 key 550
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Use the following &lt;code&gt;memtier_benchmark&lt;/code&gt; command on the client instance to perform benchmark and collect results:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;  &lt;span class="nv"&gt;$&amp;gt;&lt;/span&gt; memtier_benchmark &lt;span class="nt"&gt;-p&lt;/span&gt; 6380 &lt;span class="nt"&gt;--ratio&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&amp;lt;get_set_ration&amp;gt; &lt;span class="se"&gt;\ &lt;/span&gt;&lt;span class="c"&gt;# 1:0 for GET, 0:1 for SET&lt;/span&gt;
     &lt;span class="nt"&gt;--hide-histogram&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
     &lt;span class="nt"&gt;--threads&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&amp;lt;number_of_client_threads&amp;gt; &lt;span class="se"&gt;\ &lt;/span&gt;&lt;span class="c"&gt;# 1,2,4,...,64&lt;/span&gt;
     &lt;span class="nt"&gt;--clients&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&amp;lt;number_of_client_connections&amp;gt; &lt;span class="se"&gt;\ &lt;/span&gt;&lt;span class="c"&gt;# 10, 25, 40, 50&lt;/span&gt;
     &lt;span class="nt"&gt;--requests&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;200000 &lt;span class="nt"&gt;--distinct-client-seed&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
     &lt;span class="nt"&gt;--data-size&lt;/span&gt; 256 &lt;span class="nt"&gt;--expiry-range&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;500-500 &lt;span class="se"&gt;\&lt;/span&gt;
     &lt;span class="nt"&gt;-s&lt;/span&gt; &amp;lt;host_of_dragonfly_server&amp;gt; &lt;span class="se"&gt;\&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can plug in different &lt;code&gt;--proactor_threads&lt;/code&gt; values for Dragonfly and various &lt;code&gt;--threads&lt;/code&gt; and &lt;code&gt;--clients&lt;/code&gt; values for &lt;code&gt;memtier_benchmark&lt;/code&gt; to repeat the benchmark.&lt;br&gt;
We reached 6.43 million ops/sec with 64 Dragonfly threads, 64 &lt;code&gt;memtier_benchmark&lt;/code&gt; threads, and 40 client connections.&lt;/p&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;In this blog post, we have discussed how Dragonfly can automatically and fully utilize the hardware it operates on by&lt;br&gt;
leveraging the thread-per-core model and shared-nothing architecture.&lt;br&gt;
With the latest benchmark results, Dragonfly again stakes its claim as the most performant in-memory data store on earth.&lt;br&gt;
Not just the number, but the overall design and architecture give developers even more confidence in Dragonfly's ability to handle the most demanding workloads.&lt;/p&gt;

&lt;p&gt;Feel free to check out our &lt;a href="https://github.com/dragonflydb/dragonfly" rel="noopener noreferrer"&gt;GitHub repository&lt;/a&gt;, &lt;a href="https://www.dragonflydb.io/docs" rel="noopener noreferrer"&gt;documentation&lt;/a&gt;,&lt;br&gt;
and get started with Dragonfly by &lt;a href="https://www.dragonflydb.io/docs/getting-started/docker" rel="noopener noreferrer"&gt;running it locally with Docker using one single command&lt;/a&gt;.&lt;br&gt;
Happy building, and stay tuned for more updates in 2024!&lt;/p&gt;

</description>
      <category>redis</category>
      <category>database</category>
      <category>aws</category>
    </item>
    <item>
      <title>Scaling Real-Time Leaderboards with Dragonfly</title>
      <dc:creator>Dragonfly</dc:creator>
      <pubDate>Fri, 19 Jan 2024 17:00:00 +0000</pubDate>
      <link>https://dev.to/dragonflydbio/scaling-real-time-leaderboards-with-dragonfly-2h4p</link>
      <guid>https://dev.to/dragonflydbio/scaling-real-time-leaderboards-with-dragonfly-2h4p</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;In today's digital age, leaderboards have become an integral part of many applications, providing a dynamic way to display user scores and rankings.&lt;br&gt;
To build gamification features for any application (e.g., games, educational platforms), leaderboards serve as a powerful tool to engage and motivate users.&lt;br&gt;
In this blog post, we're going to delve into the process of building a practical and realistic leaderboard system.&lt;/p&gt;

&lt;p&gt;Our journey will involve leveraging the capabilities of &lt;a href="https://github.com/dragonflydb/dragonfly" rel="noopener noreferrer"&gt;Dragonfly&lt;/a&gt;, a highly efficient drop-in replacement for Redis,&lt;br&gt;
known for its ultra-high throughput and multi-threaded shared-nothing architecture.&lt;br&gt;
Specifically, we'll be utilizing two of Dragonfly's data types: &lt;code&gt;Sorted-Set&lt;/code&gt; and &lt;code&gt;Hash&lt;/code&gt;.&lt;br&gt;
These data structures are perfect for handling real-time data and ranking systems, making them ideal for our leaderboards.&lt;/p&gt;

&lt;p&gt;Moreover, to ensure that our leaderboards are not just real-time but also persistent, we will be integrating a SQL database (PostgreSQL) into our system.&lt;br&gt;
This approach allows us to maintain a comprehensive record of user scores over different time frames.&lt;br&gt;
As a result, we'll be capable of showcasing three distinct types of leaderboards:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;An all-time leaderboard&lt;/strong&gt; that reflects overall user scores.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A current-week leaderboard&lt;/strong&gt; that captures the most recent user activities.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Leaderboards for previous weeks&lt;/strong&gt;, giving users insights into past trends and performances, potentially also providing rewards and prizes for top performers.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Through this implementation, we aim to demonstrate how Dragonfly, in conjunction with traditional SQL databases,&lt;br&gt;
can be utilized to create robust, scalable, and efficient leaderboard systems. So, let's dive in and start building!&lt;/p&gt;
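&lt;p&gt;Before wiring in Dragonfly itself, the &lt;code&gt;Sorted-Set&lt;/code&gt; semantics the leaderboards rely on can be sketched in plain Python (an illustrative stand-in; in the real system each of these is a single Dragonfly command):&lt;/p&gt;

```python
# Illustrative stand-in for the Sorted-Set operations the leaderboards use
# (ZINCRBY / ZREVRANGE WITHSCORES); each is a single command in Dragonfly.
class Leaderboard:
    def __init__(self):
        self.scores = {}                          # member -> score

    def zincrby(self, member, delta):
        """Add points to a member, creating it if absent (like ZINCRBY)."""
        self.scores[member] = self.scores.get(member, 0) + delta
        return self.scores[member]

    def zrevrange_withscores(self, start, stop):
        """Top members by score, highest first; `stop` is inclusive like Redis."""
        ranked = sorted(self.scores.items(), key=lambda kv: (-kv[1], kv[0]))
        return ranked[start:stop + 1]
```

&lt;p&gt;Against a real Dragonfly instance, &lt;code&gt;ZINCRBY leaderboard:all-time 50 user:42&lt;/code&gt; and &lt;code&gt;ZREVRANGE leaderboard:all-time 0 9 WITHSCORES&lt;/code&gt; play exactly these roles.&lt;/p&gt;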


&lt;h2&gt;
  
  
  Implementation
&lt;/h2&gt;
&lt;h3&gt;
  
  
  1. Database Schema
&lt;/h3&gt;

&lt;p&gt;In the implementation of our leaderboard system, a carefully designed SQL database schema plays a pivotal role.&lt;br&gt;
At the core of this schema is the &lt;code&gt;users&lt;/code&gt; table, which is essential for storing basic user information.&lt;br&gt;
This table includes fields like &lt;code&gt;id&lt;/code&gt; (a unique identifier for each user, automatically incremented as &lt;code&gt;BIGSERIAL&lt;/code&gt;),&lt;br&gt;
&lt;code&gt;email&lt;/code&gt; (a unique field to prevent duplicate registrations), &lt;code&gt;password&lt;/code&gt;, &lt;code&gt;username&lt;/code&gt;,&lt;br&gt;
and timestamps &lt;code&gt;created_at&lt;/code&gt; and &lt;code&gt;updated_at&lt;/code&gt; to track the creation and last update of each user record.&lt;br&gt;
Note that the &lt;code&gt;password&lt;/code&gt; field should store the hashed or encrypted version of the user's password for security purposes.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;IF&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;EXISTS&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;id&lt;/span&gt;         &lt;span class="n"&gt;BIGSERIAL&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;email&lt;/span&gt;      &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;255&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;UNIQUE&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;password&lt;/span&gt;   &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;255&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;        &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;username&lt;/span&gt;   &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;255&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;        &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="s1"&gt;''&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="n"&gt;TIMESTAMPTZ&lt;/span&gt;         &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="n"&gt;updated_at&lt;/span&gt; &lt;span class="n"&gt;TIMESTAMPTZ&lt;/span&gt;         &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, we have the &lt;code&gt;user_score_transactions&lt;/code&gt; table, which logs all score transactions for users.&lt;br&gt;
It consists of an &lt;code&gt;id&lt;/code&gt; as a unique transaction identifier, &lt;code&gt;user_id&lt;/code&gt; linking to the users table,&lt;br&gt;
&lt;code&gt;score_added&lt;/code&gt; representing the score change, &lt;code&gt;reason&lt;/code&gt; for the score change (such as winning a game or completing a task),&lt;br&gt;
and a &lt;code&gt;created_at&lt;/code&gt; timestamp for the transaction record.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;IF&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;EXISTS&lt;/span&gt; &lt;span class="n"&gt;user_score_transactions&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;id&lt;/span&gt;          &lt;span class="n"&gt;BIGSERIAL&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;user_id&lt;/span&gt;     &lt;span class="nb"&gt;BIGINT&lt;/span&gt;       &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="k"&gt;REFERENCES&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;score_added&lt;/span&gt; &lt;span class="nb"&gt;INT&lt;/span&gt;          &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;reason&lt;/span&gt;      &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;255&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;created_at&lt;/span&gt;  &lt;span class="n"&gt;TIMESTAMPTZ&lt;/span&gt;  &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Finally, the &lt;code&gt;user_total_scores&lt;/code&gt; table is dedicated to maintaining the cumulative scores of each user.&lt;br&gt;
It contains an &lt;code&gt;id&lt;/code&gt; for each record, &lt;code&gt;user_id&lt;/code&gt; to reference the users table, &lt;code&gt;total_score&lt;/code&gt; indicating the user's overall score,&lt;br&gt;
and an &lt;code&gt;updated_at&lt;/code&gt; timestamp for the last score update.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;IF&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;EXISTS&lt;/span&gt; &lt;span class="n"&gt;user_total_scores&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;id&lt;/span&gt;          &lt;span class="n"&gt;BIGSERIAL&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;user_id&lt;/span&gt;     &lt;span class="nb"&gt;BIGINT&lt;/span&gt;      &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="k"&gt;REFERENCES&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;total_score&lt;/span&gt; &lt;span class="nb"&gt;INT&lt;/span&gt;         &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;updated_at&lt;/span&gt;  &lt;span class="n"&gt;TIMESTAMPTZ&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This schema is particularly effective due to its emphasis on normalization, which reduces redundancy by segregating user&lt;br&gt;
information, score transactions, and total scores into distinct tables.&lt;br&gt;
It ensures scalability with the use of &lt;code&gt;BIGSERIAL&lt;/code&gt; and &lt;code&gt;BIGINT&lt;/code&gt; data types, accommodating a large volume of records.&lt;br&gt;
Additionally, the separate &lt;code&gt;user_score_transactions&lt;/code&gt; table offers valuable insights into the score history for each user,&lt;br&gt;
which is beneficial for analytics and audit trails. As we will see later, we will also create materialized views from this table to support leaderboards for previous weeks.&lt;br&gt;
By isolating the total scores in the &lt;code&gt;user_total_scores&lt;/code&gt; table, the system can swiftly access and update a user's total score, enhancing performance.&lt;br&gt;
This well-structured schema thus forms the backbone of our leaderboard system, supporting both real-time updates and a comprehensive score history.&lt;/p&gt;
&lt;h3&gt;
  
  
  2. Dragonfly Keys &amp;amp; Data Types
&lt;/h3&gt;

&lt;p&gt;With the database schema in place, we can now focus on the Dragonfly key-value pairs that will be used to store the leaderboard data.&lt;br&gt;
The &lt;code&gt;Sorted-Set&lt;/code&gt; data type is ideal for storing user scores and rankings, while the &lt;code&gt;Hash&lt;/code&gt; data type is perfect for storing user information that is needed for display purposes.&lt;br&gt;
Here are the keys and data types that we will be using:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;leaderboard:user_scores:all_time&lt;/code&gt; (Sorted-Set): Stores the user IDs and scores for the all-time leaderboard.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;leaderboard:user_scores:week_of_{monday_of_the_week}&lt;/code&gt; (Sorted-Set): Stores the user IDs and scores for a specific week.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;leaderboard:users:{user_id}&lt;/code&gt; (HASH): Stores the user information for a specific user.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;An example of the key space would look like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;dragonfly&lt;span class="nv"&gt;$&amp;gt;&lt;/span&gt; KEYS leaderboard:&lt;span class="k"&gt;*&lt;/span&gt;
1&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="s2"&gt;"leaderboard:user_scores:all_time"&lt;/span&gt;           &lt;span class="c"&gt;# Sorted-Set&lt;/span&gt;
2&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="s2"&gt;"leaderboard:user_scores:week_of_2024_01_15"&lt;/span&gt; &lt;span class="c"&gt;# Sorted-Set&lt;/span&gt;
3&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="s2"&gt;"leaderboard:users:1"&lt;/span&gt;                        &lt;span class="c"&gt;# Hash&lt;/span&gt;
4&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="s2"&gt;"leaderboard:users:2"&lt;/span&gt;                        &lt;span class="c"&gt;# Hash&lt;/span&gt;
5&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="s2"&gt;"leaderboard:users:3"&lt;/span&gt;                        &lt;span class="c"&gt;# Hash&lt;/span&gt;
6&lt;span class="o"&gt;)&lt;/span&gt; ...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. All-Time &amp;amp; Current-Week Leaderboards
&lt;/h3&gt;

&lt;p&gt;In the implementation of the all-time leaderboard and current-week leaderboard, we focus on how scores are updated for a user and how the top 100 users are queried from these leaderboards.&lt;/p&gt;

&lt;p&gt;To update scores, we first record the score transaction in the &lt;code&gt;user_score_transactions&lt;/code&gt; table and then update the &lt;code&gt;user_total_scores&lt;/code&gt; table.&lt;br&gt;
This operation should be wrapped in a database transaction to ensure data integrity.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;BEGIN&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Record score transaction for user with ID 1.&lt;/span&gt;
&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;user_score_transactions&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;score_added&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;reason&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'WINNING_A_GAME'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- Update total score for user with ID 1.&lt;/span&gt;
&lt;span class="k"&gt;UPDATE&lt;/span&gt; &lt;span class="n"&gt;user_total_scores&lt;/span&gt;
&lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;total_score&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;total_score&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;updated_at&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;COMMIT&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, we update the all-time leaderboard and current-week leaderboard in Dragonfly.&lt;br&gt;
Note that these operations should be pipelined to reduce the number of round-trips between the application and Dragonfly.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;dragonfly&lt;span class="nv"&gt;$&amp;gt;&lt;/span&gt; ZINCRBY leaderboard:user_scores:all_time 100 1
dragonfly&lt;span class="nv"&gt;$&amp;gt;&lt;/span&gt; ZINCRBY leaderboard:user_scores:week_of_2024_01_15 100 1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now that we have persisted the score change in the database and updated the values in Dragonfly as well,&lt;br&gt;
querying the top 100 users from a leaderboard (all-time or current-week) is straightforward: we use the &lt;code&gt;ZREVRANGE&lt;/code&gt; command&lt;br&gt;
to retrieve the top user IDs and scores from the &lt;code&gt;Sorted-Set&lt;/code&gt;, and then use &lt;code&gt;HGETALL&lt;/code&gt; commands to retrieve user details from the &lt;code&gt;Hash&lt;/code&gt; keys.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;dragonfly&lt;span class="nv"&gt;$&amp;gt;&lt;/span&gt; ZREVRANGE leaderboard:user_scores:all_time 0 99 WITHSCORES
 1&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="s2"&gt;"1"&lt;/span&gt;    &lt;span class="c"&gt;# user_id = 1&lt;/span&gt;
 2&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="s2"&gt;"1000"&lt;/span&gt; &lt;span class="c"&gt;# score for user_id = 1&lt;/span&gt;
 3&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="s2"&gt;"2"&lt;/span&gt;    &lt;span class="c"&gt;# user_id = 2&lt;/span&gt;
 4&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="s2"&gt;"900"&lt;/span&gt;  &lt;span class="c"&gt;# score for user_id = 2&lt;/span&gt;
 5&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="s2"&gt;"3"&lt;/span&gt;
 6&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="s2"&gt;"800"&lt;/span&gt;
 7&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="s2"&gt;"4"&lt;/span&gt;
 8&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="s2"&gt;"700"&lt;/span&gt;
 9&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="s2"&gt;"5"&lt;/span&gt;
10&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="s2"&gt;"600"&lt;/span&gt;
&lt;span class="c"&gt;# ...&lt;/span&gt;

dragonfly&lt;span class="nv"&gt;$&amp;gt;&lt;/span&gt; HGETALL leaderboard:users:1
dragonfly&lt;span class="nv"&gt;$&amp;gt;&lt;/span&gt; HGETALL leaderboard:users:2
dragonfly&lt;span class="nv"&gt;$&amp;gt;&lt;/span&gt; HGETALL leaderboard:users:3
dragonfly&lt;span class="nv"&gt;$&amp;gt;&lt;/span&gt; HGETALL leaderboard:users:4
dragonfly&lt;span class="nv"&gt;$&amp;gt;&lt;/span&gt; HGETALL leaderboard:users:5
&lt;span class="c"&gt;# ...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Depending on how many users are recorded in the &lt;code&gt;leaderboard:user_scores:all_time&lt;/code&gt; key,&lt;br&gt;
retrieving the top users takes one &lt;code&gt;ZREVRANGE&lt;/code&gt; command and up to 100 &lt;code&gt;HGETALL&lt;/code&gt; commands.&lt;br&gt;
This may sound like a lot of commands, but once again, we can pipeline them to reduce the number of round-trips between the application and Dragonfly:&lt;br&gt;
one round-trip retrieves the ranking, and a second pipelined round-trip retrieves all the user details, keeping the total response time within a few milliseconds.&lt;br&gt;
At the same time, we completely avoid querying the database for the top users, which would be a far more expensive operation.&lt;br&gt;
This is why we are confident in saying that Dragonfly provides a real-time experience for leaderboard retrieval.&lt;/p&gt;
&lt;h3&gt;
  
  
  4. Leaderboards for Previous Weeks
&lt;/h3&gt;

&lt;p&gt;For the implementation of leaderboards for previous weeks, we adopt a strategy that efficiently balances database querying with caching.&lt;br&gt;
The process involves two main steps: creating materialized views and leveraging Dragonfly's caching capabilities.&lt;/p&gt;

&lt;p&gt;We utilize the &lt;code&gt;user_score_transactions&lt;/code&gt; table to generate materialized views for each past week's leaderboard.&lt;br&gt;
Materialized views are essentially snapshots of the query results, stored for efficient access.&lt;br&gt;
These views are created by aggregating the scores from the &lt;code&gt;user_score_transactions&lt;/code&gt; table for each user over a specific week.&lt;br&gt;
An example SQL statement to create a materialized view for a specific week might look like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="n"&gt;MATERIALIZED&lt;/span&gt; &lt;span class="k"&gt;VIEW&lt;/span&gt; &lt;span class="n"&gt;leaderboard_week_of_2024_01_15&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;username&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ust&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;score_added&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;weekly_score&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;user_score_transactions&lt;/span&gt; &lt;span class="n"&gt;ust&lt;/span&gt;
         &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;ust&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;ust&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="k"&gt;BETWEEN&lt;/span&gt; &lt;span class="s1"&gt;'2024-01-15 00:00:00'&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="s1"&gt;'2024-01-21 23:59:59'&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;weekly_score&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once the materialized view for a week's leaderboard is created, we can cache its results in Dragonfly to facilitate quick retrieval.&lt;br&gt;
We utilize Dragonfly's &lt;code&gt;String&lt;/code&gt; data type to store the serialized form of the leaderboard, which can be in JSON, XML, or any other format.&lt;br&gt;
This works because past leaderboards can no longer change, and their order is preserved in the materialized view, so we can simply cache the results as-is.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;leaderboard_week_of_2024_01_15&lt;/span&gt; &lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;dragonfly&lt;span class="nv"&gt;$&amp;gt;&lt;/span&gt; SET leaderboard:cache_top_100:week_of_2024_01_15 &lt;span class="s1"&gt;'serialized_leaderboard_data'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
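&lt;p&gt;Before issuing the &lt;code&gt;SET&lt;/code&gt; command above, the query results need to be serialized. A minimal Go sketch using JSON follows; the row struct and its field names are our own, mirroring the columns of the materialized view:&lt;/p&gt;

```go
package main

import (
	"encoding/json"
	"fmt"
)

// LeaderboardRow mirrors one row of the weekly materialized view.
type LeaderboardRow struct {
	UserID      int64  `json:"user_id"`
	Username    string `json:"username"`
	WeeklyScore int    `json:"weekly_score"`
}

// Serialize renders the top rows as the JSON payload to store under the
// weekly leaderboard:cache_top_100:... String key.
func Serialize(rows []LeaderboardRow) (string, error) {
	b, err := json.Marshal(rows)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	payload, _ := Serialize([]LeaderboardRow{{UserID: 1, Username: "alice", WeeklyScore: 1000}})
	fmt.Println(payload) // [{"user_id":1,"username":"alice","weekly_score":1000}]
}
```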






&lt;h2&gt;
  
  
  Other Considerations
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Calculating the Start of the Week
&lt;/h3&gt;

&lt;p&gt;For the weekly leaderboards, it's essential to have a consistent method to determine the start of each week, commonly set as Monday.&lt;br&gt;
This calculation is vital because it impacts both the naming conventions of keys in Dragonfly and the logic for creating and refreshing materialized views in the database.&lt;br&gt;
Implementing helper methods in the application code that accurately calculate the Monday of any given week is necessary.&lt;br&gt;
This consistency ensures that both the database views and the Dragonfly keys are synchronized in terms of the time periods they represent.&lt;br&gt;
Such an implementation in Go might look like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// MondayOfTime returns the Monday of the week of the given time.&lt;/span&gt;
&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;MondayOfTime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ts&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Time&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Time&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;tt&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;ts&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;UTC&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;weekday&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;tt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Weekday&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;weekday&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Monday&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;tt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Truncate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;24&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Hour&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;daysToSubtract&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;weekday&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Monday&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="m"&gt;7&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="m"&gt;7&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;tt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AddDate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;daysToSubtract&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Truncate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;24&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Hour&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;// MondayOfTimeStr returns the Monday of the week of the given time in string format.&lt;/span&gt;
&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;MondayOfTimeStr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ts&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Time&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;MondayOfTime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ts&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"2006_01_02"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Management of Dragonfly Keys
&lt;/h3&gt;

&lt;p&gt;The all-time leaderboard data, represented by a &lt;code&gt;Sorted-Set&lt;/code&gt; key in Dragonfly, is a long-term data set that can be kept indefinitely.&lt;br&gt;
This key does not require an expiration as it continuously accumulates user scores over time.&lt;/p&gt;

&lt;p&gt;Conversely, the current-week &lt;code&gt;Sorted-Set&lt;/code&gt; key in Dragonfly should be managed with an expiration policy.&lt;br&gt;
Setting an expiry time point for this key, preferably at the beginning of the next week, ensures that the data does not become stale and reflects only the current week's scores.&lt;br&gt;
This practice helps in maintaining the relevance and accuracy of the current-week leaderboard.&lt;/p&gt;

&lt;p&gt;And finally, the user-detail &lt;code&gt;Hash&lt;/code&gt; keys in Dragonfly, shared across all-time and current-week leaderboards, can also be kept indefinitely.&lt;br&gt;
However, it's crucial to keep the data in these user-detail &lt;code&gt;Hash&lt;/code&gt; keys up-to-date with the corresponding records in the database.&lt;br&gt;
Whenever a user's details change in the database, these changes should be promptly reflected in the Hash keys in Dragonfly.&lt;br&gt;
This synchronization ensures that the leaderboards always display the most current and accurate user information.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Key Naming Conventions
&lt;/h3&gt;

&lt;p&gt;It's important to adopt a clear and distinct naming convention for different types of data stored in Dragonfly.&lt;br&gt;
Specifically, the key names for the current-week &lt;code&gt;Sorted-Set&lt;/code&gt; and the cached materialized view (&lt;code&gt;String&lt;/code&gt; data type) should be different to prevent confusion.&lt;br&gt;
A clear naming strategy helps avoid accidental operations on the wrong Dragonfly data type.&lt;/p&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;In this blog, we explored how Dragonfly can be used in conjunction with a SQL database to build a robust and efficient leaderboard system for gaming and other applications.&lt;br&gt;
We discussed various data types and techniques that can be easily utilized to create real-time leaderboards with minimal update and retrieval latency.&lt;/p&gt;

&lt;p&gt;We have a recorded workshop session, "Scaling Real-Time Leaderboards", that you can watch &lt;a href="https://www.dragonflydb.io/events/2023-12-06-workshop-scaling-real-time-leaderboards" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;br&gt;
Code snippets in this blog post can be found in the Dragonfly &lt;a href="https://github.com/dragonflydb/dragonfly-examples" rel="noopener noreferrer"&gt;examples repository&lt;/a&gt;.&lt;br&gt;
Finally, we encourage you to &lt;a href="https://www.dragonflydb.io/docs/getting-started" rel="noopener noreferrer"&gt;try Dragonfly out for yourself&lt;/a&gt;,&lt;br&gt;
experience its capabilities firsthand, and build amazing applications with it!&lt;/p&gt;

</description>
      <category>database</category>
      <category>gamedev</category>
      <category>postgresql</category>
    </item>
    <item>
      <title>Dragonfly 2023 in Review and Exciting Glimpses of 2024</title>
      <dc:creator>Dragonfly</dc:creator>
      <pubDate>Wed, 03 Jan 2024 17:00:00 +0000</pubDate>
      <link>https://dev.to/dragonflydbio/dragonfly-2023-in-review-and-exciting-glimpses-of-2024-3o68</link>
      <guid>https://dev.to/dragonflydbio/dragonfly-2023-in-review-and-exciting-glimpses-of-2024-3o68</guid>
      <description>&lt;p&gt;As we edge closer to the start of a new year, we would like to express our sincere appreciation for the exceptional community that has been instrumental in shaping Dragonfly.&lt;br&gt;
Your consistent support and constructive feedback motivate us to strive for greater achievements, and we look forward to navigating 2024 and beyond together.&lt;/p&gt;

&lt;p&gt;Our commitment to the development of open features that serve the Redis community and the wider developer ecosystem remains firm.&lt;br&gt;
We welcome and prioritize community-driven feature requests and bug fixes, recognizing the invaluable contributions that&lt;br&gt;
help us achieve the goal of building the most efficient in-memory data store, designed with developers' needs in mind.&lt;/p&gt;




&lt;h2&gt;
  
  
  Looking Back at 2023
&lt;/h2&gt;

&lt;p&gt;We released 19 new versions of Dragonfly in 2023, each packed with new features, bug fixes, and performance improvements.&lt;br&gt;
Back in March, we hit our first big milestone with the release of &lt;a href="https://www.dragonflydb.io/blog/dragonfly-production-ready" rel="noopener noreferrer"&gt;Dragonfly v1.0&lt;/a&gt;,&lt;br&gt;
demonstrating our commitment to delivering a production-ready database that is ready to scale with your applications.&lt;br&gt;
Besides the release of v1.0, there are a few more highlights from the past year that we'd like to share with you.&lt;/p&gt;

&lt;h3&gt;
  
  
  Lightning-Fast Performance and High Availability
&lt;/h3&gt;

&lt;p&gt;We ignited a performance revolution early in the year.&lt;br&gt;
Dive into our blog post on &lt;a href="https://www.dragonflydb.io/blog/scaling-performance-redis-vs-dragonfly" rel="noopener noreferrer"&gt;turbocharging Dragonfly&lt;/a&gt; to explore how we designed Dragonfly to&lt;br&gt;
deliver an exhilarating experience with lightning-fast response times and tremendously high throughput.&lt;br&gt;
In the meantime, we also introduced snapshots and replications to ensure data durability and &lt;a href="https://www.dragonflydb.io/blog/replication-for-high-availability" rel="noopener noreferrer"&gt;high availability&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Seamless Integrations with Popular Frameworks
&lt;/h3&gt;

&lt;p&gt;Connecting to the broad Redis community was a major objective for us this year.&lt;br&gt;
It serves our quest for compatibility and demonstrates how simple it is to scale and boost performance with Dragonfly.&lt;br&gt;
Look back at the ways we grew over the past year through &lt;a href="https://www.dragonflydb.io/blog/running-bullmq-with-dragonfly-part-1-announcement" rel="noopener noreferrer"&gt;integrations with popular frameworks such as BullMQ&lt;/a&gt;.&lt;br&gt;
BullMQ is a popular, robust, and fast Node.js library for creating and processing background jobs that uses Redis as a backend.&lt;br&gt;
With high compatibility, Dragonfly can be used as a backend data store for BullMQ with no code changes.&lt;br&gt;
Beyond this, we also &lt;a href="https://www.dragonflydb.io/blog/running-bullmq-with-dragonfly-part-2-optimization" rel="noopener noreferrer"&gt;optimized Dragonfly for BullMQ&lt;/a&gt;,&lt;br&gt;
achieving an exceptional 30x throughput increase from the baseline.&lt;/p&gt;

&lt;p&gt;Our latest integration looked at how we &lt;a href="https://www.dragonflydb.io/blog/using-laravel-with-dragonfly-part-01-cache-and-queue" rel="noopener noreferrer"&gt;integrated with Laravel&lt;/a&gt;,&lt;br&gt;
one of the most popular web frameworks not just in PHP but across all languages, as the cache, session, and queue backend.&lt;br&gt;
This again highlights Dragonfly's compatibility.&lt;br&gt;
By simply changing the Redis connection string, you can use Dragonfly as a drop-in replacement for Redis in Laravel,&lt;br&gt;
instantly benefiting from its high performance and throughput, low memory usage, and efficient snapshotting.&lt;/p&gt;

&lt;h3&gt;
  
  
  Cloud-Native Deployment and Management
&lt;/h3&gt;

&lt;p&gt;The &lt;a href="https://www.dragonflydb.io/blog/announcing-kubernetes-operator-general-availability" rel="noopener noreferrer"&gt;Dragonfly Kubernetes Operator&lt;/a&gt;&lt;br&gt;
release makes it extremely simple to deploy Dragonfly in Kubernetes environments.&lt;br&gt;
This release offers benefits such as high availability with custom failover strategies, direct snapshots to cloud storage (e.g., S3),&lt;br&gt;
seamless integration with cloud-native monitoring tools (e.g., Prometheus and Grafana), and more.&lt;/p&gt;

&lt;h3&gt;
  
  
  AI Revolution
&lt;/h3&gt;

&lt;p&gt;2023 marked the beginning of the AI revolution, with GPT-4 and other large language models poised to shape the future of our lives.&lt;br&gt;
To connect businesses to the revolution, we &lt;a href="https://www.dragonflydb.io/blog/announcing-dragonfly-search" rel="noopener noreferrer"&gt;introduced Dragonfly Search&lt;/a&gt;,&lt;br&gt;
which allows the storage, retrieval, and searching of AI-generated embeddings utilizing vector similarity search (VSS) capabilities.&lt;/p&gt;
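To illustrate the ranking idea behind vector similarity search, here is a minimal, hypothetical Python sketch of cosine similarity over toy embeddings; it shows the math only and is not Dragonfly's implementation.

```python
from math import sqrt

def cosine_similarity(a, b):
    # Cosine similarity: dot(a, b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def nearest(query, docs):
    # Return the document id whose embedding is most similar to the query.
    return max(docs, key=lambda doc_id: cosine_similarity(query, docs[doc_id]))

# Toy 3-dimensional "embeddings" standing in for model output.
docs = {
    "doc:1": [0.9, 0.1, 0.0],
    "doc:2": [0.0, 1.0, 0.2],
    "doc:3": [0.7, 0.6, 0.1],
}
print(nearest([1.0, 0.0, 0.0], docs))  # doc:1 points mostly along the same axis
```

A vector search engine performs this kind of comparison at scale, typically with approximate indexes rather than the exhaustive scan above.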




&lt;h2&gt;
  
  
  A Glimpse of Dragonfly in 2024
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Dragonfly Cloud
&lt;/h3&gt;

&lt;p&gt;Exciting developments await in the cloud!&lt;br&gt;
We will continue to add features and functionality to &lt;a href="https://www.dragonflydb.io/cloud" rel="noopener noreferrer"&gt;Dragonfly Cloud&lt;/a&gt; with the goal of giving developers the most scalable&lt;br&gt;
and cost-effective managed Dragonfly service across multiple cloud providers.&lt;br&gt;
Currently, Dragonfly Cloud is onboarding customers on a waitlist basis.&lt;br&gt;
We aim to open up this highly anticipated service to the public in 2024.&lt;/p&gt;

&lt;h3&gt;
  
  
  Hardware Efficiency and Persistence
&lt;/h3&gt;

&lt;p&gt;We designed Dragonfly's multi-threaded architecture to utilize modern cloud hardware far more efficiently than legacy solutions.&lt;br&gt;
Recent advancements in SSD performance allow us to take efficiency to new levels.&lt;br&gt;
We plan to deliver a solution where data is seamlessly tiered between memory and SSD while still preserving sub-millisecond latency guarantees.&lt;br&gt;
This will allow developers great flexibility while dramatically reducing the associated costs of maintaining their Dragonfly instances.&lt;/p&gt;

&lt;h3&gt;
  
  
  Community-Centric Development
&lt;/h3&gt;

&lt;p&gt;In 2024, we're launching an array of community initiatives, including office hours on the second Wednesday of each month&lt;br&gt;
and online workshops to help you learn and get up and running quickly.&lt;br&gt;
We want to hear your feedback, answer your questions, and collaborate on making Dragonfly the best it can be.&lt;/p&gt;

&lt;h3&gt;
  
  
  Meet Us at Events Near You!
&lt;/h3&gt;

&lt;p&gt;Dragonfly is taking to the skies in 2024, with a series of events around the globe.&lt;br&gt;
Stay tuned for event announcements on our &lt;a href="https://www.dragonflydb.io/events" rel="noopener noreferrer"&gt;events page&lt;/a&gt;,&lt;br&gt;
and let's make 2024 a year of in-person connections, insights, and shared enthusiasm.&lt;/p&gt;




&lt;h2&gt;
  
  
  Thank You
&lt;/h2&gt;

&lt;p&gt;As we stand at the threshold of a new year, we're filled with gratitude for the amazing community that has made Dragonfly what it is today.&lt;br&gt;
Your support and feedback drive us to reach new heights.&lt;br&gt;
We can't wait to continue on this thrilling journey with the goal of building the most efficient in-memory data store in the universe.&lt;br&gt;
Here's to innovating faster and building a future where Dragonfly continues to empower developers worldwide.&lt;/p&gt;

&lt;p&gt;Farewell to 2023. Now, let's fasten our seatbelts and embark on the journey to make 2024 another year to remember!&lt;/p&gt;

</description>
      <category>database</category>
      <category>redis</category>
    </item>
    <item>
      <title>Using Laravel with Dragonfly</title>
      <dc:creator>Dragonfly</dc:creator>
      <pubDate>Mon, 18 Dec 2023 16:54:30 +0000</pubDate>
      <link>https://dev.to/dragonflydbio/using-laravel-with-dragonfly-1p0</link>
      <guid>https://dev.to/dragonflydbio/using-laravel-with-dragonfly-1p0</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Dragonfly is a drop-in Redis replacement that delivers far better performance with far fewer servers.&lt;br&gt;
A single node can handle millions of queries per second and up to 1TB of in-memory data.&lt;br&gt;
In this blog post, we will explore how to use Dragonfly with Laravel, one of the most widely used and well-known web frameworks.&lt;/p&gt;

&lt;p&gt;Dragonfly maintains full compatibility with the Redis interface,&lt;br&gt;
meaning Laravel developers can integrate it as a cache and queue driver without a single line of code change.&lt;br&gt;
This seamless integration offers an effortless upgrade path with substantial benefits.&lt;/p&gt;

&lt;p&gt;So, whether you are a seasoned Laravel veteran or just starting out, join us as we step into the world of Dragonfly and Laravel.&lt;/p&gt;


&lt;h2&gt;
  
  
  Getting Started
&lt;/h2&gt;

&lt;p&gt;Let's start by setting up a new Dragonfly instance.&lt;br&gt;
Visit our documentation &lt;a href="https://www.dragonflydb.io/docs/getting-started" rel="noopener noreferrer"&gt;here&lt;/a&gt; to download an image or the binary and have a Dragonfly instance up and running in no time.&lt;br&gt;
Once the Dragonfly instance is operational and reachable, integrating it with your Laravel project is a breeze.&lt;br&gt;
Luckily, Laravel already has full support for Redis, so all of its drivers can be reused.&lt;br&gt;
To use Dragonfly in your Laravel application, start by updating the &lt;code&gt;.env&lt;/code&gt; file with the following configurations.&lt;/p&gt;

&lt;p&gt;For caching and session management:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;CACHE_DRIVER&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;redis
&lt;span class="nv"&gt;SESSION_DRIVER&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;redis
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To integrate Dragonfly as the queue driver as well:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;QUEUE_CONNECTION&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;redis
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Even though we are using &lt;code&gt;redis&lt;/code&gt; as the driver value, Dragonfly is designed to be a direct replacement for Redis, so no additional driver installation is required.&lt;br&gt;
With the driver set, the next step is to ensure Laravel can communicate with the Dragonfly instance.&lt;br&gt;
This involves updating the &lt;code&gt;.env&lt;/code&gt; file again with the correct connection details:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;REDIS_HOST&lt;/code&gt;: The hostname or IP address of the Dragonfly server.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;REDIS_PORT&lt;/code&gt;: The port on which the Dragonfly instance is running.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;REDIS_PASSWORD&lt;/code&gt;: The password for the Dragonfly instance, if set.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Here's an example configuration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;REDIS_HOST&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;127.0.0.1 &lt;span class="c"&gt;# Replace with Dragonfly host&lt;/span&gt;
&lt;span class="nv"&gt;REDIS_PORT&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;6379      &lt;span class="c"&gt;# Replace with Dragonfly port&lt;/span&gt;
&lt;span class="nv"&gt;REDIS_PASSWORD&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;null  &lt;span class="c"&gt;# Replace with Dragonfly password if applicable&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After updating these settings, verify the connection by running a simple operation like &lt;code&gt;INFO&lt;/code&gt; in Laravel.&lt;br&gt;
If you encounter any connectivity issues, double-check the host, port, and password values.&lt;br&gt;
Also, ensure that the Dragonfly server is running and accessible from your Laravel application's environment.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Illuminate\Support\Facades\Redis&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;// Run the INFO command and print the Dragonfly version.&lt;/span&gt;
&lt;span class="nc"&gt;Redis&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;()[&lt;/span&gt;&lt;span class="s2"&gt;"dragonfly_version"&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Higher Efficiency as a Cache
&lt;/h2&gt;

&lt;p&gt;Caching commonly accessed values is one of the primary uses of in-memory databases like Dragonfly and Redis due to their fast response times.&lt;br&gt;
This is where Dragonfly shines, especially in scenarios involving a large number of keys and clients, as is typical for a central cache shared by multi-node systems or microservices.&lt;/p&gt;

&lt;p&gt;Dragonfly's prowess is not just in its speed but also in its intelligent memory management.&lt;br&gt;
A standout feature is the &lt;strong&gt;cache mode&lt;/strong&gt;, designed specifically for scenarios where maintaining a lean memory footprint is as crucial as performance.&lt;br&gt;
In this mode, Dragonfly smartly evicts the least recently accessed values when it detects low memory availability, ensuring efficient memory usage without sacrificing speed.&lt;br&gt;
You can read more about the eviction algorithm in the &lt;a href="https://www.dragonflydb.io/blog/dragonfly-cache-design" rel="noopener noreferrer"&gt;Dragonfly Cache Design blog post&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Activating the cache mode is straightforward.&lt;br&gt;
Here are the flags you would use to run Dragonfly in this mode, with a memory cap of 12GB:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;./dragonfly &lt;span class="nt"&gt;--cache_mode&lt;/span&gt; &lt;span class="nt"&gt;--maxmemory&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;12G
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Consider a scenario where your application needs to handle a high volume of requests with a vast dataset.&lt;br&gt;
In such cases, the Dragonfly cache mode can efficiently manage memory usage while providing rapid access to data, ensuring your application remains responsive and agile.&lt;/p&gt;

&lt;p&gt;API-wise, all functionality of the Laravel Cache facade should be supported.&lt;br&gt;
For example, to store a given key and value with a specific expiration time, the following snippet can be used:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Illuminate\Support\Facades\Cache&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;// Store a value with a 10 minute expiration time.&lt;/span&gt;
&lt;span class="nc"&gt;Cache&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;put&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"key"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"value"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;600&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Memory Usage
&lt;/h3&gt;

&lt;p&gt;One of the benefits of using Dragonfly as a cache is its &lt;strong&gt;measurably lower memory usage&lt;/strong&gt; for most use cases.&lt;br&gt;
Let's conduct a simple experiment and fill both Redis and Dragonfly with random strings, measuring their total memory usage after filling them with data.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dataset&lt;/th&gt;
&lt;th&gt;Dragonfly&lt;/th&gt;
&lt;th&gt;Redis&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;3 Million Values of Length 1000&lt;/td&gt;
&lt;td&gt;2.75GB&lt;/td&gt;
&lt;td&gt;3.17GB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;15 Million Values of Length 200&lt;/td&gt;
&lt;td&gt;3.8GB&lt;/td&gt;
&lt;td&gt;4.6GB&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;After conducting the experiment, we've observed that Dragonfly's memory usage is 13% to 17% lower than Redis under similar conditions.&lt;br&gt;
This allows you to store significantly more useful data with the same memory requirements, making the cache more efficient and achieving higher coverage.&lt;br&gt;
You can read more about Dragonfly throughput benchmarks and memory usage in the &lt;a href="https://www.dragonflydb.io/blog/scaling-performance-redis-vs-dragonfly" rel="noopener noreferrer"&gt;Redis vs. Dragonfly Scalability and Performance blog post&lt;/a&gt;.&lt;/p&gt;
&lt;h3&gt;
  
  
  Snapshotting
&lt;/h3&gt;

&lt;p&gt;Beyond just lower memory usage, Dragonfly also demonstrates remarkable stability during snapshotting processes.&lt;br&gt;
Snapshotting, particularly in busy instances, can be a challenge in terms of memory management.&lt;br&gt;
With Redis, capturing a snapshot on a highly active instance might lead to increased memory usage.&lt;br&gt;
This happens because Redis needs to copy memory pages, even those that have only been partially overwritten, resulting in a spike in memory usage.&lt;/p&gt;

&lt;p&gt;Dragonfly, in contrast, takes a more adaptive approach to snapshotting.&lt;br&gt;
It intelligently adjusts the order of snapshotting based on incoming requests, effectively preventing any unexpected surges in memory usage.&lt;br&gt;
This means that even during intensive operations like snapshotting, Dragonfly maintains a stable memory footprint, ensuring consistent performance without the risk of sudden memory spikes.&lt;br&gt;
You can read more about the Dragonfly snapshotting algorithm in the &lt;a href="https://www.dragonflydb.io/blog/balanced-vs-unbalanced" rel="noopener noreferrer"&gt;Balanced vs. Unbalanced blog post&lt;/a&gt;.&lt;/p&gt;
&lt;h3&gt;
  
  
  Key Stickiness
&lt;/h3&gt;

&lt;p&gt;Dragonfly also introduces a new feature with its custom &lt;a href="https://www.dragonflydb.io/docs/command-reference/generic/stick" rel="noopener noreferrer"&gt;&lt;code&gt;STICK&lt;/code&gt;&lt;/a&gt; command.&lt;br&gt;
This command is particularly useful in instances running in cache mode.&lt;br&gt;
&lt;strong&gt;It enables specific keys to be marked as non-evicting, irrespective of their access frequency.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This functionality is especially handy for storing seldom-accessed yet important data.&lt;br&gt;
For example, you can reliably keep auxiliary information, like dynamic configuration values, directly on your Dragonfly instance.&lt;br&gt;
This eliminates the need for a separate datastore for infrequently used but crucial data, streamlining your data management process.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Storing a value in the Dragonfly instance with stickiness.&lt;/span&gt;
&lt;span class="nc"&gt;Redis&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;transaction&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;Redis&lt;/span&gt; &lt;span class="nv"&gt;$redis&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nv"&gt;$redis&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'server-dynamic-configuration-key'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'...'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nv"&gt;$redis&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;command&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'STICK'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'server-dynamic-configuration-key'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="c1"&gt;// ...&lt;/span&gt;

&lt;span class="c1"&gt;// Will always return a value since the key cannot be evicted.&lt;/span&gt;
&lt;span class="nv"&gt;$redis&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'server-dynamic-configuration-key'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Enhanced Throughput in Queue Management
&lt;/h2&gt;

&lt;p&gt;Dragonfly, much like Redis, is adept at managing queues and jobs.&lt;br&gt;
As you might have already guessed, the transition to using Dragonfly for this purpose is seamless, requiring no code modifications.&lt;br&gt;
Consider the following example in Laravel, where a podcast processing job is dispatched:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;App\Jobs\ProcessPodcast&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="nv"&gt;$podcast&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Podcast&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="cm"&gt;/* ... */&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nc"&gt;ProcessPodcast&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;dispatchSync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$podcast&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Both Dragonfly and Redis are capable of handling tens of thousands of jobs per second with ease.&lt;/p&gt;

&lt;p&gt;For those aiming to maximize performance, it's important to note that using a single job queue won't yield significant performance gains.&lt;br&gt;
To truly leverage Dragonfly's capabilities, multiple queues should be utilized.&lt;br&gt;
This approach distributes the load across multiple Dragonfly threads, enhancing overall throughput.&lt;/p&gt;

&lt;p&gt;However, a common challenge arises when keys from the same queue end up on different threads, leading to increased latency.&lt;br&gt;
To counter this, Dragonfly offers the use of &lt;a href="https://redis.io/docs/reference/cluster-spec/#hash-tags" rel="noopener noreferrer"&gt;hashtags&lt;/a&gt; in queue names.&lt;br&gt;
These hashtags ensure that jobs in the same queue (which uses the same hashtag) are automatically assigned to specific threads,&lt;br&gt;
much like in a Redis Cluster environment, thereby reducing latency and optimizing performance.&lt;br&gt;
To learn more about hashtags, check out the &lt;a href="https://www.dragonflydb.io/blog/running-bullmq-with-dragonfly-part-1-announcement" rel="noopener noreferrer"&gt;Running BullMQ with Dragonfly blog post&lt;/a&gt;,&lt;br&gt;
which has a detailed explanation of hashtags and their benefits when Dragonfly is used as a backing store for message queue systems.&lt;/p&gt;
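To build intuition for why hashtags keep a queue on one thread, here is a minimal, hypothetical Python sketch of the hash-slot rule from the Redis Cluster specification (not Dragonfly's actual code): only the substring inside the braces is hashed, so every key sharing a hashtag maps to the same slot.

```python
def crc16_xmodem(data: bytes) -> int:
    # CRC16-CCITT (XMODEM), the checksum Redis Cluster uses for key slots.
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) if crc & 0x8000 else (crc << 1)
            crc &= 0xFFFF
    return crc

def hash_slot(key: str) -> int:
    # If the key contains a non-empty {...} section, only that part is hashed.
    start = key.find("{")
    if start != -1:
        end = key.find("}", start + 1)
        if end != -1 and end != start + 1:
            key = key[start + 1:end]
    return crc16_xmodem(key.encode()) % 16384

# Jobs on the same hashtagged queue always map to the same slot:
print(hash_slot("{podcast_queue}:pending") == hash_slot("{podcast_queue}:active"))  # True
```

Because all keys of a hashtagged queue share one slot, Dragonfly can pin them to a single thread while spreading differently tagged queues across the remaining threads.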

&lt;p&gt;As a quick example, to optimize your queue management with Dragonfly, start by launching Dragonfly with specific flags that enable hashtag-based locking and emulated cluster mode:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;./dragonfly &lt;span class="nt"&gt;--lock_on_hashtags&lt;/span&gt; &lt;span class="nt"&gt;--cluster_mode&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;emulated
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once Dragonfly is running with these settings, incorporate hashtags into your queue names in Laravel. Here's an example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="nc"&gt;ProcessPodcast&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;dispatch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$podcast&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;onQueue&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'{podcast_queue}'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;By using hashtags in queue names, you ensure that all messages belonging to the same queue are processed by the same thread in Dragonfly.&lt;br&gt;
This approach not only keeps related messages together, enhancing efficiency, but also allows Dragonfly to maximize throughput by distributing different queues across multiple threads.&lt;/p&gt;

&lt;p&gt;This method is particularly effective for systems that rely on Dragonfly as a message queue backing store,&lt;br&gt;
as it leverages Dragonfly's multi-threaded architecture to handle a higher volume of messages more efficiently.&lt;/p&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Dragonfly emerges as a powerful and efficient alternative to Redis.&lt;br&gt;
Its ability to handle massive workloads with lower memory usage and its multi-threaded architecture make it a compelling choice for modern web applications.&lt;br&gt;
Throughout this post, we've explored how Dragonfly seamlessly integrates with Laravel, &lt;strong&gt;requiring minimal to no code changes&lt;/strong&gt;, whether it's for &lt;strong&gt;caching&lt;/strong&gt;, &lt;strong&gt;session management&lt;/strong&gt;, or &lt;strong&gt;queue management&lt;/strong&gt;.&lt;br&gt;
The unique features like cache mode, key stickiness, and hashtag-based thread balancing further illustrate Dragonfly's innovative approach as an in-memory data store.&lt;/p&gt;

&lt;p&gt;And as always, we encourage you to &lt;a href="https://www.dragonflydb.io/docs/getting-started" rel="noopener noreferrer"&gt;get started&lt;/a&gt; with Dragonfly within just a few minutes.&lt;br&gt;
Subscribe to our newsletter below and get connected with us on &lt;a href="https://github.com/dragonflydb/dragonfly" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt; and &lt;a href="https://discord.gg/HsPjXGVH85" rel="noopener noreferrer"&gt;Discord&lt;/a&gt; to stay up-to-date with the latest developments.&lt;/p&gt;

</description>
      <category>database</category>
      <category>redis</category>
      <category>laravel</category>
    </item>
    <item>
      <title>Announcing Dragonfly Search</title>
      <dc:creator>Dragonfly</dc:creator>
      <pubDate>Thu, 07 Dec 2023 17:00:00 +0000</pubDate>
      <link>https://dev.to/dragonflydbio/announcing-dragonfly-search-4c72</link>
      <guid>https://dev.to/dragonflydbio/announcing-dragonfly-search-4c72</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;2023 has been a year with remarkable advancements in AI capabilities, and at Dragonfly, we are thrilled to power new use cases with our latest release: Dragonfly Search.&lt;br&gt;
This new feature set, debuting in &lt;a href="https://github.com/dragonflydb/dragonfly/releases/tag/v1.13.0" rel="noopener noreferrer"&gt;Dragonfly v1.13&lt;/a&gt;, is a &lt;a href="https://www.dragonflydb.io/docs/command-reference/search/ft.search" rel="noopener noreferrer"&gt;subset of RediSearch-compatible commands&lt;/a&gt; implemented natively in Dragonfly,&lt;br&gt;
allowing for both &lt;em&gt;vector search&lt;/em&gt; and &lt;em&gt;faceted search&lt;/em&gt; use cases in the highly scalable and performant Dragonfly in-memory data store.&lt;/p&gt;

&lt;p&gt;In this post, we will guide you through building a simple recommendation system utilizing OpenAI's embeddings in conjunction with Dragonfly's vector search capabilities.&lt;br&gt;
Additionally, we'll explore how Dragonfly can serve as a versatile document store, demonstrating its flexibility and efficiency in handling diverse data management tasks.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Dragonfly Search is being released in Beta.&lt;/strong&gt;&lt;br&gt;
We are excited about its development and future potential, but we do not encourage its use in production environments at the time of this writing.&lt;br&gt;
Your feedback is immensely valuable to us, and it plays a critical role in shaping and improving Dragonfly Search as we progress towards a more stable version.&lt;br&gt;
If you have any feedback, please create a &lt;a href="https://github.com/dragonflydb/dragonfly/issues" rel="noopener noreferrer"&gt;GitHub issue&lt;/a&gt; or drop us a line in &lt;a href="https://discord.com/invite/HsPjXGVH85" rel="noopener noreferrer"&gt;Discord&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If you want to learn more about Dragonfly Search, &lt;a href="https://www.dragonflydb.io/office-hours-2023-12-13" rel="noopener noreferrer"&gt;please register for our Community Office Hours&lt;/a&gt;,&lt;br&gt;
where the team will give a technical presentation and take questions.&lt;/p&gt;


&lt;h2&gt;
  
  
  Fundamentals of Dragonfly Search
&lt;/h2&gt;

&lt;p&gt;Dragonfly Search enables the creation of indexes for selected &lt;code&gt;HASH&lt;/code&gt; and &lt;code&gt;JSON&lt;/code&gt; values.&lt;br&gt;
Entries stored within or associated with an index are often referred to as documents.&lt;br&gt;
Each index is constructed based on a specific schema, defining the fields within the indexed values and the way they should be interpreted.&lt;br&gt;
Once established, this index facilitates filtering and sorting documents by various properties, much like a traditional database manages conditional queries.&lt;/p&gt;
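As a conceptual analogy only (plain Python over hypothetical data, not Dragonfly internals), filtering and sorting over indexed documents behaves like this sketch:

```python
# Hypothetical in-memory "documents", standing in for indexed HASH values.
docs = {
    "product:1": {"name": "keyboard", "price": 49.0, "category": "accessories"},
    "product:2": {"name": "monitor", "price": 199.0, "category": "displays"},
    "product:3": {"name": "mouse", "price": 19.0, "category": "accessories"},
}

def query(docs, category=None, sort_by=None, descending=False, limit=None):
    # Filter on an exact-match field, then sort on a numeric field,
    # mirroring the kind of conditional query a search index makes efficient.
    keys = [k for k, d in docs.items() if category is None or d["category"] == category]
    if sort_by is not None:
        keys.sort(key=lambda k: docs[k][sort_by], reverse=descending)
    return keys[:limit]

print(query(docs, category="accessories", sort_by="price", descending=True, limit=1))
# ['product:1']
```

A real index avoids scanning every document; the sketch only shows the filter-sort-limit semantics that the search commands below expose.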

&lt;p&gt;Let's suppose we use Dragonfly to store information about the world's largest cities.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvx3n90d9cuo0y3b564jr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvx3n90d9cuo0y3b564jr.png" alt="docs-and-indexes" width="800" height="360"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For each city, we store key information including its &lt;code&gt;name&lt;/code&gt;, &lt;code&gt;population&lt;/code&gt;, and &lt;code&gt;continent&lt;/code&gt;. For example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;dragonfly&lt;span class="nv"&gt;$&amp;gt;&lt;/span&gt; HSET city:1 name London population 8.8 continent Europe
dragonfly&lt;span class="nv"&gt;$&amp;gt;&lt;/span&gt; HSET city:2 name Athens population 3.1 continent Europe
dragonfly&lt;span class="nv"&gt;$&amp;gt;&lt;/span&gt; HSET city:3 name Tel-Aviv population 1.3 continent Asia
dragonfly&lt;span class="nv"&gt;$&amp;gt;&lt;/span&gt; HSET city:4 name Hyderabad population 9.8 continent Asia
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To build an index, we use the &lt;code&gt;FT.CREATE&lt;/code&gt; command.&lt;br&gt;
Firstly, we define the index name and the subset of values to index, such as those with keys prefixed with &lt;code&gt;city:&lt;/code&gt;.&lt;br&gt;
And then, we outline our schema attributes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The &lt;code&gt;name&lt;/code&gt; attribute of type &lt;code&gt;TEXT&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;population&lt;/code&gt; attribute as a &lt;code&gt;NUMERIC&lt;/code&gt; type with sorting enabled.&lt;/li&gt;
&lt;li&gt;Finally, the &lt;code&gt;continent&lt;/code&gt; attribute as a &lt;code&gt;TAG&lt;/code&gt; type. Read more about &lt;code&gt;TAG&lt;/code&gt; fields &lt;a href="https://redis.io/docs/interact/search-and-query/advanced-concepts/tags/" rel="noopener noreferrer"&gt;here&lt;/a&gt;.
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;dragonfly&lt;span class="nv"&gt;$&amp;gt;&lt;/span&gt; FT.CREATE cities PREFIX 1 city: SCHEMA name TEXT population NUMERIC SORTABLE continent TAG
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;After creating the index, the &lt;code&gt;FT.INFO&lt;/code&gt; command can be used to inspect its details.&lt;br&gt;
As shown below, the index conforms to the schema we defined, and it contains the hash documents we created earlier:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;dragonfly&lt;span class="nv"&gt;$&amp;gt;&lt;/span&gt; FT.INFO cities
1&lt;span class="o"&gt;)&lt;/span&gt; index_name
2&lt;span class="o"&gt;)&lt;/span&gt; cities
3&lt;span class="o"&gt;)&lt;/span&gt; fields
4&lt;span class="o"&gt;)&lt;/span&gt; 1&lt;span class="o"&gt;)&lt;/span&gt; 1&lt;span class="o"&gt;)&lt;/span&gt; identifier
      2&lt;span class="o"&gt;)&lt;/span&gt; name
      3&lt;span class="o"&gt;)&lt;/span&gt; attribute
      4&lt;span class="o"&gt;)&lt;/span&gt; name
      5&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="nb"&gt;type
      &lt;/span&gt;6&lt;span class="o"&gt;)&lt;/span&gt; TEXT
   &lt;span class="c"&gt;# schema for 'population' and 'continent' omitted for brevity...&lt;/span&gt;
5&lt;span class="o"&gt;)&lt;/span&gt; num_docs
6&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;integer&lt;span class="o"&gt;)&lt;/span&gt; 4
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Moving on to querying!&lt;/p&gt;

&lt;p&gt;Our first example query will focus on cities in Europe.&lt;br&gt;
We'll sort them by population in descending order and select only the top document (&lt;code&gt;LIMIT 0 1&lt;/code&gt;: skip zero results, take one).&lt;br&gt;
The query is also constructed to return only two fields for each result: &lt;code&gt;name&lt;/code&gt; and &lt;code&gt;population&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The response contains the total number of documents matched, regardless of the &lt;code&gt;LIMIT&lt;/code&gt; option, and the documents themselves.&lt;br&gt;
In this case, only London will be returned, displaying first its key and then the selected fields.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;dragonfly&lt;span class="nv"&gt;$&amp;gt;&lt;/span&gt; FT.SEARCH cities &lt;span class="s1"&gt;'@continent:{Europe}'&lt;/span&gt; SORTBY population DESC LIMIT 0 1 RETURN 2 name population
1&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;integer&lt;span class="o"&gt;)&lt;/span&gt; 2 &lt;span class="c"&gt;# total number of documents matched&lt;/span&gt;
2&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="s2"&gt;"city:1"&lt;/span&gt;    &lt;span class="c"&gt;# document key (i.e. the key to the HASH document)&lt;/span&gt;
3&lt;span class="o"&gt;)&lt;/span&gt; 1&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="s2"&gt;"name"&lt;/span&gt;   &lt;span class="c"&gt;# selected fields and their values&lt;/span&gt;
   2&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="s2"&gt;"London"&lt;/span&gt;
   3&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="s2"&gt;"population"&lt;/span&gt;
   4&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="s2"&gt;"8.8"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Our second example query returns all cities in Asia with a population under 5 million, as shown below:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;dragonfly&lt;span class="nv"&gt;$&amp;gt;&lt;/span&gt; FT.SEARCH cities &lt;span class="s1"&gt;'@population:[0 5] @continent:{Asia}'&lt;/span&gt; RETURN 1 name
1&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;integer&lt;span class="o"&gt;)&lt;/span&gt; 1
2&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="s2"&gt;"city:3"&lt;/span&gt;
3&lt;span class="o"&gt;)&lt;/span&gt; 1&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="s2"&gt;"name"&lt;/span&gt;
   2&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="s2"&gt;"Tel-Aviv"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For detailed information on the query syntax, refer to our &lt;a href="https://www.dragonflydb.io/docs/command-reference/search/ft.search" rel="noopener noreferrer"&gt;documentation&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The index is dynamic: it updates automatically as documents are added or removed.&lt;br&gt;
In a later section of this blog post, we will delve into the storage of JSON values.&lt;br&gt;
Unlike simple hashes, JSON documents can store nested values and arrays, enabling the indexing of more complex data structures.&lt;/p&gt;
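&lt;p&gt;To see the dynamic index in action, we can add a new city and immediately query for it without any reindexing command. This transcript assumes a running Dragonfly instance with the &lt;code&gt;cities&lt;/code&gt; index from above; the key and replies shown are illustrative.&lt;/p&gt;

```shell
dragonfly$> HSET city:5 name Madrid population 3.3 continent Europe
(integer) 3
dragonfly$> FT.SEARCH cities '@name:Madrid' RETURN 1 name
1) (integer) 1
2) "city:5"
3) 1) "name"
   2) "Madrid"
```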


&lt;h2&gt;Vector Search: Finding the Closest Match&lt;/h2&gt;

&lt;p&gt;After exploring how to create and query indices in the previous chapter, we now turn our attention to the use of the &lt;code&gt;VECTOR&lt;/code&gt; field type.&lt;br&gt;
This section will demonstrate building a simple recommendation engine using &lt;a href="https://openai.com/" rel="noopener noreferrer"&gt;OpenAI&lt;/a&gt;'s embeddings.&lt;/p&gt;

&lt;p&gt;Vector fields can be used for vector similarity search where the goal is to find documents with vector fields most similar to a given vector.&lt;br&gt;
Vectors are extremely powerful, as they can encode various complex objects like text, images, and music.&lt;br&gt;
The underlying models aim for a fundamental principle: the closer the vectors, the greater the similarity between the original objects.&lt;br&gt;
These vectors are colloquially called embeddings, as they embed the original objects into a vector space.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp08wfev2nthx1xzb26v7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp08wfev2nthx1xzb26v7.png" alt="vector-space-representation" width="800" height="540"&gt;&lt;/a&gt;&lt;/p&gt;
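&lt;p&gt;This closeness is usually measured with a metric such as cosine similarity. As a toy sketch, made-up 3-dimensional vectors stand in for the 1536-dimensional embeddings we'll use later; the names are purely illustrative:&lt;/p&gt;

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # Cosine similarity: 1.0 means identical direction, 0.0 means orthogonal.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy embeddings: "cat" and "kitten" point in similar directions, "car" does not.
cat = np.array([0.9, 0.8, 0.1], dtype=np.float32)
kitten = np.array([0.85, 0.75, 0.2], dtype=np.float32)
car = np.array([0.1, 0.2, 0.9], dtype=np.float32)

print(cosine_similarity(cat, kitten) > cosine_similarity(cat, car))  # True
```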

&lt;p&gt;In the realm of modern applications, vector databases are crucial for executing vector similarity searches.&lt;br&gt;
Our example illustrates building a simple service to recommend blog articles to users based on their interests.&lt;br&gt;
To convert the text of our blog posts into vectors, we'll utilize OpenAI's service.&lt;/p&gt;

&lt;p&gt;As a preliminary step, we have already gathered all of our blog posts along with their embeddings in a CSV file, &lt;code&gt;blog-with-embeddings.csv&lt;/code&gt;,&lt;br&gt;
which can be found in our &lt;a href="https://github.com/dragonflydb/dragonfly-examples" rel="noopener noreferrer"&gt;dragonfly-examples repository&lt;/a&gt;.&lt;br&gt;
Let's begin by loading this file using the &lt;code&gt;pandas&lt;/code&gt; Python library.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;pandas&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;

&lt;span class="n"&gt;posts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read_csv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;blog-with-embeddings.csv&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;delimiter&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;,&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;quotechar&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'"'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;converters&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;embedding&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;eval&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="n"&gt;posts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;head&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The table shows that each document contains a few fields:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The &lt;code&gt;title&lt;/code&gt; field is the blog post title.&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;content&lt;/code&gt; field is the blog post content.&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;embedding&lt;/code&gt; field is the vectorized content.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The following step involves initializing Dragonfly, then connecting to it using the official Python Redis client to create our index.&lt;br&gt;
We don't need the raw content to be indexed, as we will index the vectorized content instead.&lt;/p&gt;

&lt;p&gt;Note that the &lt;code&gt;VectorField&lt;/code&gt; constructor accepts additional parameters, such as the algorithm type and the vector dimensions.&lt;br&gt;
&lt;code&gt;FLAT&lt;/code&gt; is the selected algorithm type and represents exact, brute-force search.&lt;br&gt;
An alternative, &lt;code&gt;HNSW&lt;/code&gt; (Hierarchical Navigable Small World), is also available.&lt;br&gt;
&lt;code&gt;HNSW&lt;/code&gt; provides approximate results with reduced computational demands;&lt;br&gt;
it consumes more memory but delivers faster searches on larger datasets.&lt;/p&gt;

&lt;p&gt;The configuration options also define the vector dimensions, in this case, 1536 dimensions.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;redis&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;redis.commands.search.field&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;TextField&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;VectorField&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;redis.commands.search.indexDefinition&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;IndexDefinition&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;IndexType&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Redis&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ft&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;posts&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;create_index&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;fields&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nc"&gt;TextField&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;title&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nc"&gt;VectorField&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;embedding&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;FLAT&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;DIM&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;1536&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;})],&lt;/span&gt;
        &lt;span class="n"&gt;definition&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;IndexDefinition&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prefix&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;post-&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;index_type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;IndexType&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;HASH&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Our blog posts are represented using the &lt;code&gt;HASH&lt;/code&gt; data type in Dragonfly.&lt;br&gt;
When using hashes, vectors must be encoded in a binary format.&lt;br&gt;
For this purpose, we'll employ the &lt;code&gt;numpy&lt;/code&gt; Python library.&lt;br&gt;
It's important to note that Dragonfly currently supports only the &lt;code&gt;float32&lt;/code&gt; data type.&lt;br&gt;
This means each vector should be encoded using 4 bytes per number.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;numpy&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;post&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;posts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;iterrows&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;embedding_bytes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;array&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;post&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;embedding&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]).&lt;/span&gt;&lt;span class="nf"&gt;astype&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;float32&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;tobytes&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;hset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;post-&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mapping&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;post&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;embedding&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;embedding_bytes&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
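&lt;p&gt;As a quick sanity check of the encoding used above, we can round-trip a toy vector through its &lt;code&gt;float32&lt;/code&gt; byte representation (the 3-component vector is illustrative; our embeddings have 1536 components):&lt;/p&gt;

```python
import numpy as np

vec = [0.25, -1.5, 3.0]  # toy vector; all values exactly representable in float32
blob = np.array(vec, dtype=np.float32).tobytes()

print(len(blob))  # 12 bytes: 4 bytes per float32 component
print(np.frombuffer(blob, dtype=np.float32).tolist())  # [0.25, -1.5, 3.0]
```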



&lt;p&gt;We've managed to set everything up with just a few lines of code!&lt;br&gt;
The final step involves converting user queries into vectors and then querying Dragonfly with these vectors.&lt;br&gt;
Note that in order to perform the following step, an OpenAI API key is required.&lt;br&gt;
Learn more about obtaining an API key &lt;a href="https://platform.openai.com/docs/quickstart" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;For vector similarity queries, a special syntax is used:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;KNN 3 @embedding &lt;span class="nv"&gt;$query_vector&lt;/span&gt; AS vector_score]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;The &lt;code&gt;*&lt;/code&gt; part represents the filter expression, which can limit the documents considered for the vector similarity search. Using just &lt;code&gt;*&lt;/code&gt; selects all documents.&lt;/li&gt;
&lt;li&gt;The number &lt;code&gt;3&lt;/code&gt; specifies that the three closest vectors will be computed.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;@embedding&lt;/code&gt; denotes the document field where the vectors are stored.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;$query_vector&lt;/code&gt; is the parameter name containing the target vector.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;AS vector_score&lt;/code&gt; indicates the name under which the vector distance will be returned.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;redis.commands.search.query&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Query&lt;/span&gt;

&lt;span class="c1"&gt;# How to get an OpenAI API key: https://platform.openai.com/docs/api-reference/introduction
# NOTE: Do not share your API key with anyone, do not commit it to git, do not hardcode it in your code.
&lt;/span&gt;&lt;span class="n"&gt;openai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;api_key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;{YOUR_OPENAI_API_KEY}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;EMBEDDING_MODEL&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text-embedding-ada-002&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="c1"&gt;# Convert query text to vector using the OpenAI API.
&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;How to switch from a multi node redis setup to Dragonfly&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;query_vec&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;embeddings&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;input&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;EMBEDDING_MODEL&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt;

&lt;span class="c1"&gt;# Build a search query for Dragonfly.
&lt;/span&gt;&lt;span class="n"&gt;query_expr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;*=&amp;gt;[KNN 3 @embedding $query_vector AS vector_score]&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;return_fields&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;title&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;vector_score&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;paging&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;params&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;query_vector&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;array&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query_vec&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;astype&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dtype&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;float32&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;tobytes&lt;/span&gt;&lt;span class="p"&gt;()}&lt;/span&gt;

&lt;span class="c1"&gt;# Execute the query and print results.
&lt;/span&gt;&lt;span class="n"&gt;docs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ft&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;posts&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query_expr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;docs&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;docs&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;vector_score&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# === Output ===
# 1 0.562158 Zero Downtime Migration from Redis to Dragonfly using Redis Sentinel
# 2 0.568551 Migrating from a Redis Cluster to Dragonfly on a single node
# 3 0.606661 We're Ready for You Now: Dragonfly In-Memory DB Now Supports Replication for High Availability
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As shown above, with a few simple steps, we've built a simple recommendation system using Dragonfly Search and OpenAI's embeddings.&lt;br&gt;
Since &lt;a href="https://github.com/langchain-ai/langchain" rel="noopener noreferrer"&gt;LangChain&lt;/a&gt; builds on OpenAI embeddings and Vector Similarity Search (VSS), Dragonfly Search is compatible with it as well.&lt;br&gt;
This compatibility broadens the range of applications and functionalities Dragonfly Search can support, tapping into the advanced capabilities of Large Language Models (LLMs).&lt;/p&gt;


&lt;h2&gt;Querying JSON Documents&lt;/h2&gt;

&lt;p&gt;In this final part, we demonstrate how to build an issue tracker using Dragonfly.&lt;br&gt;
We'll be using JavaScript, one of the most commonly used programming languages.&lt;br&gt;
To simplify document management, we'll utilize the &lt;a href="https://github.com/redis/redis-om-node" rel="noopener noreferrer"&gt;redis-om-node&lt;/a&gt; library, which provides an object-mapping interface for Node.js.&lt;br&gt;
Again, as Dragonfly is highly compatible with Redis, we can use the same library to interact with Dragonfly.&lt;/p&gt;

&lt;p&gt;Let's take a look at a sample issue object:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;issue&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;author&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;alice&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;title&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Production error&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;created&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1701203321&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;tags&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;bug&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;important&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="na"&gt;comments&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;author&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;bob&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Wow, did this really happen?&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;created&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1701203648&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;author&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;caren&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;We should fix this immediately!&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;created&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1701203954&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We'll store issue objects like above as &lt;code&gt;JSON&lt;/code&gt; values within Dragonfly.&lt;br&gt;
The advantage of indexing JSON values is that a schema field can map to not just a root-level object field, but to an entire JSONPath.&lt;br&gt;
JSONPaths are incredibly useful for selecting values from nested structures and arrays.&lt;/p&gt;

&lt;p&gt;Now, let's define our schema using redis-om:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;createClient&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;redis&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Schema&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;Repository&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;EntityId&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;redis-om&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;// Create client and connect to Dragonfly.&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;dragonfly&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;createClient&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;dragonfly&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;connect&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="c1"&gt;// Build the schema.&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;schema&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Schema&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;issue&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="na"&gt;author&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;string&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;$.author&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
          &lt;span class="na"&gt;title&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;text&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;$.title&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
          &lt;span class="na"&gt;created&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;number&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;$.created&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;sortable&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;

          &lt;span class="na"&gt;tags&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;string[]&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;$.tags[*]&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
          &lt;span class="na"&gt;participant&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;string[]&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;$..author&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;

          &lt;span class="na"&gt;num_comments&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;number&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;length($.comments)&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="na"&gt;sortable&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
          &lt;span class="p"&gt;},&lt;/span&gt;
          &lt;span class="na"&gt;last_updated&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;number&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;max($.comments[*].updated)&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="na"&gt;sortable&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
          &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;dataStructure&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;JSON&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// Build repository using the schema and Dragonfly client.&lt;/span&gt;
&lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;issueRepository&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Repository&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;schema&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;dragonfly&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// Create index for the repository.&lt;/span&gt;
&lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;issueRepository&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;createIndex&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;e&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// Use the repository to save the 'issue' object we defined earlier into Dragonfly.&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;issueRepository&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;save&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;issue&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let's break down the schema definition:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The first few fields, &lt;code&gt;author&lt;/code&gt;, &lt;code&gt;title&lt;/code&gt;, and &lt;code&gt;created&lt;/code&gt;, select values directly from the root-level object using the &lt;code&gt;$.field&lt;/code&gt; syntax.&lt;/li&gt;
&lt;li&gt;As each post may include multiple tags, the &lt;code&gt;tags&lt;/code&gt; field is used to select an array.&lt;/li&gt;
&lt;li&gt;To track all participants in an issue, including those who comment, we use the &lt;code&gt;$..author&lt;/code&gt; JSONPath.
This path selects the &lt;code&gt;author&lt;/code&gt; fields from all objects, including comments.&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;num_comments&lt;/code&gt; and &lt;code&gt;last_updated&lt;/code&gt; fields illustrate the usage of simple aggregation functions within JSONPaths.&lt;/li&gt;
&lt;/ul&gt;
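&lt;p&gt;To make these selectors concrete, here is a plain-JavaScript sketch of the extraction logic. The sample issue object is an assumption shaped to fit the schema above, not the article's actual data:&lt;/p&gt;

```javascript
// Illustrative only: a sample issue document (its shape is assumed from
// the schema above) and the value each schema field would extract.
const issue = {
  author: "alice",
  title: "Fix login flow",
  created: 1700000000,
  tags: ["important", "auth"],
  comments: [
    { author: "bob", updated: 1700000100 },
    { author: "carol", updated: 1700000200 },
  ],
};

// $.author, $.title, $.created: direct root-level selections.
const author = issue.author;

// $.tags[*]: every element of the tags array.
const tags = issue.tags;

// $..author: recursive descent, collecting the root author plus the
// author of every comment, i.e. all participants.
const participants = [issue.author, ...issue.comments.map((c) => c.author)];

// length($.comments): an aggregation over the selected array.
const numComments = issue.comments.length;

// max($.comments[*].updated): the most recent comment timestamp.
const lastUpdated = Math.max(...issue.comments.map((c) => c.updated));

console.log({ author, tags, participants, numComments, lastUpdated });
```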

&lt;p&gt;With the schema in place and a few entries created, we can now leverage the query builder to formulate more intricate queries.&lt;/p&gt;

&lt;p&gt;Imagine we want to create a dashboard for Alice's homepage on our issue tracker website.&lt;br&gt;
We can achieve this by selecting all issues authored by &lt;code&gt;alice&lt;/code&gt;, tagged as &lt;code&gt;important&lt;/code&gt;, and sorting them to display the most recently updated ones first.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Search for issues:&lt;/span&gt;
&lt;span class="c1"&gt;//  - authored by 'alice'&lt;/span&gt;
&lt;span class="c1"&gt;//  - tagged as 'important'&lt;/span&gt;
&lt;span class="c1"&gt;//  - sort results by 'last_updated'&lt;/span&gt;
&lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;issues&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;issueRepository&lt;/span&gt;
  &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;search&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
  &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;where&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;author&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;equals&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;alice&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;where&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;tags&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;contains&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;important&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sortDescending&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;last_updated&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;all&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;issues&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As shown above, by storing JSON documents in Dragonfly, building an index schema with JSONPaths,&lt;br&gt;
and using the query builder, we can easily leverage Dragonfly Search capabilities to build applications that require complex data management.&lt;/p&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Dragonfly Search represents a significant leap forward in data management and search capabilities for our in-memory data store.&lt;br&gt;
It blends the flexibility of traditional database queries with the advanced features of modern AI technologies.&lt;br&gt;
Note that Dragonfly Search is currently in beta.&lt;br&gt;
As it progresses, our vision for its evolution is clear and ambitious.&lt;br&gt;
We recognize current limitations as opportunities for growth and innovation:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Faster Updates&lt;/strong&gt;: Though query performance is robust, we are actively working on speeding up the update process.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GeoSearch&lt;/strong&gt;: We will support the &lt;code&gt;GEO&lt;/code&gt; field type and its related command options.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Command Options&lt;/strong&gt;: More &lt;code&gt;FT.CREATE&lt;/code&gt; and &lt;code&gt;FT.SEARCH&lt;/code&gt; options will be supported.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scoring and Full-Text Search&lt;/strong&gt;: Implementing scoring mechanisms and full-text search functionality is a key objective as well.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;However, with existing features, we've already seen how Dragonfly Search simplifies complex tasks, from creating efficient indexes to harnessing the power of vector similarity searches with OpenAI embeddings.&lt;br&gt;
Our exploration into using Dragonfly for diverse applications, such as building a recommendation system or an issue tracker, demonstrates its versatility and ease of use.&lt;br&gt;
If you want to learn more about Dragonfly Search, &lt;a href="https://www.dragonflydb.io/office-hours-2023-12-13" rel="noopener noreferrer"&gt;please register for our Community Office Hours&lt;/a&gt;, where the team will give a technical presentation and take questions.&lt;/p&gt;

&lt;p&gt;And as always, we encourage you to &lt;a href="https://www.dragonflydb.io/docs/getting-started" rel="noopener noreferrer"&gt;get started&lt;/a&gt;, dive in, experiment, and discover the full potential of Dragonfly Search in your own projects.&lt;/p&gt;




&lt;h2&gt;
  
  
  Appendix - Useful Resources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.dragonflydb.io/docs/command-reference/search/ft.search" rel="noopener noreferrer"&gt;Dragonfly Search Documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/dragonflydb/dragonfly/releases/tag/v1.13.0" rel="noopener noreferrer"&gt;Dragonfly v1.13 Release Notes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;The OpenAI + vector search example is available in the &lt;a href="https://github.com/dragonflydb/dragonfly-examples" rel="noopener noreferrer"&gt;dragonfly-examples repository&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>redis</category>
      <category>database</category>
      <category>vectordatabase</category>
    </item>
    <item>
      <title>How We Optimized Dragonfly to Get 30x Throughput with BullMQ</title>
      <dc:creator>Dragonfly</dc:creator>
      <pubDate>Thu, 23 Nov 2023 17:00:00 +0000</pubDate>
      <link>https://dev.to/dragonflydbio/how-we-optimized-dragonfly-to-get-30x-throughput-with-bullmq-4mjn</link>
      <guid>https://dev.to/dragonflydbio/how-we-optimized-dragonfly-to-get-30x-throughput-with-bullmq-4mjn</guid>
      <description>&lt;p&gt;Howdy! I am Shahar, a software engineer at DragonflyDB.&lt;br&gt;
Today, I'm thrilled to share an exciting journey we embarked on — optimizing &lt;a href="https://www.dragonflydb.io/" rel="noopener noreferrer"&gt;Dragonfly&lt;/a&gt; for &lt;a href="https://bullmq.io/" rel="noopener noreferrer"&gt;BullMQ&lt;/a&gt;, which resulted in a staggering 30x increase in throughput.&lt;br&gt;
While I won't be delving into code snippets here (you can check out all the changes on our &lt;a href="https://github.com/dragonflydb/dragonfly" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;), I'll take you through the high-level optimizations that made this achievement possible.&lt;br&gt;
We believe these enhancements not only mark a significant milestone for Dragonfly but also represent great performance benefits for users of BullMQ.&lt;/p&gt;

&lt;p&gt;So, if you're interested in the behind-the-scenes of database performance tuning and what it takes to achieve such a massive throughput improvement, you're in the right place.&lt;br&gt;
Let's get started.&lt;/p&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;BullMQ is a popular, robust, and fast Node.js library for creating and processing background jobs that uses Redis as a backend.&lt;br&gt;
Redis backends are a common choice for frameworks, providing low latency, comprehensive data structures, replication, and other useful features.&lt;br&gt;
However, applications using Redis are limited by its throughput. Dragonfly is a drop-in replacement for Redis designed for much higher throughput and easier deployments.&lt;/p&gt;

&lt;p&gt;In a &lt;a href="https://www.dragonflydb.io/blog/running-bullmq-with-dragonfly-part-1-announcement" rel="noopener noreferrer"&gt;previous blog post&lt;/a&gt;,&lt;br&gt;
we announced the full compatibility of Dragonfly with BullMQ and showed how to run BullMQ with Dragonfly efficiently in a few simple steps.&lt;br&gt;
In this post, we'll elaborate on the optimizations we made to achieve the stunning 30x performance improvement.&lt;/p&gt;




&lt;h2&gt;
  
  
  Benchmark Baseline
&lt;/h2&gt;

&lt;p&gt;Before we optimize anything, it's essential to establish a reliable benchmarking baseline.&lt;br&gt;
For BullMQ, which offers its users a robust queue API for adding and processing jobs, our focus was on optimizing the &lt;strong&gt;throughput&lt;/strong&gt; for the &lt;code&gt;add-job&lt;/code&gt; operation.&lt;br&gt;
Collaborating closely with the BullMQ team, we developed a &lt;a href="https://github.com/taskforcesh/bullmq-concurrent-bench" rel="noopener noreferrer"&gt;benchmarking tool&lt;/a&gt;.&lt;br&gt;
This tool is tailored to measure the rate at which messages can be added to queues per second — a crucial metric for understanding the performance of BullMQ while running on different backends.&lt;/p&gt;

&lt;h3&gt;
  
  
  Our Benchmarking Approach
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;We focus on adding messages (i.e., &lt;code&gt;add-job&lt;/code&gt; operations).
The rationale is straightforward: additions are less influenced by the fluctuating state of the queue.
Unlike reading from queues, which can be delayed if queues are temporarily empty, adding messages offers a more stable and measurable performance indicator.&lt;/li&gt;
&lt;li&gt;To push Dragonfly to its limits, we employed Node.js's worker threads, enabling us to leverage real OS threads to run on multiple CPUs in parallel.
This approach simulates a high-load environment more effectively than a single-threaded setup.&lt;/li&gt;
&lt;li&gt;Another key aspect of our testing environment was the hardware configuration.
We intentionally used a more powerful machine for the client code (running BullMQ) compared to the machine running Dragonfly.
This ensures that the client side is never the bottleneck, allowing us to accurately assess Dragonfly's raw performance.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Of course, if your workload requires higher throughput than shown in this post, you could use stronger machines.&lt;br&gt;
Dragonfly is all about scaling, both vertically and horizontally.&lt;br&gt;
As a baseline performance, let's benchmark how many &lt;code&gt;add-jobs/sec&lt;/code&gt; Redis can achieve:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Backend&lt;/th&gt;
&lt;th&gt;&lt;code&gt;add-jobs/sec&lt;/code&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Redis 6.2&lt;/td&gt;
&lt;td&gt;71,351&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Redis 7.2&lt;/td&gt;
&lt;td&gt;76,773&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Benchmark with Redis Cluster
&lt;/h3&gt;

&lt;p&gt;Scaling in Redis typically involves setting up a Redis Cluster.&lt;br&gt;
Note that minor changes are needed to get the benchmark to work against the Redis Cluster, as one has to use a cluster-aware Redis client.&lt;br&gt;
We conducted our tests using an 8-node Redis Cluster, all hosted on a single machine equipped with 8 CPUs.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Backend&lt;/th&gt;
&lt;th&gt;&lt;code&gt;add-jobs/sec&lt;/code&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Redis 6.2&lt;/td&gt;
&lt;td&gt;71,351&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Redis 7.2&lt;/td&gt;
&lt;td&gt;76,773&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Redis Cluster&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;194,533&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;As shown above, the Redis Cluster reaches much higher throughput than a single node.&lt;/p&gt;




&lt;h2&gt;
  
  
  Global Locks
&lt;/h2&gt;

&lt;p&gt;Running BullMQ on Dragonfly doesn't work right away, but with &lt;a href="https://www.dragonflydb.io/docs/integrations/bullmq" rel="noopener noreferrer"&gt;a few steps&lt;/a&gt;, you can get it up and running smoothly.&lt;br&gt;
To grasp this integration challenge, let's look at how BullMQ uses Redis.&lt;br&gt;
BullMQ leverages Lua scripts for executing Redis commands efficiently.&lt;br&gt;
These scripts are advantageous because they reduce network latency and allow multiple commands to execute atomically.&lt;/p&gt;

&lt;p&gt;However, the way BullMQ executes these scripts presents a hurdle.&lt;br&gt;
Although &lt;a href="https://redis.io/docs/interact/programmability/lua-api/#the-keys-global-variable" rel="noopener noreferrer"&gt;Redis requires that Lua script invocations specify all the keys used by the script&lt;/a&gt;, it does not enforce this requirement,&lt;br&gt;
and accessing undeclared keys from Lua scripts "just works" in Redis.&lt;br&gt;
This flexibility in Redis contrasts sharply with Dragonfly's design.&lt;/p&gt;

&lt;p&gt;Dragonfly's architecture is fundamentally different.&lt;br&gt;
It adopts a multi-threaded, shared-nothing approach, meaning &lt;strong&gt;keys are spread and not shared across Dragonfly threads&lt;/strong&gt;.&lt;br&gt;
We have a state-of-the-art transactional framework built on top of this architecture, which allows running operations on keys that belong to different threads.&lt;/p&gt;

&lt;p&gt;The challenge arises with Lua scripts from BullMQ that contain undeclared keys.&lt;br&gt;
Dragonfly's mechanism involves locking all declared keys &lt;strong&gt;prior&lt;/strong&gt; to executing a Lua script.&lt;br&gt;
It then strictly prohibits accessing undeclared keys during script execution, as they could be part of other parallel transactions.&lt;/p&gt;

&lt;p&gt;&lt;a href="/assets/blog/running-bullmq-with-dragonfly-part-2-optimization/01-transactions.png" class="article-body-image-wrapper"&gt;&lt;img src="/assets/blog/running-bullmq-with-dragonfly-part-2-optimization/01-transactions.png" alt="transactions"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Let's consider an example to illustrate the challenge with Dragonfly and undeclared keys.&lt;br&gt;
As shown above, imagine a transaction, &lt;code&gt;Tx1&lt;/code&gt;, which is set to access keys &lt;code&gt;key0&lt;/code&gt;, &lt;code&gt;key1&lt;/code&gt;, and &lt;code&gt;key2&lt;/code&gt;.&lt;br&gt;
In Dragonfly's architecture, &lt;code&gt;Tx1&lt;/code&gt; cannot access other keys, such as &lt;code&gt;key3&lt;/code&gt;, because it needs to ensure they are not locked by another transaction like &lt;code&gt;Tx2&lt;/code&gt;.&lt;br&gt;
Attempting to access an undeclared key like &lt;code&gt;key3&lt;/code&gt; could disrupt the integrity of &lt;code&gt;Tx2&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Accessing undeclared keys is common, and many Redis-based frameworks rely on this practice.&lt;br&gt;
To accommodate this, Dragonfly can be configured to handle these scenarios.&lt;br&gt;
By using the &lt;code&gt;--default_lua_flags=allow-undeclared-keys&lt;/code&gt; flag, Dragonfly treats Lua scripts as global transactions.&lt;br&gt;
This means the script has access to the entire datastore, locking it completely for its duration.&lt;/p&gt;

&lt;p&gt;However, this solution comes with its own set of challenges.&lt;br&gt;
While it allows scripts to access all keys (including the undeclared ones), it also restricts parallelism.&lt;br&gt;
No other commands or scripts can run alongside a global transaction, even if they involve completely different keys.&lt;br&gt;
Moreover, the situation is further complicated by Dragonfly's multi-threaded nature.&lt;br&gt;
One might think that, in this mode, Dragonfly would have similar performance characteristics to Redis.&lt;br&gt;
However, because keys in Dragonfly are owned by different threads, Dragonfly has to schedule the work between the keys' respective threads (we call these &lt;strong&gt;"hops"&lt;/strong&gt;), which adds significant latency.&lt;/p&gt;

&lt;p&gt;&lt;a href="/assets/blog/running-bullmq-with-dragonfly-part-2-optimization/02-global-locks.png" class="article-body-image-wrapper"&gt;&lt;img src="/assets/blog/running-bullmq-with-dragonfly-part-2-optimization/02-global-locks.png" alt="global-locks"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here are the results of running Dragonfly with global locks:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Backend&lt;/th&gt;
&lt;th&gt;&lt;code&gt;add-jobs/sec&lt;/code&gt;&lt;/th&gt;
&lt;th&gt;Notes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Redis 6.2&lt;/td&gt;
&lt;td&gt;71,351&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Redis 7.2&lt;/td&gt;
&lt;td&gt;76,773&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Redis Cluster&lt;/td&gt;
&lt;td&gt;194,533&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Dragonfly w/ Global Locks&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;7,697&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Dragonfly Baseline&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;As demonstrated, using global locks in Dragonfly, while necessary for compatibility with certain Lua scripts, leads to a noticeable drop in throughput.&lt;br&gt;
However, this is just the start of our journey; let's start optimizing!&lt;/p&gt;




&lt;h2&gt;
  
  
  Rollback Mechanism: A Path Not Taken
&lt;/h2&gt;

&lt;p&gt;One of the initial approaches we considered was a new mode for running Lua scripts, which we called the "try-lock-rollback" mode.&lt;br&gt;
In this mode, Lua scripts would work normally until they attempted to access an undeclared key.&lt;br&gt;
At that point, Dragonfly would try to lock that key.&lt;br&gt;
If the lock succeeded, meaning no other command or script was using the key, the script would proceed as intended.&lt;/p&gt;

&lt;p&gt;However, the challenge arises if the lock attempt does &lt;strong&gt;not&lt;/strong&gt; succeed.&lt;br&gt;
If we can't acquire the lock (because another command or script is using the key), Dragonfly would then roll back any changes made by the script so far.&lt;br&gt;
Following the rollback, it would attempt to run the script again from the beginning, this time possibly pre-locking the keys that caused the failure in the previous attempt.&lt;/p&gt;

&lt;p&gt;This is an elegant solution that provides a generic way for running all Lua scripts (BullMQ or otherwise), but it has two major disadvantages:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Performance Impact on the Happy Path&lt;/strong&gt;:
Implementing this would require tracking every change made by every script, just in case a rollback is needed. 
This tracking would be necessary even for scripts that don't access undeclared keys or where undeclared keys are successfully locked.
Essentially, this means slowing down the usual, rollback-free operations, which is known as the "common path" or "happy path".&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Risk of Snowball Effect &amp;amp; Deadlocks&lt;/strong&gt;:
In scenarios where keys are frequently contended, this approach could lead to a series of rollbacks and retries, creating a bottleneck and significantly impacting performance.
Under extreme conditions, this could even lead to deadlocks.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Given these considerable disadvantages, we ultimately decided not to pursue this option.&lt;/p&gt;




&lt;h2&gt;
  
  
  Hashtags to the Rescue
&lt;/h2&gt;

&lt;p&gt;While we might not know every specific key a script will use, we do know something about their structure.&lt;br&gt;
In the case of BullMQ, all keys share a common string pattern.&lt;br&gt;
Users can define a prefix, like the default &lt;code&gt;bull:&lt;/code&gt;, followed by a queue name.&lt;br&gt;
This forms a consistent prefix for all keys related to a specific queue (for instance, &lt;code&gt;bull:queue:123&lt;/code&gt;).&lt;br&gt;
Our initial thought was to implement some form of prefix locking based on this pattern.&lt;br&gt;
However, we then considered a more refined solution: &lt;a href="https://redis.io/docs/reference/cluster-spec/#hash-tags" rel="noopener noreferrer"&gt;hashtags&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Hashtags are a Redis Cluster feature.&lt;br&gt;
They involve wrapping a part of a key in curly braces &lt;code&gt;{}&lt;/code&gt;, which ensures that all keys with the same hashtag are located on the same cluster node.&lt;br&gt;
For example, keys &lt;code&gt;{user1}:name&lt;/code&gt; and &lt;code&gt;{user1}:email&lt;/code&gt; are guaranteed to reside on the same node, allowing them to be efficiently used together in commands or scripts.&lt;/p&gt;
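&lt;p&gt;The extraction rule can be sketched as a small helper (a hypothetical function, following the hashtag rules in the Redis Cluster specification linked above: only a non-empty section between the first &lt;code&gt;{&lt;/code&gt; and the next &lt;code&gt;}&lt;/code&gt; is hashed):&lt;/p&gt;

```javascript
// Hypothetical helper mirroring Redis Cluster hashtag extraction:
// if the key contains a non-empty "{...}" section, only that section
// determines placement (or, under --lock_on_hashtags, locking).
function extractHashtag(key) {
  const open = key.indexOf("{");
  if (open === -1) return key; // no hashtag: the whole key is hashed
  const close = key.indexOf("}", open + 1);
  if (close === -1) return key; // no closing brace: whole key
  if (close === open + 1) return key; // empty "{}": whole key
  return key.slice(open + 1, close);
}

console.log(extractHashtag("{user1}:name"));  // "user1"
console.log(extractHashtag("{user1}:email")); // "user1"
console.log(extractHashtag("plain:key"));     // "plain:key"
```

&lt;p&gt;Both example keys resolve to the same hashtag, &lt;code&gt;user1&lt;/code&gt;, which is exactly why they are guaranteed to reside on the same node.&lt;/p&gt;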

&lt;p&gt;Recognizing that BullMQ already utilizes hashtags for Redis Cluster operations, we adopted this concept for Dragonfly as well.&lt;br&gt;
We introduced a new server flag (&lt;code&gt;--lock_on_hashtags&lt;/code&gt;) where Dragonfly locks based on the hashtag rather than the entire key.&lt;br&gt;
This approach allows us to maintain the atomicity and isolation of script executions while avoiding the performance penalties associated with global locks or the complexities of the rollback mechanism.&lt;/p&gt;

&lt;p&gt;&lt;a href="/assets/blog/running-bullmq-with-dragonfly-part-2-optimization/03-hashtags.png" class="article-body-image-wrapper"&gt;&lt;img src="/assets/blog/running-bullmq-with-dragonfly-part-2-optimization/03-hashtags.png" alt="hashtags"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Implementing the hashtag-locking method in Dragonfly has several key advantages:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Ease of Integration for BullMQ&lt;/strong&gt;:
It allows BullMQ to work with Dragonfly by specifying only the queue name itself, which is always known, rather than the exact keys that will be used.
This simplification greatly streamlines the integration process.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reduced Cross-Thread Coordination&lt;/strong&gt;:
By ensuring that all keys associated with a particular queue are handled by the same thread in Dragonfly, the need for cross-thread coordination is significantly diminished.
We will cover more on this in the following sections.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;However, there is a trade-off to consider, which is the &lt;strong&gt;limitation on parallelization within a single queue&lt;/strong&gt;.&lt;br&gt;
While individual Lua scripts run serially, two different scripts can usually run in parallel if they involve keys managed by different threads.&lt;br&gt;
Under the hashtag-locking system, all keys of a specific queue are allocated to the same thread in Dragonfly.&lt;br&gt;
This means that parallel execution of operations within the same queue is not possible.&lt;br&gt;
However, we saw that all common BullMQ operations use a specific subset of keys, so they couldn't be parallelized anyway.&lt;/p&gt;
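&lt;p&gt;To see why this fits BullMQ so naturally, here is a string-level sketch. The &lt;code&gt;bull:&lt;/code&gt; prefix comes from the article; the queue name and key suffixes are illustrative, and the point is only that every key derived from a hashtagged queue name shares one hashtag:&lt;/p&gt;

```javascript
// Illustrative: BullMQ derives all of a queue's keys from a prefix plus
// the queue name. Wrapping the queue name in braces gives every derived
// key the same hashtag, so one Dragonfly thread owns them all under
// --lock_on_hashtags. The suffixes below are examples, not a full list.
const prefix = "bull";
const queueName = "{orders}";

const keys = ["wait", "active", "completed", "meta"].map(
  (suffix) => `${prefix}:${queueName}:${suffix}`
);

// Hashtag extraction: the content between the first "{" and "}".
const hashtagOf = (key) => key.match(/\{([^}]+)\}/)?.[1] ?? key;

const distinctTags = new Set(keys.map(hashtagOf));
console.log(keys);
console.log(distinctTags); // a single hashtag: "orders"
```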

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Backend&lt;/th&gt;
&lt;th&gt;&lt;code&gt;add-jobs/sec&lt;/code&gt;&lt;/th&gt;
&lt;th&gt;Notes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Redis 6.2&lt;/td&gt;
&lt;td&gt;71,351&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Redis 7.2&lt;/td&gt;
&lt;td&gt;76,773&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Redis Cluster&lt;/td&gt;
&lt;td&gt;194,533&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Dragonfly w/ Global Locks&lt;/td&gt;
&lt;td&gt;7,697&lt;/td&gt;
&lt;td&gt;Dragonfly Baseline&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Dragonfly w/ Hashtag Locks&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;17,403&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~2.26x Dragonfly Baseline&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Our first optimization gave us an increase of around 126%, which is a nice start.&lt;br&gt;
Note that hashtag locking is disabled by default, but Dragonfly users can turn it on via the &lt;code&gt;--lock_on_hashtags&lt;/code&gt; flag.&lt;/p&gt;




&lt;h2&gt;
  
  
  Reducing Number of Hops
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Reducing Number of Hops for Commands
&lt;/h3&gt;

&lt;p&gt;As mentioned a couple of times, Dragonfly is multi-threaded.&lt;br&gt;
We also handle incoming connections using multiple threads, where each connection is assigned a single thread randomly.&lt;br&gt;
This means that when BullMQ connects to a Dragonfly instance with 8 threads, it has a 1/8 chance of "landing" on the thread that owns its queue.&lt;br&gt;
In 7/8 of cases, the connection thread (internally called the coordinator thread) will attempt to run each command on the target thread separately.&lt;br&gt;
A script that tries to run 100 commands will require a "hop" to the target thread to lock the key, another 100 hops to run each of the commands, and another final hop to unlock the key.&lt;br&gt;
Each hop has a latency cost (as well as some minimal coordination overhead), which adds up.&lt;br&gt;
To mitigate that, we added a check to see if all the operations of a script are being done on a single (remote) thread.&lt;br&gt;
If they are, we perform a single hop to the target thread and run the script there.&lt;br&gt;
This turns those 100 hops, 1 per command, to 1 hop for all commands.&lt;/p&gt;

&lt;p&gt;&lt;a href="/assets/blog/running-bullmq-with-dragonfly-part-2-optimization/04-reduce-hops-1.png" class="article-body-image-wrapper"&gt;&lt;img src="/assets/blog/running-bullmq-with-dragonfly-part-2-optimization/04-reduce-hops-1.png" alt="reduce-hops-1"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Backend&lt;/th&gt;
&lt;th&gt;&lt;code&gt;add-jobs/sec&lt;/code&gt;&lt;/th&gt;
&lt;th&gt;Notes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Redis 6.2&lt;/td&gt;
&lt;td&gt;71,351&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Redis 7.2&lt;/td&gt;
&lt;td&gt;76,773&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Redis Cluster&lt;/td&gt;
&lt;td&gt;194,533&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Dragonfly w/ Global Locks&lt;/td&gt;
&lt;td&gt;7,697&lt;/td&gt;
&lt;td&gt;Dragonfly Baseline&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Dragonfly w/ Hashtag Locks&lt;/td&gt;
&lt;td&gt;17,403&lt;/td&gt;
&lt;td&gt;~2.26x Dragonfly Baseline&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Dragonfly w/ Reduced Hops (Commands)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;53,011&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~6.98x Dragonfly Baseline&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;With this optimization, Dragonfly achieved another 200% increase on top of hashtag-locking and reached 6.98x the baseline.&lt;/p&gt;

&lt;h3&gt;
  
  
  Reducing Hops Further for Scripts
&lt;/h3&gt;

&lt;p&gt;After reducing hops for each command, we looked at the rest of the hops.&lt;br&gt;
We were still left with 3 hops for each &lt;strong&gt;script invocation&lt;/strong&gt;, no matter how many commands it issued:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Lock keys.&lt;/li&gt;
&lt;li&gt;Run the Lua script to read/modify keys.&lt;/li&gt;
&lt;li&gt;Unlock keys.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;We then modified our Lua invocation flow to run all steps under a single hop (lock, run, unlock).&lt;/p&gt;
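&lt;p&gt;The two hop reductions described above can be summarized with a toy hop-count model. The function is our own illustration, but the counts follow the text: one hop to lock, one per command, and one to unlock before any optimization:&lt;/p&gt;

```javascript
// Toy model of hops per script invocation on a remote thread, following
// the counts described in the text.
function hopsPerScript(numCommands, mode) {
  switch (mode) {
    case "naive":
      // 1 hop to lock keys + 1 hop per command + 1 hop to unlock.
      return 1 + numCommands + 1;
    case "fused-commands":
      // All commands run under a single hop; lock/unlock hops remain.
      return 3;
    case "fused-script":
      // Lock, run, and unlock all happen under one hop.
      return 1;
    default:
      throw new Error("unknown mode");
  }
}

// The 100-command script from the text:
console.log(hopsPerScript(100, "naive"));          // 102
console.log(hopsPerScript(100, "fused-commands")); // 3
console.log(hopsPerScript(100, "fused-script"));   // 1
```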

&lt;p&gt;&lt;a href="/assets/blog/running-bullmq-with-dragonfly-part-2-optimization/05-reduce-hops-2.png" class="article-body-image-wrapper"&gt;&lt;img src="/assets/blog/running-bullmq-with-dragonfly-part-2-optimization/05-reduce-hops-2.png" alt="reduce-hops-2"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Backend&lt;/th&gt;
&lt;th&gt;&lt;code&gt;add-jobs/sec&lt;/code&gt;&lt;/th&gt;
&lt;th&gt;Notes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Redis 6.2&lt;/td&gt;
&lt;td&gt;71,351&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Redis 7.2&lt;/td&gt;
&lt;td&gt;76,773&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Redis Cluster&lt;/td&gt;
&lt;td&gt;194,533&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Dragonfly w/ Global Locks&lt;/td&gt;
&lt;td&gt;7,697&lt;/td&gt;
&lt;td&gt;Dragonfly Baseline&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Dragonfly w/ Hashtag Locks&lt;/td&gt;
&lt;td&gt;17,403&lt;/td&gt;
&lt;td&gt;~2.26x Dragonfly Baseline&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Dragonfly w/ Reduced Hops (Commands)&lt;/td&gt;
&lt;td&gt;53,011&lt;/td&gt;
&lt;td&gt;~6.98x Dragonfly Baseline&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Dragonfly w/ Reduced Hops  (Scripts)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;122,890&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~15.97x Dragonfly Baseline&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Again, by reducing the number of hops at the script level, we achieved another 132% increase, reaching 15.97x the baseline.&lt;/p&gt;




&lt;h2&gt;
  
  
  Connection Migration
&lt;/h2&gt;

&lt;p&gt;Previously, I highlighted that, for an 8-thread Dragonfly server, the probability of a connection hitting the target thread is 1/8, and this likelihood decreases as the number of threads increases.&lt;br&gt;
When a connection tries to execute a script (or even a simple command) on a remote thread, it has to request that thread to run some code.&lt;br&gt;
Then it waits for that thread to become free, which may take some time.&lt;/p&gt;
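&lt;p&gt;This 1/8 figure is easy to verify with a quick simulation, where uniform random assignment stands in for the real accept and key-hashing logic (illustrative, not Dragonfly code):&lt;/p&gt;

```python
import random

def hit_rate(n_threads: int, trials: int = 100_000, seed: int = 42) -> float:
    """Fraction of connections whose accepting thread happens to match
    the thread that owns the keys they operate on."""
    rng = random.Random(seed)
    hits = sum(
        rng.randrange(n_threads) == rng.randrange(n_threads)
        for _ in range(trials)
    )
    return hits / trials

# Around 1/8 = 12.5% for 8 threads, and it only shrinks with more threads.
print(f"{hit_rate(8):.3f} vs {hit_rate(16):.3f}")
```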

&lt;p&gt;To improve this situation, we've developed a connection migration mechanism.&lt;br&gt;
Currently, this feature is specifically tailored to BullMQ, where each queue typically doesn't share its connection with others.&lt;br&gt;
However, it holds potential benefits for other frameworks as well.&lt;/p&gt;

&lt;p&gt;Migrating connections to other threads is a subtle process, as Dragonfly uses thread-local variables quite intensively,&lt;br&gt;
but this saves the last hop, getting us to a place where connections &lt;strong&gt;seamlessly&lt;/strong&gt; use their target threads.&lt;br&gt;
We like this feature so much that we even enabled it by default.&lt;br&gt;
It can, however, be disabled by running Dragonfly with &lt;code&gt;--migrate_connections=false&lt;/code&gt;.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Backend&lt;/th&gt;
&lt;th&gt;&lt;code&gt;add-jobs/sec&lt;/code&gt;&lt;/th&gt;
&lt;th&gt;Notes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Redis 6.2&lt;/td&gt;
&lt;td&gt;71,351&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Redis 7.2&lt;/td&gt;
&lt;td&gt;76,773&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Redis Cluster&lt;/td&gt;
&lt;td&gt;194,533&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Dragonfly w/ Global Locks&lt;/td&gt;
&lt;td&gt;7,697&lt;/td&gt;
&lt;td&gt;Dragonfly Baseline&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Dragonfly w/ Hashtag Locks&lt;/td&gt;
&lt;td&gt;17,403&lt;/td&gt;
&lt;td&gt;~2.26x Dragonfly Baseline&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Dragonfly w/ Reduced Hops (Commands)&lt;/td&gt;
&lt;td&gt;53,011&lt;/td&gt;
&lt;td&gt;~6.98x Dragonfly Baseline&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Dragonfly w/ Reduced Hops  (Scripts)&lt;/td&gt;
&lt;td&gt;122,890&lt;/td&gt;
&lt;td&gt;~15.97x Dragonfly Baseline&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Dragonfly w/ Connection Migration&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;189,756&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~24.65x Dragonfly Baseline&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Round Robin Key Placement
&lt;/h2&gt;

&lt;p&gt;Here comes the last optimization we have made on this journey so far.&lt;br&gt;
In Dragonfly, the distribution of keys (or queues) across threads is determined by their hash values, akin to a random distribution.&lt;br&gt;
This approach typically ensures an even load distribution when there are many keys, as it balances the computational load across all threads.&lt;/p&gt;

&lt;p&gt;However, consider a situation where an 8-thread Dragonfly server is managing just 8 queues.&lt;br&gt;
In an ideal scenario, each thread would handle one queue, leading to a perfectly balanced load and optimal performance.&lt;br&gt;
But due to the random nature of key distribution based on hashing, achieving such an even distribution is &lt;a href="https://en.wikipedia.org/wiki/Balls_into_bins_problem" rel="noopener noreferrer"&gt;very unlikely&lt;/a&gt;.&lt;br&gt;
When the distribution of queues across threads is uneven, it results in inefficient use of resources: some threads may be idle while others become bottlenecks, leading to suboptimal performance.&lt;/p&gt;
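&lt;p&gt;How unlikely is a perfect spread? A small probabilistic sketch makes it concrete, with uniform random placement standing in for hashing (illustrative numbers, not Dragonfly internals):&lt;/p&gt;

```python
import math
import random
from collections import Counter

def max_load(n_queues: int, n_threads: int, rng: random.Random) -> int:
    """Largest number of queues landing on any single thread under
    hash-style (modeled as uniform random) placement."""
    counts = Counter(rng.randrange(n_threads) for _ in range(n_queues))
    return max(counts.values())

rng = random.Random(0)
trials = 10_000
perfect = sum(max_load(8, 8, rng) == 1 for _ in range(trials)) / trials

# Exact probability that 8 queues land on 8 distinct threads:
exact = math.factorial(8) / 8**8  # roughly 0.24%
print(f"simulated: {perfect:.2%}, exact: {exact:.2%}")
```

&lt;p&gt;A perfect one-queue-per-thread assignment happens well under 1% of the time, which is precisely what round-robin placement guarantees instead.&lt;/p&gt;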

&lt;p&gt;That is exactly why we implemented a very cool feature we call "shard round-robin".&lt;br&gt;
By using &lt;code&gt;--shard_round_robin_prefix=queue&lt;/code&gt;, keys that start with &lt;code&gt;queue&lt;/code&gt; will be distributed between the threads one by one, guaranteeing a near-even distribution of workloads.&lt;br&gt;
This feature is relatively new, and you should note that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Currently, this feature is only available for keys using hashtags, so in the example above, the key &lt;code&gt;bull:{queue1}&lt;/code&gt; will use round-robin, while &lt;code&gt;queue1&lt;/code&gt; will &lt;strong&gt;not&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;This feature should &lt;strong&gt;usually&lt;/strong&gt; be disabled. It is useful only when a small number of highly contended hashtags (like BullMQ queues) are involved.
If you use many keys (as in most Dragonfly use cases), do not enable this feature, as it will in fact hurt performance.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="/assets/blog/running-bullmq-with-dragonfly-part-2-optimization/06-round-robin.png" class="article-body-image-wrapper"&gt;&lt;img src="/assets/blog/running-bullmq-with-dragonfly-part-2-optimization/06-round-robin.png" alt="round-robin"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;By this point, we've achieved a more than 30x increase in throughput over the baseline, which is a huge improvement!&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Backend&lt;/th&gt;
&lt;th&gt;&lt;code&gt;add-jobs/sec&lt;/code&gt;&lt;/th&gt;
&lt;th&gt;Notes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Redis 6.2&lt;/td&gt;
&lt;td&gt;71,351&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Redis 7.2&lt;/td&gt;
&lt;td&gt;76,773&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Redis Cluster&lt;/td&gt;
&lt;td&gt;194,533&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Dragonfly w/ Global Locks&lt;/td&gt;
&lt;td&gt;7,697&lt;/td&gt;
&lt;td&gt;Dragonfly Baseline&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Dragonfly w/ Hashtag Locks&lt;/td&gt;
&lt;td&gt;17,403&lt;/td&gt;
&lt;td&gt;~2.26x Dragonfly Baseline&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Dragonfly w/ Reduced Hops (Commands)&lt;/td&gt;
&lt;td&gt;53,011&lt;/td&gt;
&lt;td&gt;~6.98x Dragonfly Baseline&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Dragonfly w/ Reduced Hops  (Scripts)&lt;/td&gt;
&lt;td&gt;122,890&lt;/td&gt;
&lt;td&gt;~15.97x Dragonfly Baseline&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Dragonfly w/ Connection Migration&lt;/td&gt;
&lt;td&gt;189,756&lt;/td&gt;
&lt;td&gt;~24.65x Dragonfly Baseline&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Dragonfly w/ Shard Round Robin&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;253,075&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~32.87x Dragonfly Baseline&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Visualizing these numbers, it is notable that Dragonfly outperforms an 8-instance Redis Cluster on the same hardware.&lt;br&gt;
Meanwhile, this optimization is much harder to achieve with a Redis Cluster, as the key distribution is built into both Redis and Redis Cluster client libraries.&lt;br&gt;
Instead, Redis users may have to move cluster slots between the nodes to enforce an even distribution.&lt;/p&gt;

&lt;p&gt;&lt;a href="/assets/blog/running-bullmq-with-dragonfly-part-2-optimization/07-benchmarks.png" class="article-body-image-wrapper"&gt;&lt;img src="/assets/blog/running-bullmq-with-dragonfly-part-2-optimization/07-benchmarks.png" alt="benchmarks"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;In this post, we've taken a glimpse into our journey of integrating BullMQ with Dragonfly with a series of optimizations.&lt;br&gt;
Each step on this path was guided by our commitment to achieving exceptional performance, ensuring that Dragonfly stands ready to handle the most demanding loads from BullMQ users.&lt;/p&gt;

&lt;p&gt;The journey has been both challenging and rewarding, leading to developments that not only benefit BullMQ but also have the potential to enhance the performance of other Redis-based frameworks.&lt;br&gt;
Dragonfly is committed to embracing the open-source community and broadening the ecosystem.&lt;br&gt;
More integrations and frameworks will be tested with Dragonfly and released in the future.&lt;br&gt;
As always, &lt;a href="https://www.dragonflydb.io/docs/getting-started" rel="noopener noreferrer"&gt;start trying Dragonfly in just a few steps&lt;/a&gt; and build amazing applications!&lt;/p&gt;




&lt;h2&gt;
  
  
  Appendix - Benchmark Setup &amp;amp; Details
&lt;/h2&gt;

&lt;p&gt;Here are some technical details for those who wish to reproduce the benchmarks:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;All benchmarks in this post were run on the same hardware, one after another.
We chose an AWS EC2 &lt;code&gt;c7i.2xlarge&lt;/code&gt; 8-CPU instance for running Dragonfly or Redis, and a monstrous &lt;code&gt;c7i.16xlarge&lt;/code&gt; 64-CPU instance for running BullMQ.&lt;/li&gt;
&lt;li&gt;Operating system: Ubuntu 23.04, Linux kernel 6.2.0&lt;/li&gt;
&lt;li&gt;All client-side invocations use &lt;a href="https://github.com/taskforcesh/bullmq-concurrent-bench" rel="noopener noreferrer"&gt;this tool&lt;/a&gt; with the following command,
which uses 8 queues and 16 threads on the BullMQ side:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;  tsc index.ts &amp;amp;&amp;amp; node index.js -h &amp;lt;server_ip&amp;gt; -p 7000 -d 30 -r 0 -w 16 -q 8
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



</description>
      <category>redis</category>
      <category>database</category>
    </item>
    <item>
      <title>Building E-Commerce Applications with Dragonfly</title>
      <dc:creator>Dragonfly</dc:creator>
      <pubDate>Tue, 21 Nov 2023 17:00:00 +0000</pubDate>
      <link>https://dev.to/dragonflydbio/building-e-commerce-applications-with-dragonfly-27l9</link>
      <guid>https://dev.to/dragonflydbio/building-e-commerce-applications-with-dragonfly-27l9</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;In the high-octane world of e-commerce applications, both response speed and data accuracy are crucial.&lt;br&gt;
Customers expect seamless access to searched items, past orders, recently viewed products, and personalized recommendations.&lt;br&gt;
Meanwhile, these applications often experience fluctuating traffic, especially during peak periods like the Christmas season or Black Friday.&lt;br&gt;
These high-traffic events introduce significant challenges, requiring rapid responses and precise data handling.&lt;br&gt;
Addressing these variations often demands a scalable and robust in-memory data storage solution.&lt;/p&gt;

&lt;p&gt;Dragonfly, an ultra-performant in-memory data store, utilizes a multi-threaded, shared-nothing architecture that pushes hardware to its limits,&lt;br&gt;
supporting up to 4 million ops/sec and 1 TB of memory on a single instance.&lt;br&gt;
This can drastically reduce operational complexities while providing a high-performance solution for e-commerce applications.&lt;br&gt;
For even more demanding scenarios, Dragonfly also offers cluster mode on top of the stunning single-node performance.&lt;br&gt;
This adaptability makes Dragonfly an ideal choice for e-commerce platforms that contend with unpredictable and varied traffic patterns.&lt;/p&gt;

&lt;p&gt;In this blog, we will explore how Dragonfly can be used in various ways to elevate your e-commerce platform's performance and user experience, particularly in the following areas:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Caching&lt;/strong&gt; - &lt;strong&gt;String&lt;/strong&gt;, &lt;strong&gt;Hash&lt;/strong&gt;, and &lt;strong&gt;JSON&lt;/strong&gt; data types are ideal for caching.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Personalization&lt;/strong&gt; - &lt;strong&gt;Sorted-Set&lt;/strong&gt; is perfect for tracking user preferences.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;High-Traffic Flash Sales&lt;/strong&gt; - &lt;strong&gt;Atomic Operations&lt;/strong&gt; and &lt;strong&gt;Distributed Locks&lt;/strong&gt; can be used to manage inventory verification and deduction under extremely demanding situations.&lt;/li&gt;
&lt;/ul&gt;


&lt;h2&gt;
  
  
  Caching
&lt;/h2&gt;

&lt;p&gt;Caching is a powerful strategy in web technology, particularly important for e-commerce applications.&lt;br&gt;
It enables quicker data retrieval by storing the results of a database query or an API request in a cache.&lt;br&gt;
When an identical request is made, the data can be swiftly served from the cache (in this case, Dragonfly), avoiding the need for time-consuming interactions with the primary database.&lt;/p&gt;
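&lt;p&gt;As a minimal sketch of this cache-aside flow, the snippet below uses a plain dictionary in place of Dragonfly and a stub in place of the primary database (all names are illustrative, not from any client library):&lt;/p&gt;

```python
cache: dict[str, str] = {}  # stands in for Dragonfly

def query_database(order_id: str) -> str:
    # Placeholder for an expensive SQL query or API call.
    return f'{{"id": "{order_id}", "total": 1799.99}}'

def get_order(order_id: str) -> str:
    key = f"order:{order_id}"
    if key in cache:                  # cache hit: served from memory
        return cache[key]
    value = query_database(order_id)  # cache miss: hit the database...
    cache[key] = value                # ...and populate the cache
    return value
```

&lt;p&gt;The first call for an order hits the database; every identical call afterward is served from the cache. In production, the dictionary operations would be a client's &lt;code&gt;GET&lt;/code&gt;/&lt;code&gt;SET&lt;/code&gt; commands, typically with a TTL.&lt;/p&gt;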

&lt;p&gt;Selecting the appropriate data type for caching is crucial to easing the implementation and optimizing the performance of your e-commerce platform.&lt;br&gt;
The most accessible data type is a &lt;code&gt;String&lt;/code&gt;, ideal for caching a blob of data.&lt;br&gt;
It's versatile and safe for storing various formats, whether text or binary, such as JSON strings, MessagePack, or Protocol Buffers.&lt;br&gt;
For instance, a user's recent order summary could be cached as a JSON string.&lt;br&gt;
However, the downside of using the &lt;code&gt;String&lt;/code&gt; data type is the difficulty in manipulating individual fields within cached data.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Using the 'String' data type for caching.&lt;/span&gt;

&lt;span class="c"&gt;# To cache an order:&lt;/span&gt;
dragonfly&lt;span class="nv"&gt;$&amp;gt;&lt;/span&gt; SET order_string:&amp;lt;order_id&amp;gt; &lt;span class="s1"&gt;'{"id": "&amp;lt;order_id&amp;gt;", "items": [{"id": "001", "name": "Laptop", "quantity": 1}], "total": 1799.99}'&lt;/span&gt;

&lt;span class="c"&gt;# To retrieve the entire cached order:&lt;/span&gt;
dragonfly&lt;span class="nv"&gt;$&amp;gt;&lt;/span&gt; GET order_string:&amp;lt;order_id&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Alternatively, a &lt;code&gt;Hash&lt;/code&gt; data type, which is a single-level string-to-string flat hashmap, is suitable for storing field/value pairs.&lt;br&gt;
This can be used to cache specific attributes of a user's order, like item IDs and quantities, allowing for quicker access and updates.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Using the 'Hash' data type for caching.&lt;/span&gt;

&lt;span class="c"&gt;# To cache quantities for different items and the total price in the order:&lt;/span&gt;
dragonfly&lt;span class="nv"&gt;$&amp;gt;&lt;/span&gt; HSET order_hash:&amp;lt;order_id&amp;gt; item_001 5 item_002 6 item_003 7 total 2799.99

&lt;span class="c"&gt;# To retrieve the quantity of a specific item:&lt;/span&gt;
dragonfly&lt;span class="nv"&gt;$&amp;gt;&lt;/span&gt; HGET order_hash:&amp;lt;order_id&amp;gt; item_003

&lt;span class="c"&gt;# To retrieve the entire cached order:&lt;/span&gt;
dragonfly&lt;span class="nv"&gt;$&amp;gt;&lt;/span&gt; HGETALL order_hash:&amp;lt;order_id&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Lastly, for more complex data structures, the &lt;code&gt;JSON&lt;/code&gt; data type in Dragonfly natively and fully supports the &lt;a href="https://www.json.org/json-en.html" rel="noopener noreferrer"&gt;JSON&lt;/a&gt; specification&lt;br&gt;
and the &lt;a href="https://github.com/json-path/JsonPath" rel="noopener noreferrer"&gt;JSONPath&lt;/a&gt; syntax, enabling easy manipulation of individual fields.&lt;br&gt;
This is particularly useful for detailed order information, where each aspect of an order (such as product details, pricing, and shipping info)&lt;br&gt;
can be individually accessed and modified, providing both flexibility and efficiency in data handling.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Using the 'JSON' data type for caching.&lt;/span&gt;

&lt;span class="c"&gt;# To cache an order as native JSON:&lt;/span&gt;
dragonfly&lt;span class="nv"&gt;$&amp;gt;&lt;/span&gt; JSON.SET order_json:&amp;lt;order_id&amp;gt; &lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="s1"&gt;'{"id": "&amp;lt;order_id&amp;gt;", "items": [{"id": "001", "name": "Laptop", "quantity": 1}], "total": 1799.99}'&lt;/span&gt;

&lt;span class="c"&gt;# To update the quantity of a specific item:&lt;/span&gt;
dragonfly&lt;span class="nv"&gt;$&amp;gt;&lt;/span&gt; JSON.SET order_json:&amp;lt;order_id&amp;gt; &lt;span class="nv"&gt;$.&lt;/span&gt;items[0].quantity 2

&lt;span class="c"&gt;# To retrieve the total price of the order:&lt;/span&gt;
dragonfly&lt;span class="nv"&gt;$&amp;gt;&lt;/span&gt; JSON.GET order_json:&amp;lt;order_id&amp;gt; &lt;span class="nv"&gt;$.&lt;/span&gt;total

&lt;span class="c"&gt;# To retrieve the entire cached order:&lt;/span&gt;
dragonfly&lt;span class="nv"&gt;$&amp;gt;&lt;/span&gt; JSON.GET order_json:&amp;lt;order_id&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For more caching-related techniques, check out our previous blog posts:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://www.dragonflydb.io/blog/developing-with-dragonfly-part-01-cache-aside" rel="noopener noreferrer"&gt;Developing with Dragonfly: Cache-Aside&lt;/a&gt; to follow along with a step-by-step tutorial on how to implement a cache-aside pattern using Dragonfly.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.dragonflydb.io/blog/developing-with-dragonfly-part-02-solve-caching-problems" rel="noopener noreferrer"&gt;Developing with Dragonfly: Solve Caching Problems&lt;/a&gt; to learn how to solve the 3 common caching problems (Penetration, Breakdown, and Avalanche) with Dragonfly.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.dragonflydb.io/blog/dragonfly-cache-design" rel="noopener noreferrer"&gt;Dragonfly Cache Design&lt;/a&gt; to learn more about the internal eviction algorithm of Dragonfly.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Personalization
&lt;/h2&gt;

&lt;p&gt;In e-commerce applications, personalizing the user experience is key.&lt;br&gt;
One effective way to achieve this is by showcasing prioritized items, such as the most-viewed product categories for a particular user or the top items viewed globally on the application for the day.&lt;br&gt;
&lt;code&gt;Sorted-Set&lt;/code&gt;, a data structure available in Dragonfly, is perfectly suited for this task.&lt;br&gt;
It's a collection of unique elements, each associated with a score, which determines the order of the elements.&lt;/p&gt;

&lt;p&gt;For instance, to track a user's most-viewed product categories, we can use a &lt;code&gt;Sorted-Set&lt;/code&gt; where each category is a member and the number of times the user views that category is the score.&lt;br&gt;
Every time a user views a category, the score is incremented, ensuring the set always reflects the user's current preferences.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Using the 'Sorted-Set' data type to track user preferences.&lt;/span&gt;

&lt;span class="c"&gt;# To increment the view count for a category for a user:&lt;/span&gt;
dragonfly&lt;span class="nv"&gt;$&amp;gt;&lt;/span&gt; ZINCRBY viewed_product_categories_by_user_id:&amp;lt;user_id&amp;gt; 1 &lt;span class="s2"&gt;"electronics"&lt;/span&gt;

&lt;span class="c"&gt;# To retrieve the top 5 viewed categories for a user:&lt;/span&gt;
dragonfly&lt;span class="nv"&gt;$&amp;gt;&lt;/span&gt; ZREVRANGE viewed_product_categories_by_user_id:&amp;lt;user_id&amp;gt; 0 4 WITHSCORES
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Similarly, for global views, we can maintain a &lt;code&gt;Sorted-Set&lt;/code&gt; for the entire application, where each view of a product category by any user increments the category's score.&lt;br&gt;
This approach allows for dynamic, real-time ranking of categories or items based on popularity, providing valuable insights for both users and the platform.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;Sorted-Set&lt;/code&gt; is an important data structure in many applications, particularly for scenarios like those mentioned above.&lt;br&gt;
Redis has long been celebrated for its robust implementation of this data structure, facilitating efficient data sorting and retrieval.&lt;br&gt;
However, starting from v1.11, Dragonfly introduces a &lt;a href="https://en.wikipedia.org/wiki/B%2B_tree" rel="noopener noreferrer"&gt;B+ Tree&lt;/a&gt;-based implementation.&lt;br&gt;
This new implementation not only enhances performance but also improves memory efficiency, making it an excellent choice for handling large-scale data sorting and ranking tasks.&lt;br&gt;
We plan to explore this topic in greater depth in a future blog post.&lt;/p&gt;


&lt;h2&gt;
  
  
  High-Traffic Flash Sales
&lt;/h2&gt;

&lt;p&gt;In our earlier discussion, we highlighted the challenges e-commerce platforms face with fluctuating traffic, particularly during high-profile events like Black Friday flash sales.&lt;br&gt;
During these peak periods, Dragonfly can play a pivotal role, especially in managing inventory verification and deduction.&lt;br&gt;
While these tasks can be performed using a traditional SQL database, the simultaneous attempts by numerous users to purchase limited-stock items can quickly overwhelm the primary database.&lt;br&gt;
This is where the capabilities of Dragonfly, as an ultra-performant in-memory data store, become invaluable.&lt;/p&gt;

&lt;p&gt;Dragonfly can address this challenge with two mechanisms: &lt;strong&gt;Atomic Operations&lt;/strong&gt; and &lt;strong&gt;Distributed Locks&lt;/strong&gt;.&lt;br&gt;
Each mechanism can in turn be implemented in different ways, which we will explore in detail below.&lt;/p&gt;
&lt;h3&gt;
  
  
  1. Atomic Operations with &lt;code&gt;INCR&lt;/code&gt; or &lt;code&gt;DECR&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;Dragonfly's atomicity ensures that each command, such as incrementing or decrementing a value, is executed entirely and independently, without interference from other operations.&lt;br&gt;
Consider a scenario where we have limited stock of a product for a flash sale. We can initialize the inventory count in Dragonfly by setting the quantity:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Assuming the flash sale has 100 units for a particular item.&lt;/span&gt;
dragonfly&lt;span class="nv"&gt;$&amp;gt;&lt;/span&gt; SET item_on_sale:&amp;lt;item_id&amp;gt; 100
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For simplicity, we assume that each request is for a single unit of the item.&lt;br&gt;
When a purchase request is made, we use the &lt;code&gt;DECR&lt;/code&gt; command to deduct inventory atomically:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;dragonfly&lt;span class="nv"&gt;$&amp;gt;&lt;/span&gt; DECR item_on_sale:&amp;lt;item_id&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The return value of the &lt;code&gt;DECR&lt;/code&gt; command is crucial.&lt;br&gt;
If it is greater than or equal to zero, the purchase claimed a unit and can proceed (a return value of exactly zero means this request took the last unit).&lt;br&gt;
Conversely, if the return value is negative, the product is sold out, and further purchases should be denied.&lt;br&gt;
This method is easy to implement and particularly effective for straightforward scenarios where immediate inventory updates are sufficient and more complex processes like order cancellations can be managed later.&lt;/p&gt;
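&lt;p&gt;A tiny simulation, with a plain integer standing in for the Dragonfly counter (illustrative, not real client code), shows the boundary behavior of this check:&lt;/p&gt;

```python
class Stock:
    """Stands in for a Dragonfly key holding the inventory count."""

    def __init__(self, units: int) -> None:
        self.units = units

    def decr(self) -> int:
        # Mimics the atomic DECR command: decrement, then return the new value.
        self.units -= 1
        return self.units

def try_purchase(stock: Stock) -> bool:
    # A return value of exactly 0 means this request claimed the last unit.
    return stock.decr() >= 0

stock = Stock(100)
sold = sum(try_purchase(stock) for _ in range(150))
print(sold)  # exactly 100 of the 150 requests succeed
```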
&lt;h3&gt;
  
  
  2. Atomic Operations with Lua Scripts
&lt;/h3&gt;

&lt;p&gt;For more complex scenarios, Lua scripts can be used to implement atomic inventory verification and deduction.&lt;br&gt;
It is notable that Dragonfly allows non-atomic operations in Lua scripts with the &lt;code&gt;disable-atomicity&lt;/code&gt; script flag, as explained in this &lt;a href="https://www.dragonflydb.io/blog/leveraging-power-of-lua-scripting" rel="noopener noreferrer"&gt;blog post&lt;/a&gt;.&lt;br&gt;
Thus, we need to make sure that the script flag is not used for our e-commerce inventory deduction scenario.&lt;br&gt;
Let's assume that we store the inventory of an item using the &lt;code&gt;Hash&lt;/code&gt; data type:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;dragonfly&lt;span class="nv"&gt;$&amp;gt;&lt;/span&gt; HSET item_on_sale:&amp;lt;item_id&amp;gt; inventory 100 purchased 0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To verify and deduct inventory, we can use the following Lua script:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight lua"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- atomic_inventory_deduction.lua&lt;/span&gt;

&lt;span class="kd"&gt;local&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;KEYS&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
&lt;span class="kd"&gt;local&lt;/span&gt; &lt;span class="n"&gt;num_to_purchase&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;tonumber&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ARGV&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;num_to_purchase&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="k"&gt;then&lt;/span&gt;
   &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;nil&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;

&lt;span class="kd"&gt;local&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"HMGET"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"inventory"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"purchased"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;local&lt;/span&gt; &lt;span class="n"&gt;inventory&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;tonumber&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
&lt;span class="kd"&gt;local&lt;/span&gt; &lt;span class="n"&gt;purchased&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;tonumber&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;purchased&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;num_to_purchase&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="n"&gt;inventory&lt;/span&gt; &lt;span class="k"&gt;then&lt;/span&gt;
   &lt;span class="n"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"HINCRBY"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"purchased"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_to_purchase&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
   &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;num_to_purchase&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;

&lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;nil&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In the script above, we first parse the keys and arguments passed to the script.&lt;br&gt;
The script operates on a single key, which is the key of the item on sale.&lt;br&gt;
Similarly, the script expects a single argument, which is the number of units to purchase.&lt;br&gt;
Then, we retrieve the current inventory and the number of units purchased for the item using the &lt;code&gt;HMGET&lt;/code&gt; command.&lt;br&gt;
If the total number of units purchased plus the number of units to purchase is less than or equal to the inventory, we increment the purchased count and return the number of units purchased in this request.&lt;br&gt;
Otherwise, we return &lt;code&gt;nil&lt;/code&gt; to indicate that the purchase cannot proceed.&lt;br&gt;
The script above can be executed using the &lt;code&gt;EVAL&lt;/code&gt; command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# General syntax of the 'EVAL' command:&lt;/span&gt;
&lt;span class="c"&gt;#   EVAL script num_of_keys [key [key ...]] [arg [arg ...]]&lt;/span&gt;

&lt;span class="c"&gt;# Try to purchase 5 units of the item:&lt;/span&gt;
dragonfly&lt;span class="nv"&gt;$&amp;gt;&lt;/span&gt; EVAL &lt;span class="s2"&gt;"&amp;lt;script&amp;gt;"&lt;/span&gt; 1 item_on_sale:&amp;lt;item_id&amp;gt; 5
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Alternatively, we can load the script and use the &lt;code&gt;EVALSHA&lt;/code&gt; command, which is more efficient since the script is cached in Dragonfly and referenced by its SHA1 digest:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Load the script into Dragonfly and get the SHA1 digest of the script.&lt;/span&gt;
dragonfly&lt;span class="nv"&gt;$&amp;gt;&lt;/span&gt; SCRIPT LOAD &lt;span class="s2"&gt;"&amp;lt;script&amp;gt;"&lt;/span&gt;

&lt;span class="c"&gt;# Try to purchase 5 units of the item using the SHA1 digest of the script:&lt;/span&gt;
dragonfly&lt;span class="nv"&gt;$&amp;gt;&lt;/span&gt; EVALSHA &lt;span class="s2"&gt;"&amp;lt;script_sha&amp;gt;"&lt;/span&gt; 1 item_on_sale:&amp;lt;item_id&amp;gt; 5
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In comparison to the &lt;code&gt;DECR&lt;/code&gt; command, the Lua script allows for more complex inventory verification and deduction logic and covers more edge cases.&lt;br&gt;
For instance, the script above includes a sanity check to ensure that the number of units to purchase is greater than zero.&lt;br&gt;
It also allows purchasing more than one unit of the item and gracefully handles the case where the inventory is insufficient to fulfill the requested quantity.&lt;/p&gt;
&lt;h3&gt;
  
  
  3. Distributed Locks with Conditional &lt;code&gt;SET&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;Another powerful capability of Dragonfly is serving as a distributed lock manager, which plays a critical role in handling surges of traffic.&lt;br&gt;
During high traffic periods, such as flash sales, each incoming request attempts to acquire a lock from Dragonfly.&lt;br&gt;
Only the request that successfully secures a lock gains the exclusive right to proceed with further operations for that particular on-sale item.&lt;br&gt;
These operations might include database operations, payment processes, or any other actions that require direct interaction with the primary database or third-party services within the e-commerce platform.&lt;/p&gt;

&lt;p&gt;Requests that fail to acquire a lock are denied further processing and can be redirected to a waiting page or a retry page with a countdown timer, depending on the implementation.&lt;br&gt;
This ensures that &lt;strong&gt;only one request is allowed to proceed at a time per on-sale item&lt;/strong&gt;, preventing the primary database from being overwhelmed by excessive simultaneous requests.&lt;/p&gt;

&lt;p&gt;A common logic for using distributed locks could be something similar to the following pseudocode:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// purchase_item_pseudocode.js&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;itemId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;getItemId&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;userId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;getUserId&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;expiration&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;getLockExpirationTime&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;lockKey&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;`item_lock:&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;itemId&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;lockVal&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;lockAcquired&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;acquireLock&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;lockKey&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;lockVal&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;expiration&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;lockAcquired&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nf"&gt;sendResponse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Another user got this item, please try again later.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nf"&gt;purchaseItem&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;itemId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;purchaseError&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="nx"&gt;purchaseError&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;finally&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nf"&gt;releaseLock&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;lockKey&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;lockVal&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nf"&gt;sendResponse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;You have successfully purchased the item!&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Note that both &lt;code&gt;acquireLock&lt;/code&gt; and &lt;code&gt;releaseLock&lt;/code&gt; take the item ID and user ID into account.&lt;br&gt;
An acquired lock must not be accidentally released by another user under high concurrency, and the &lt;code&gt;releaseLock&lt;/code&gt; implementation should enforce this requirement.&lt;br&gt;
Also, the choice of the lock expiration time is important.&lt;br&gt;
It should be long enough for the user to complete the purchase process, but not so long that it blocks other users from acquiring the lock if the service process dies unexpectedly without releasing it.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;acquireLock&lt;/code&gt; function can be implemented in Dragonfly using the &lt;code&gt;SET&lt;/code&gt; command with the &lt;code&gt;NX&lt;/code&gt; and &lt;code&gt;EX&lt;/code&gt; options:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Using the 'SET' command with the 'NX' option to acquire a lock.&lt;/span&gt;
&lt;span class="c"&gt;# The 'NX' option ensures that the lock is only acquired if the key does not exist.&lt;/span&gt;
&lt;span class="c"&gt;# Also, we set the expiration time for the lock to prevent the lock from being held indefinitely.&lt;/span&gt;
dragonfly&lt;span class="nv"&gt;$&amp;gt;&lt;/span&gt; SET item_lock:&amp;lt;item_id&amp;gt; &amp;lt;user_id&amp;gt; NX EX &amp;lt;expiration&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;On the other hand, the &lt;code&gt;releaseLock&lt;/code&gt; function needs to be implemented in a Lua script to ensure that the lock is only released if the user ID matches the one that acquired the lock.&lt;br&gt;
This can be achieved in Dragonfly using the &lt;code&gt;EVAL&lt;/code&gt; command with the following Lua script:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight lua"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- release_lock.lua&lt;/span&gt;

&lt;span class="kd"&gt;local&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;KEYS&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
&lt;span class="kd"&gt;local&lt;/span&gt; &lt;span class="n"&gt;val&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ARGV&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;

&lt;span class="kd"&gt;local&lt;/span&gt; &lt;span class="n"&gt;lock_val&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"GET"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;lock_val&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;val&lt;/span&gt; &lt;span class="k"&gt;then&lt;/span&gt;
   &lt;span class="n"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"DEL"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;

&lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;nil&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
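&lt;p&gt;Taken together, the acquire and release semantics can be sketched without a running server. The in-memory simulation below (plain Python; the &lt;code&gt;store&lt;/code&gt; dictionary and the &lt;code&gt;acquire_lock&lt;/code&gt;/&lt;code&gt;release_lock&lt;/code&gt; names are illustrative, not part of any client library) mirrors the behavior of the conditional &lt;code&gt;SET&lt;/code&gt; command and the compare-then-delete script above:&lt;/p&gt;

```python
import time

# A plain dict stands in for Dragonfly; each entry maps
# lock_key -&gt; (lock_value, expiration_timestamp).
store = {}

def acquire_lock(key, val, expiration_secs):
    """Mirror 'SET key val NX EX expiration': succeed only if the key
    is absent or its previous lock has already expired."""
    now = time.monotonic()
    entry = store.get(key)
    if entry is not None and entry[1] > now:
        return False  # lock is currently held by someone else
    store[key] = (val, now + expiration_secs)
    return True

def release_lock(key, val):
    """Mirror the release Lua script: delete only if the stored value
    matches, so one user cannot release another user's lock."""
    entry = store.get(key)
    if entry is not None and entry[0] == val:
        del store[key]
        return True
    return False

# User 42 acquires the lock for an item; user 7 cannot.
assert acquire_lock("item_lock:1001", "user:42", 10) is True
assert acquire_lock("item_lock:1001", "user:7", 10) is False

# User 7 cannot release user 42's lock, but user 42 can.
assert release_lock("item_lock:1001", "user:7") is False
assert release_lock("item_lock:1001", "user:42") is True
```

&lt;p&gt;Keep in mind that the real atomicity guarantee comes from Dragonfly executing commands and Lua scripts atomically; the simulation only illustrates the ownership semantics.&lt;/p&gt;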



&lt;h3&gt;
  
  
  4. Distributed Locks with RedLock
&lt;/h3&gt;

&lt;p&gt;Using conditional &lt;code&gt;SET&lt;/code&gt; commands and Lua scripts for distributed locking in Dragonfly is a straightforward yet effective way to manage highly concurrent operations.&lt;br&gt;
However, in environments where an even higher level of reliability and fault tolerance is required, particularly across distributed systems, the &lt;strong&gt;RedLock&lt;/strong&gt; algorithm provides an additional layer of safety.&lt;/p&gt;

&lt;p&gt;RedLock is designed to extend the locking mechanism across multiple primary instances of Redis.&lt;br&gt;
Since Dragonfly is highly compatible with Redis, RedLock can be used with Dragonfly instances as well.&lt;br&gt;
RedLock ensures that a lock is acquired and released correctly and consistently across all these instances, enhancing the reliability and integrity of the distributed locking process.&lt;/p&gt;

&lt;p&gt;When using RedLock, the &lt;code&gt;acquireLock&lt;/code&gt; and &lt;code&gt;releaseLock&lt;/code&gt; functions are normally provided by the client library already.&lt;br&gt;
This typically involves attempting to acquire the lock on multiple instances simultaneously and ensuring that a majority of the instances grant the lock before proceeding.&lt;br&gt;
Similarly, the release process involves trying to release the lock across all instances to maintain consistency.&lt;/p&gt;

&lt;p&gt;For more information on RedLock, read the documentation &lt;a href="https://redis.io/docs/manual/patterns/distributed-locks/" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;
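&lt;p&gt;The majority-vote idea at the heart of RedLock can also be illustrated with a minimal sketch. In the simulation below (plain Python, no client library; plain dictionaries stand in for the instances, and the function names are illustrative), acquisition succeeds only if a quorum of instances grants the lock, and partial grants are rolled back on failure:&lt;/p&gt;

```python
def try_set_nx(instance, key, val):
    """Simulate 'SET key val NX' on a single instance (a plain dict)."""
    if key in instance:
        return False
    instance[key] = val
    return True

def redlock_acquire(instances, key, val):
    """Acquire the lock only if a majority of instances grant it;
    if the quorum is not reached, roll back any partial grants."""
    granted = [i for i in instances if try_set_nx(i, key, val)]
    if len(granted) * 2 > len(instances):
        return True
    for i in granted:  # quorum not reached: release partial grants
        if i.get(key) == val:
            del i[key]
    return False

def redlock_release(instances, key, val):
    """Release the lock on every instance that still holds our value."""
    for i in instances:
        if i.get(key) == val:
            del i[key]

# Three independent instances; one already holds a conflicting lock,
# but two out of three still form a majority.
instances = [{}, {"item_lock:1001": "user:7"}, {}]
assert redlock_acquire(instances, "item_lock:1001", "user:42") is True
redlock_release(instances, "item_lock:1001", "user:42")
```

&lt;p&gt;A production-grade RedLock implementation additionally accounts for lock TTLs, clock drift, and the time elapsed while acquiring the lock, so prefer an established client library over hand-rolling the algorithm.&lt;/p&gt;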




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;In this blog, we explored how Dragonfly can be used in various ways to elevate your e-commerce platform's performance and user experience.&lt;br&gt;
We discussed how Dragonfly can be used for caching, personalization, and high-traffic flash sales.&lt;/p&gt;

&lt;p&gt;Overall, Dragonfly is a versatile and powerful tool for building and maintaining an e-commerce platform, capable of handling everything from everyday user interactions to the most demanding sales events.&lt;br&gt;
Although not directly shown in this blog, Dragonfly's performance is phenomenal, as discussed in detail in our previous &lt;a href="https://www.dragonflydb.io/blog" rel="noopener noreferrer"&gt;blog posts&lt;/a&gt;.&lt;br&gt;
We encourage you to &lt;a href="https://www.dragonflydb.io/docs/getting-started" rel="noopener noreferrer"&gt;try Dragonfly out for yourself&lt;/a&gt; and experience its capabilities firsthand.&lt;br&gt;
Also, consider subscribing to our newsletter below to stay in the loop with the latest Dragonfly news and updates!&lt;/p&gt;

</description>
      <category>redis</category>
      <category>database</category>
    </item>
  </channel>
</rss>
