Consistent Hashing: A Resilient Approach to Distributed Storage
Introduction
The world is generating massive amounts of data every second. It's practically impossible to store all of it on a single server. To handle this, we rely on horizontal scaling, where data is distributed across multiple servers.
While this helps with capacity, it also introduces a key challenge:
👉 How do we distribute and retrieve data efficiently across servers when servers can be added or removed over time?
This is where consistent hashing comes in: a powerful technique widely used in distributed systems, load balancers, databases, and caching systems (e.g., Amazon DynamoDB, Apache Cassandra, Memcached).
In this article, we'll start with the basic hashing approach, discuss its limitations, and then introduce consistent hashing with examples, diagrams, and enhancements like virtual nodes.
The Problem: Data Distribution with Hashing
Imagine you have n data objects that need to be stored across k servers.
A simple approach is to use a hash function:
```
serverIndex = hash(key) % k
```
- Each key is hashed.
- The hash value determines which server stores that key.
Example:
- 4 servers (S0, S1, S2, S3)
- Keys: A, B, C, D, E, F

If hash(A) % 4 = 2, then key A is stored on server S2.
✅ This works fine as long as the number of servers (k) doesn't change.
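To make this concrete, here is a minimal Python sketch of modulo placement (an illustration, not any particular library's API). It uses MD5 because Python's built-in hash() is randomized per process and would not give stable placements:

```python
import hashlib

def server_index(key: str, k: int) -> int:
    """Map a key to one of k servers using simple modulo hashing."""
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % k

for key in ["A", "B", "C", "D", "E", "F"]:
    print(f"{key} -> S{server_index(key, 4)}")
```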
The Rehashing Problem
Now imagine:
- Server S3 shuts down
- OR a new server S4 is added
Suddenly, our hashing formula (hash(key) % k) changes because k has changed.
👉 This means almost all keys must be remapped to different servers.
👉 For large-scale systems with billions of keys, this rehashing operation is extremely expensive.
This problem is called the Rehashing Problem.
Diagram: Normal Hashing vs Server Failure

```
Before (k=4):          After S3 down (k=3):
hash(key) % 4          hash(key) % 3
```

Nearly all keys are redistributed → ❌ Inefficient.
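We can quantify this with the server_index helper from the sketch above. When the cluster shrinks from 4 to 3 servers, a key keeps its server only if its hash h satisfies h % 4 == h % 3, which holds for roughly a quarter of keys; the rest all move:

```python
# Count how many keys move when the cluster shrinks from 4 to 3 servers.
keys = [f"key-{i}" for i in range(1000)]
moved = sum(1 for key in keys if server_index(key, 4) != server_index(key, 3))
print(f"{moved}/{len(keys)} keys remapped")  # typically around 75%
```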
Consistent Hashing to the Rescue
Consistent hashing was introduced to minimize data movement when servers are added or removed.
Instead of mapping keys directly to servers using %, both keys and servers are hashed into the same hash space (imagine values between 0 and 2^32 - 1).
We arrange this hash space into a circle (called a hash ring).
How it Works
1. Hash servers onto the ring: each server is assigned a position on the ring based on its hash.
2. Hash keys onto the ring: each key is assigned a position on the ring using the same hash function.
3. Assign each key to the nearest server clockwise: a key belongs to the first server you encounter moving clockwise from its hash position.
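Below is a minimal sketch of these three steps in Python. The names (HashRing, add_server, get_server) are illustrative, not from any standard library; the hash function is injectable so that the worked example later can plug in fixed values:

```python
import bisect
import hashlib
from typing import Callable

def default_hash(value: str) -> int:
    """Hash a string onto the ring's space (here 0 .. 2**32 - 1)."""
    return int(hashlib.md5(value.encode()).hexdigest(), 16) % (2 ** 32)

class HashRing:
    def __init__(self, hash_fn: Callable[[str], int] = default_hash):
        self.hash_fn = hash_fn
        self._positions: list[int] = []     # sorted server positions on the ring
        self._servers: dict[int, str] = {}  # position -> server name

    def add_server(self, server: str) -> None:
        pos = self.hash_fn(server)
        bisect.insort(self._positions, pos)
        self._servers[pos] = server

    def remove_server(self, server: str) -> None:
        pos = self.hash_fn(server)
        self._positions.remove(pos)
        del self._servers[pos]

    def get_server(self, key: str) -> str:
        """Return the first server at or clockwise from the key's position."""
        pos = self.hash_fn(key)
        idx = bisect.bisect_left(self._positions, pos)
        if idx == len(self._positions):  # past the last server: wrap around
            idx = 0
        return self._servers[self._positions[idx]]
```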
Example
Suppose we have 3 servers: S1, S2, S3.

hash(S1) = 10
hash(S2) = 30
hash(S3) = 50

Keys: K1=12, K2=32, K3=45.

Placement:

K1=12 → goes to S2 (30)
K2=32 → goes to S3 (50)
K3=45 → goes to S3 (50)
Diagram: Consistent Hash Ring

```
      0 ---------------- 100
     |                    |
  S1(10)              S3(50)
      \                  /
     K1=12           K3=45

           S2(30)
           K2=32
```
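Using the HashRing sketch from above, we can reproduce this placement by injecting the example's fixed hash values as the hash function:

```python
# A lookup-table "hash" that returns the fixed values from the example.
fixed = {"S1": 10, "S2": 30, "S3": 50, "K1": 12, "K2": 32, "K3": 45}
ring = HashRing(hash_fn=fixed.__getitem__)
for server in ("S1", "S2", "S3"):
    ring.add_server(server)

for key in ("K1", "K2", "K3"):
    print(f"{key}({fixed[key]}) -> {ring.get_server(key)}")
# K1(12) -> S2, K2(32) -> S3, K3(45) -> S3
```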
Server Removal (Resilience)
If S2 (30) fails:
- Only the keys between S1 and S2 need to be moved to S3.
- All other keys remain unaffected.

✅ Unlike normal hashing, only a fraction of the keys need to be moved.
Server Addition (Scalability)
If a new server S4(40) is added:

- Keys between S2(30) and S4(40) move to S4.
- All other keys remain untouched.

✅ Adding a server also moves only a small set of keys.
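Both properties are easy to check with the HashRing sketch: record each key's server, remove one server, and count how many assignments changed. Only keys that lived on the removed server should move:

```python
# Measure how many keys move when a server leaves the ring.
keys = [f"key-{i}" for i in range(10000)]

ring = HashRing()
for server in ("S1", "S2", "S3", "S4"):
    ring.add_server(server)
before = {key: ring.get_server(key) for key in keys}

ring.remove_server("S2")
after = {key: ring.get_server(key) for key in keys}

moved = sum(1 for key in keys if before[key] != after[key])
print(f"{moved}/{len(keys)} keys moved")  # only S2's former keys relocate
```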
Uneven Distribution Problem
Consistent hashing solves the rehashing problem, but it can still suffer from uneven load distribution.
Why?
- The hash function might place servers at unevenly spaced positions on the ring.
- Some servers may get a large portion of the keys, while others get very few.
Virtual Nodes (VNodes)
To solve uneven load distribution, we introduce virtual nodes.
👉 Instead of placing each server once on the ring, we place it multiple times (at different hash positions).

- Each server is hashed several times, either with multiple hash functions or by appending a suffix (e.g., hash(S1+1), hash(S1+2), …).
- These positions become virtual nodes belonging to the same physical server.
Example
If each server has 3 virtual nodes:

- S1 → positions at 10, 40, 70
- S2 → positions at 20, 50, 80
- S3 → positions at 30, 60, 90
Now keys are distributed more evenly, since each server controls multiple smaller segments of the ring.
✅ Virtual nodes smooth out the key distribution.
✅ Adding/removing servers becomes even more balanced.
Diagram: Virtual Nodes

```
Ring with VNodes (on a 0-100 ring)
S1: {10, 40, 70}
S2: {20, 50, 80}
S3: {30, 60, 90}
```
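Extending the earlier HashRing sketch, virtual nodes need only a loop that hashes each server name with a suffix. The "#i" suffix scheme below is just one convention, chosen for illustration; production systems typically use far more virtual nodes per server (dozens to hundreds) than the 3 shown here:

```python
class VNodeRing(HashRing):
    """A HashRing variant that places each server at several ring positions."""

    def __init__(self, vnodes: int = 3, **kwargs):
        super().__init__(**kwargs)
        self.vnodes = vnodes

    def add_server(self, server: str) -> None:
        for i in range(self.vnodes):
            pos = self.hash_fn(f"{server}#{i}")  # e.g. "S1#0", "S1#1", "S1#2"
            bisect.insort(self._positions, pos)
            self._servers[pos] = server

    def remove_server(self, server: str) -> None:
        for i in range(self.vnodes):
            pos = self.hash_fn(f"{server}#{i}")
            self._positions.remove(pos)
            del self._servers[pos]
```

Lookups are unchanged: get_server still walks clockwise to the nearest position, but each physical server now owns several smaller arcs of the ring.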
Real World Applications
Consistent hashing is used in:
- CDNs (Akamai, Cloudflare, Netflix): routing requests to the nearest edge servers.
- Distributed Databases (Cassandra, DynamoDB, Riak): partitioning and replication.
- Caching Systems (Memcached, Redis Cluster): distributing keys across cache nodes.
- Load Balancers: evenly distributing traffic to backend servers.
Key Advantages
✔️ Minimal data movement during server changes
✔️ Scalable and fault-tolerant
✔️ Works well with dynamic clusters
✔️ Virtual nodes mitigate uneven distribution
Key Trade-offs
❌ Slightly more complex to implement than modulo hashing
❌ Metadata about servers and virtual nodes must be stored
❌ A very high number of configuration changes may still cause imbalance
Conclusion
Consistent hashing is one of the cornerstone techniques in distributed system design. By arranging servers and keys on a hash ring, it allows scalable, fault-tolerant, and efficient data distribution with minimal disruption during server changes.
With enhancements like virtual nodes, it spreads data much more evenly, making it the backbone of modern distributed databases, caching systems, and CDNs.
⚡ Next time you look up a video on Netflix or fetch a value from Redis, remember that consistent hashing is working silently in the background to keep things fast and efficient.
More Details:
Get all articles related to system design
Hashtag: SystemDesignWithZeeshanAli
Git: https://github.com/ZeeshanAli-0704/SystemDesignWithZeeshanAli