Problem Statement
We need to distribute keys (data, requests, cache entries) across multiple servers.
A common approach is:
bucket = hash(key) % N
Where N is the number of servers.
The Problem?
When N changes (a server is added or removed), almost all keys get reassigned.
This causes?
- Massive data movement
- Cache invalidation
- Network overhead
- We need a way to minimize key movement when servers change.
- The Solution: Consistent Hashing
- Instead of modulo arithmetic, consistent hashing:
- Maps both servers and keys onto a circular hash space (a ring).
- Assigns each key to the next server clockwise.
This small change ensures that when servers are added or removed, only a small subset of keys move.
Example
Step 1 — Create the Hash Ring
Assume hash space = 0 to 99 (circular).
Place 3 servers:
A → 10
B → 40
C → 80
0 ----10(A)----40(B)----80(C)----99 -> back to 0
Step 2 — Add Keys
Key Hash
K1 5
K2 25
K3 50
K4 85
Assignment Rule:
Move clockwise → first server you hit owns the key.
Result:
K1 (5) → A (10)
K2 (25) → B (40)
K3 (50) → C (80)
K4 (85) → wrap → A (10)
Distribution:
A → K1, K4
B → K2
C → K3
Removing a Server
Remove Server B (40).
New ring:
0 ----10(A)---------80(C)----99
Which keys move?
Only keys that belonged to B.
That is:
K2
Reassign:
K2 (25) → next clockwise → C (80)
New distribution:
A → K1, K4
C → K2, K3
Only one key moved.
Adding a Server
Add Server D → 30
New ring:
0 ----10(A)----30(D)----80(C)----99
Which keys are affected?
Only keys between 10 and 30.
That is:
K2 (25)
K2 moves from C → D.
New distribution:
A → K1, K4
D → K2
C → K3
Again, only one key moved.
Why Consistent Hashing Is Better Than Traditional Hashing
Traditional Hashing ?
hash(key) % N
If N changes (e.g., 3 → 4 servers):
- Almost all keys get reassigned.
- Massive reshuffling occurs.
Consistent Hashing
When a server is:
- Removed → only its keys move.
- Added → only nearby keys move.
- Movement is proportional to 1 / number of servers, not the entire dataset.
Takeaway
Consistent hashing minimizes data movement by placing servers and keys on a circular hash space and assigning each key to the next server clockwise.
That design makes distributed systems scalable and stable.
Top comments (0)