Vincent Tommi

Consistent Hashing Explained: Scaling Distributed Systems Without Chaos (Day 46 of System Design Basics)

In a distributed system where servers are frequently added or removed, efficiently routing requests becomes a major challenge.

A common approach is to hash the request and assign it to a server using:

Hash(key) mod N

where N is the number of servers.

The problem? This method is tightly coupled to N. Any change (adding or removing a server) triggers massive rehashing, redistributing most of the requests — not great for scalability or stability.

Popularized by Amazon’s Dynamo paper, consistent hashing is now a fundamental technique used in distributed databases like DynamoDB, Cassandra, and ScyllaDB.

In this article, we’ll explore:

  • Why traditional hashing fails at scale

  • How consistent hashing works

  • The role of virtual nodes

  • A Python implementation

1. The Problem with Traditional Hashing

Imagine you’re running a web app serving millions of users. You distribute traffic across 5 servers (S0–S4) using Hash(key) mod 5.

**Everything Works Fine… Until You Scale**
Scenario 1: Adding a New Server (S5)

Now you have 6 servers. Hashing changes from mod 5 → mod 6.
Result: most users are reassigned to different servers.

Scenario 2: Removing a Server (S4)

Servers drop from 5 → 4. Hashing changes from mod 5 → mod 4.
Again, most users get reassigned.
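To make the damage concrete, here’s a minimal sketch (the key names and counts are illustrative assumptions, not from the original post) that counts how many keys land on a different server when N changes from 5 to 6:

import hashlib

def server_for(key, n):
    # Stable hash (Python's built-in hash() is salted per process),
    # reduced modulo the number of servers.
    h = int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)
    return h % n

keys = [f"user-{i}" for i in range(10_000)]

moved = sum(1 for k in keys if server_for(k, 5) != server_for(k, 6))
print(f"{moved / len(keys):.0%} of keys moved")  # typically ~83% (5 out of 6)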

This causes:

  • Session loss (users logged out)

  • Cache invalidation (wasted memory, extra DB load)

  • Performance degradation

Clearly, we need something better.

2. The Solution: Consistent Hashing

Consistent hashing solves this problem by minimizing key movement when nodes are added or removed.

Instead of mapping keys with Hash(key) mod N, consistent hashing uses a circular hash space (a hash ring):

1. Both servers and keys are hashed onto the same ring.

2. A key is assigned to the next server clockwise from its position on the ring.

This means that when a server is added or removed, only the keys in one arc of the ring move: on average roughly 1/N of them, rather than nearly all of them as with mod-N hashing.


Here’s a diagram of a consistent hashing ring showing how keys map to servers.

2.1 Constructing the Hash Ring

1. Define the hash space: a large, fixed range, e.g., 0 → 2^32 - 1.

2. Place servers on the ring: Hash(server_id) decides each server’s position.

3. Place keys on the ring: Hash(key) gives each request/data item a position.

4. Assignment: move clockwise from the key’s position until you hit the next server (see the sketch below).
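Here’s a minimal sketch of those four steps, assuming a toy 32-bit hash space and MD5 as the hash function (both illustrative choices; a fuller implementation follows in section 4):

import hashlib
import bisect

def ring_hash(s):
    # Step 1: positions live in a fixed 32-bit hash space.
    return int(hashlib.md5(s.encode("utf-8")).hexdigest(), 16) % (2**32)

# Step 2: place servers on the ring, sorted by position.
servers = ["S0", "S1", "S2", "S3", "S4"]
ring = sorted((ring_hash(s), s) for s in servers)
positions = [pos for pos, _ in ring]

def lookup(key):
    # Steps 3-4: hash the key, then walk clockwise to the next server,
    # wrapping around to the start of the ring if necessary.
    idx = bisect.bisect(positions, ring_hash(key)) % len(ring)
    return ring[idx][1]

print(lookup("user-42"))  # which server this prints depends on the hash values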

2.2 Adding a New Server

When a new server S5 joins:

It’s placed on the ring.

It takes over only the keys that fall between its predecessor’s position and its own.

Only those keys move; everything else stays the same.

Here’s a before-and-after diagram showing how consistent hashing works when a new server (S5) is added.

2.3 Removing a Server

When S4 fails:

Its keys are reassigned to the next server clockwise (S3).

Only those keys move — minimal disruption.

3. Virtual Nodes (VNodes)

Basic consistent hashing can still cause uneven distribution when:

  • Few servers exist

  • Nodes cluster too closely on the ring

  • A node failure shifts too much load to one successor

Solution: Virtual Nodes.

Instead of one position per server, each server is mapped multiple times (e.g., S1-1, S1-2, S1-3), spreading load evenly.

If S1 fails, its load is shared across many nodes, preventing hot spots.
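A quick way to see the effect is to count how many keys land on each server with one virtual node per server versus many. This sketch reuses the toy ring from section 2.1 (the vnode counts of 1 and 100 are illustrative assumptions):

import hashlib
import bisect
from collections import Counter

def ring_hash(s):
    return int(hashlib.md5(s.encode("utf-8")).hexdigest(), 16) % (2**32)

def distribution(servers, vnodes, num_keys=100_000):
    # Each server appears `vnodes` times on the ring, e.g. "S1-0", "S1-1", ...
    ring = sorted((ring_hash(f"{s}-{i}"), s) for s in servers for i in range(vnodes))
    positions = [pos for pos, _ in ring]
    counts = Counter()
    for i in range(num_keys):
        idx = bisect.bisect(positions, ring_hash(f"user-{i}")) % len(ring)
        counts[ring[idx][1]] += 1
    return counts

servers = ["S0", "S1", "S2", "S3", "S4"]
print(distribution(servers, vnodes=1))    # often badly skewed
print(distribution(servers, vnodes=100))  # much closer to an even 20% each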

4. Python Implementation

Here’s a simplified version of consistent hashing with virtual nodes:

import hashlib
import bisect

class ConsistentHashing:
    def __init__(self, servers=None, num_replicas=3):
        # num_replicas is the number of virtual nodes placed per server.
        self.num_replicas = num_replicas
        self.ring = {}          # hash position -> server
        self.sorted_keys = []   # sorted hash positions (the ring)
        self.servers = set()
        if servers:
            for server in servers:
                self.add_server(server)

    def _hash(self, key):
        # MD5 gives a stable, well-distributed integer position on the ring.
        return int(hashlib.md5(key.encode('utf-8')).hexdigest(), 16)

    def add_server(self, server):
        # Place num_replicas virtual nodes (e.g. "S1-0", "S1-1", ...) on the ring.
        self.servers.add(server)
        for i in range(self.num_replicas):
            key = self._hash(f"{server}-{i}")
            self.ring[key] = server
            bisect.insort(self.sorted_keys, key)

    def remove_server(self, server):
        # Remove all of this server's virtual nodes from the ring.
        self.servers.discard(server)
        for i in range(self.num_replicas):
            key = self._hash(f"{server}-{i}")
            if key in self.ring:
                del self.ring[key]
                self.sorted_keys.remove(key)

    def get_server(self, key):
        if not self.ring:
            return None
        key_hash = self._hash(key)
        # First virtual node clockwise from the key's position,
        # wrapping around to the start of the ring if needed.
        idx = bisect.bisect(self.sorted_keys, key_hash) % len(self.sorted_keys)
        return self.ring[self.sorted_keys[idx]]
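A quick usage sketch of this class (the server names, key names, and num_replicas value are illustrative) shows the property the whole article is about: removing a server moves only the keys that lived on it, and every other key keeps resolving to the same server:

ch = ConsistentHashing(servers=["S0", "S1", "S2", "S3", "S4"], num_replicas=100)

before = {f"user-{i}": ch.get_server(f"user-{i}") for i in range(10_000)}

ch.remove_server("S4")  # simulate a failure

moved = sum(1 for k, s in before.items() if ch.get_server(k) != s)
print(f"{moved / len(before):.0%} of keys moved")  # roughly 20% (about 1 in 5)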

Final Thoughts

Consistent hashing is one of those beautifully simple but powerful ideas that make distributed systems reliable and scalable.

It powers many of today’s biggest databases and caching systems, ensuring smooth operation even when servers are frequently added or removed.

If you’re building systems that need to scale dynamically, consistent hashing should absolutely be in your toolkit.
