DEV Community

Jaspreet singh
Jaspreet singh

Posted on

HLD Fundamentals #6 : Consistent Hashing: The Smart Way to Scale Distributed Systems

Consistent Hashing Explained: The Smart Way to Scale Distributed Systems

Most system design interviews eventually reach a point where the interviewer asks:

"What happens when a new server is added?"

At first, scaling sounds easy—just add another machine.

But in distributed systems, adding or removing servers can cause massive data movement, cache misses, and performance degradation if data distribution isn't handled correctly.

This is exactly the problem Consistent Hashing solves.

It is one of the most important concepts behind systems like Amazon DynamoDB, Cassandra, Redis Cluster, Memcached, Akamai CDN, and many large-scale distributed databases.

In this article, we'll understand:

  • Why Consistent Hashing exists
  • Problems with traditional hashing
  • How Consistent Hashing works
  • Virtual Nodes
  • Real-world use cases
  • Interview-focused explanations

Why Do We Need Consistent Hashing?

Imagine you have:

  • 3 cache servers
  • Millions of user sessions
  • Data distributed across servers

A simple approach would be:

Server = Hash(Key) % NumberOfServers
Enter fullscreen mode Exit fullscreen mode

Example:

Hash(User123) % 3 = Server 1
Hash(User456) % 3 = Server 2
Hash(User789) % 3 = Server 0
Enter fullscreen mode Exit fullscreen mode

This works perfectly...

Until your traffic increases.

Now you add a fourth server.

Server = Hash(Key) % 4
Enter fullscreen mode Exit fullscreen mode

Suddenly almost every key gets remapped.


The Big Problem

Before:

Hash(User123) % 3 = Server 1
Enter fullscreen mode Exit fullscreen mode

After adding Server 4:

Hash(User123) % 4 = Server 3
Enter fullscreen mode Exit fullscreen mode

The same user now points to a completely different server.

As a result:

  • Cache becomes useless
  • Data must be migrated
  • Massive cache misses occur
  • System performance drops

This process is called Rebalancing.


The Rebalancing Problem

Suppose you have:

100 Million Records
3 Servers
Enter fullscreen mode Exit fullscreen mode

You add a new server.

With modulo-based hashing:

Almost all records need redistribution
Enter fullscreen mode Exit fullscreen mode

That's extremely expensive.

For large-scale systems:

  • Data transfer takes time
  • Network bandwidth increases
  • Cache hit ratio drops
  • User latency increases

We need a better solution.


Enter Consistent Hashing

Consistent Hashing is a technique designed to minimize data movement when servers are added or removed.

Main Goal

Instead of moving all data:

Move only a small portion of data
Enter fullscreen mode Exit fullscreen mode

In a properly balanced system:

Only 1/N of keys are reassigned
Enter fullscreen mode Exit fullscreen mode

Where:

N = Number of Servers
Enter fullscreen mode Exit fullscreen mode

This makes scaling significantly cheaper.


Core Idea: The Hash Ring

Unlike normal hashing, Consistent Hashing places both:

  • Servers
  • Data Keys

on a virtual circular ring.


Step 1: Create a Ring

Imagine a hash space:

0 → 360°
Enter fullscreen mode Exit fullscreen mode

or

0 → 2^32 - 1
Enter fullscreen mode Exit fullscreen mode

Both servers and keys are hashed into this space.

Example:

Server A → 50
Server B → 150
Server C → 300
Enter fullscreen mode Exit fullscreen mode

Ring:

          0
       /     \
    300       50
       \     /
        150
Enter fullscreen mode Exit fullscreen mode

Step 2: Place Keys on the Ring

Suppose:

User1 → 70
User2 → 180
User3 → 320
Enter fullscreen mode Exit fullscreen mode

Now each key searches clockwise for the next available server.


Data Assignment Rule

A key belongs to:

The first server encountered while moving clockwise.

Example:

User1 → 70
Enter fullscreen mode Exit fullscreen mode

Clockwise:

70 → Server B(150)
Enter fullscreen mode Exit fullscreen mode

Assigned to:

Server B
Enter fullscreen mode Exit fullscreen mode

Another Example

User2 → 180
Enter fullscreen mode Exit fullscreen mode

Clockwise:

180 → Server C(300)
Enter fullscreen mode Exit fullscreen mode

Assigned to:

Server C
Enter fullscreen mode Exit fullscreen mode

Another Example

User3 → 320
Enter fullscreen mode Exit fullscreen mode

Clockwise:

320 → wrap around → Server A(50)
Enter fullscreen mode Exit fullscreen mode

Assigned to:

Server A
Enter fullscreen mode Exit fullscreen mode

This wrapping behavior is why it's called a ring.


What Happens When a New Server Is Added?

Suppose a new server appears:

Server D → 220
Enter fullscreen mode Exit fullscreen mode

Before:

Server B(150) → Server C(300)
Enter fullscreen mode Exit fullscreen mode

After:

Server B(150) → Server D(220) → Server C(300)
Enter fullscreen mode Exit fullscreen mode

Only keys between:

150 and 220
Enter fullscreen mode Exit fullscreen mode

move to Server D.

Everything else remains untouched.


Result

Instead of moving:

100% of data
Enter fullscreen mode Exit fullscreen mode

We move roughly:

1/N of data
Enter fullscreen mode Exit fullscreen mode

This is the biggest advantage of Consistent Hashing.


What Happens When a Server Fails?

Suppose:

Server B crashes
Enter fullscreen mode Exit fullscreen mode

Only the keys owned by Server B need reassignment.

Those keys move to the next clockwise server.

Example:

Server B → Down
Enter fullscreen mode Exit fullscreen mode

Keys automatically move to:

Server C
Enter fullscreen mode Exit fullscreen mode

The rest of the ring remains unchanged.

This makes Consistent Hashing naturally fault tolerant.


The Uneven Distribution Problem

There is still one issue.

Suppose servers are placed like:

Server A → 10
Server B → 15
Server C → 250
Enter fullscreen mode Exit fullscreen mode

Now Server C owns a huge portion of the ring.

Result:

Uneven Load Distribution
Enter fullscreen mode Exit fullscreen mode

Some servers become overloaded while others sit idle.


Virtual Nodes (VNodes)

To solve uneven distribution, we use Virtual Nodes.

Instead of placing a server once:

Server A → 50
Enter fullscreen mode Exit fullscreen mode

Place it multiple times:

Server A-1 → 50
Server A-2 → 120
Server A-3 → 280
Enter fullscreen mode Exit fullscreen mode

Similarly:

Server B-1
Server B-2
Server B-3
Enter fullscreen mode Exit fullscreen mode
Server C-1
Server C-2
Server C-3
Enter fullscreen mode Exit fullscreen mode

Now the ring becomes much more balanced.


Why Virtual Nodes Matter

Better Load Balancing

Requests spread more evenly.

Easier Scaling

Adding a new machine impacts smaller portions of the ring.

Better Fault Tolerance

Failure impact is distributed across the cluster.


Real-World Example: Redis Cluster

Imagine:

5 Redis Nodes
Enter fullscreen mode Exit fullscreen mode

User sessions are stored using Consistent Hashing.

When traffic grows:

Add Redis Node 6
Enter fullscreen mode Exit fullscreen mode

Without Consistent Hashing:

Almost entire cache gets reshuffled
Enter fullscreen mode Exit fullscreen mode

With Consistent Hashing:

Only a small subset of keys move
Enter fullscreen mode Exit fullscreen mode

Cache hit rate remains high.

System stays stable.


Real-World Example: Database Sharding

Suppose Amazon stores customer data across many database shards.

Customer IDs are hashed onto a ring.

When storage grows:

Add New Shard
Enter fullscreen mode Exit fullscreen mode

Only nearby data migrates.

No massive rebalancing operation is required.


Advantages vs Disadvantages

Advantages Disadvantages
Minimal data movement More complex than modulo hashing
Easy horizontal scaling Requires hash ring management
High cache hit ratio Virtual nodes add implementation complexity
Fault tolerant Debugging can be harder
Widely used in distributed systems Uneven distribution without VNodes

Interview Questions You Should Be Ready For

Why not use modulo hashing?

Because adding or removing a server changes the modulo value, causing almost all keys to be remapped.


What problem does Consistent Hashing solve?

It minimizes rebalancing when servers are added or removed.


How are keys assigned?

A key is assigned to the first server encountered in the clockwise direction on the hash ring.


What are Virtual Nodes?

Multiple logical representations of the same physical server placed on the ring to improve load distribution.


How much data moves when a server is added?

Approximately:

1/N of the total data
Enter fullscreen mode Exit fullscreen mode

instead of almost all data.


Interview Answer in 60 Seconds

Consistent Hashing is a data distribution technique used in distributed systems to minimize rebalancing when servers are added or removed. Instead of using Hash(Key) % N, both servers and keys are placed on a virtual hash ring. A key is assigned to the next server in the clockwise direction. When a server joins or leaves, only a small subset of keys is reassigned, typically around 1/N of the total data. To ensure even distribution, Virtual Nodes are used, where each physical server is represented multiple times on the ring. It is commonly used in Redis, Cassandra, DynamoDB, Memcached, and large-scale sharded databases.

TL;DR

  • Traditional hashing uses Hash(Key) % N
  • Adding/removing servers causes massive rebalancing
  • Consistent Hashing uses a virtual ring
  • Keys move clockwise to the nearest server
  • Only ~1/N of data moves during scaling
  • Virtual Nodes solve uneven distribution
  • Used in Redis, Cassandra, DynamoDB, Memcached, CDNs, and database sharding
  • One of the most frequently asked distributed systems interview topics

Scaling isn't just about adding servers.

The real challenge is adding servers without moving everything.

That's exactly why Consistent Hashing became a foundational building block of modern distributed systems.

Have you ever implemented Consistent Hashing in a real project, or only discussed it in interviews? Share your experience in the comments.

Top comments (0)