Consistent Hashing
What Is It?
Consistent Hashing is a distributed hashing scheme that operates independently of the number of servers in a cluster. It ensures that when servers are added or removed, only a minimal number of keys (roughly $K/N$, where $K$ is the number of keys and $N$ is the number of servers) are redistributed.
Understanding Hashing First 🔑
Before diving into consistent hashing, let's understand what hashing is at its core.
1. What is Hashing?
Hashing is the process of taking an input of any size (a word, a file, a video, or an entire database row) and passing it through a mathematical formula (a Hash Function) to produce a fixed-size string of characters or numbers, known as a Hash Value, Hash Code, or simply a Hash.
```
┌───────────────┐        ┌───────────────┐        ┌──────────────┐
│  Input Data   │ ─────► │ Hash Function │ ─────► │  Hash Value  │
│ (Any length)  │        └───────────────┘        │ (Fixed size) │
└───────────────┘                                 └──────────────┘
```
Real-World Analogy: The Digital Fingerprint 👤
Think of a hash as a digital fingerprint:
- Unique Identification: Two completely different files will have entirely different hashes, just as two different people have different fingerprints.
- One-Way Street: You cannot reconstruct a person from their fingerprint. Similarly, you cannot reconstruct the original input from its hash.
- Fixed Size: Whether you hash the single letter "a" or the entire Wikipedia database, the resulting hash is always a compact, fixed-size code.
2. Key Properties of a Good Hash Function
To be useful in software engineering and system design, a hash function must satisfy several key properties:
| Property | What it Means | Why it Matters |
|---|---|---|
| Deterministic | The same input always yields the same hash output. | If hash("alice") was 9823 today and 4102 tomorrow, we could never locate Alice's data. |
| Fast Computation | Computing the hash for any given input is highly efficient ($O(1)$). | Systems perform millions of lookups per second; hashing cannot be a bottleneck. |
| Pre-image Resistance | Given a hash, it is practically impossible to guess the original input. | Critical for security (e.g., storing passwords). |
| Collision Resistant | Two different inputs should almost never produce the same hash. | Prevents different items from overwriting each other in memory (Hash Collisions). |
| Avalanche Effect | A tiny change in the input completely and unpredictably changes the output hash. | Changing "password123" to "Password123" should produce completely unrelated hashes. |
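You can watch determinism and the avalanche effect in action with a few lines of Python — a minimal sketch using the standard library's `hashlib`:

```python
import hashlib

def sha256_hex(text: str) -> str:
    """Hash a UTF-8 string with SHA-256 and return the hex digest."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

# Deterministic: the same input always yields the same hash.
assert sha256_hex("alice") == sha256_hex("alice")

# Avalanche effect: flipping one character produces an unrelated digest.
print(sha256_hex("password123"))
print(sha256_hex("Password123"))  # shares no visible pattern with the above
```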
3. Hashing in Action: Everyday Examples
We use basic hashing in three main areas of software engineering:
A. Data Integrity (Checksums)
- Scenario: You download a 5GB installer. How do you know the file wasn't corrupted or tampered with during the download?
- Solution: The host publishes an MD5 or SHA-256 checksum (hash). Once downloaded, you run your file through the same hash function. If your hash matches the published checksum, the file arrived intact.
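A minimal sketch of that verification step, streaming the file so a 5GB download never has to fit in memory (the filename and expected checksum below are placeholders):

```python
import hashlib

def file_checksum(path: str, algorithm: str = "sha256") -> str:
    """Stream a file through a hash function and return its hex checksum."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        # Read 1MB chunks so even huge files use constant memory.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

# Compare against the checksum published by the host (placeholder values):
# assert file_checksum("installer.iso") == "3a7bd3e239ee2f28..."
```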
B. Security (Password Hashing)
- Scenario: Databases should never store plain-text passwords. If a hacker breaches the database, they get everything.
- Solution: When you sign up, the system hashes your password (using slow, secure algorithms like bcrypt or Argon2) and stores only the hash. When you log in, the system hashes your entered password and compares it to the stored hash.
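In Python, bcrypt and Argon2 live in third-party packages, so here is the same store-only-the-hash flow sketched with the standard library's PBKDF2 (the iteration count is an illustrative choice, not a recommendation):

```python
import hashlib
import hmac
import os

ITERATIONS = 600_000  # deliberately slow; tune for your hardware

def hash_password(password: str) -> tuple[bytes, bytes]:
    """Derive a slow, salted hash; store (salt, digest), never the password."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, digest

def verify_password(password: str, salt: bytes, digest: bytes) -> bool:
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return hmac.compare_digest(candidate, digest)  # constant-time comparison

salt, digest = hash_password("hunter2")
assert verify_password("hunter2", salt, digest)
assert not verify_password("hunter3", salt, digest)
```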
C. In-Memory Search (Hash Maps / Hash Tables)
- Scenario: You have a dictionary of 1,000,000 words. How do you look up a definition in $O(1)$ time?
- Solution:
  - Store definitions in an array.
  - To store "apple", compute `index = hash("apple") % array_length` and store the definition at `array[index]`.
  - To look up "apple", compute the same index and jump directly to it in memory.
Wait... Is the index guaranteed to be unique? (The Truth About Collisions 💥)
No! It is mathematically impossible for indices to always be unique due to the Pigeonhole Principle (e.g., if you have 10 slots and 11 keys, at least one slot must be shared). When two different keys map to the exact same index, it is called a Hash Collision.
How are Collisions Handled and Stored in Memory? (The Node Chain ⛓️)
Under the hood, instead of storing the raw value directly in the array slot, the array contains pointers/references to Node objects. Each Node is a small package containing:
- `hash_value` (the integer hash code, saved to prevent recomputation)
- `key` (e.g., "apple")
- `value` (e.g., "red fruit")
- `next` (pointer to the next Node in the chain, or `null`)
When "apple" and "banana" both hash to index 3, the system creates a linked chain:
```
array:
[0] ──► null
[1] ──► null
[2] ──► null
[3] ──► [ hash: 103 | key: "apple" | value: "red fruit" | next: ────┐ ]
[4] ──► null                                                        │
                                                   (linked chain)   ▼
      [ hash: 243 | key: "banana" | value: "yellow fruit" | next: null ]
```
How Lookup Works with Collisions:
When you perform `map.get("banana")`:

- Direct Jump: The map calculates `index = hash("banana") % 10` $\to$ `3`, jumping instantly to `array[3]`.
- Sequential Search: It scans the chain at index 3, checking whether each node's `key == "banana"`. The first node is "apple" (no match), so it follows the `next` pointer to the next node: "banana" (match!), and returns "yellow fruit".
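Here is a minimal, runnable sketch of that chained structure (Python's built-in `hash()` stands in for the hash function; it is stable within a single process, which is all a lookup needs):

```python
from typing import Any, Optional

class Node:
    """One entry in a bucket's chain, mirroring the diagram above."""
    def __init__(self, hash_value: int, key: str, value: Any):
        self.hash_value = hash_value        # cached to avoid recomputation
        self.key = key
        self.value = value
        self.next: Optional["Node"] = None  # next node in the chain

class ChainedHashMap:
    def __init__(self, capacity: int = 10):
        self.buckets: list[Optional[Node]] = [None] * capacity

    def _index(self, key: str) -> int:
        return hash(key) % len(self.buckets)

    def put(self, key: str, value: Any) -> None:
        i = self._index(key)
        node = self.buckets[i]
        while node:                            # key already present? update it
            if node.key == key:
                node.value = value
                return
            node = node.next
        head = Node(hash(key), key, value)
        head.next = self.buckets[i]            # prepend to the chain
        self.buckets[i] = head

    def get(self, key: str) -> Any:
        node = self.buckets[self._index(key)]  # direct jump to the bucket
        while node:                            # sequential search of the chain
            if node.key == key:
                return node.value
            node = node.next
        raise KeyError(key)

m = ChainedHashMap()
m.put("apple", "red fruit")
m.put("banana", "yellow fruit")
print(m.get("banana"))  # "yellow fruit", even if both keys share a bucket
```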
4. How Long is a Hash Value?
The length of a hash value is completely independent of the input size and is strictly determined by the specific algorithm you choose. Depending on the use case, hashes are typically represented as either integers or hexadecimal strings (0-9, a-f).
| Use Case | Common Algorithm | Hash Bit Size | Output Representation | Real-World Example |
|---|---|---|---|---|
| System Design / Routing (ultra-fast, non-cryptographic) | MurmurHash3 | 32-bit or 128-bit | Integer: up to $2^{32}-1$ (approx. 4.2B); Hex: 32-character string | Used by Apache Cassandra and DynamoDB to distribute partition keys. |
| Data Integrity Check (fast, simple checksums) | MD5 | 128-bit | Hex: 32-character string (e.g., `1f3870be274f6c49b3e31a0c6728957f`) | File download checksums. |
| Security & Cryptography (secure, slower, hard to crack) | SHA-256 | 256-bit | Hex: 64-character string (e.g., `3a7bd3e239ee2f28...`) | Used in SSL/TLS certificates and Bitcoin; Git is migrating its commit hashes from SHA-1 to SHA-256. |
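You can confirm the fixed output sizes directly (MurmurHash3 is not in the standard library; the third-party `mmh3` package provides it):

```python
import hashlib

for data in (b"a", b"the entire Wikipedia database..."):
    print(len(hashlib.md5(data).hexdigest()),     # 32 hex chars = 128 bits
          len(hashlib.sha256(data).hexdigest()))  # 64 hex chars = 256 bits
```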
5. Learning Tip: Treat Hashing as a "Black Box" in System Design 📦
When studying system design or preparing for interviews, you should treat the mathematical internals of hash functions as a complete black box.
> [!NOTE]
> Interviewers do not care about polynomial divisions, bit-shifting logic, or prime-number multiplication sequences.
Instead, you only need to understand the properties and trade-offs of the box:
- Non-Cryptographic vs. Cryptographic: Use cryptographic hashes (like SHA-256) when security is paramount (slower, resource-heavy). Use non-cryptographic hashes (like MurmurHash) when raw speed and throughput are needed (routing, database indexing, load balancing).
- Uniformity: You must assume the black box distributes inputs perfectly uniformly across the output space. This is key because uniform distribution guarantees that your servers get equal traffic and you don't end up with overloaded "hotspots."
- Determinism: Given input $X$, the output is always $Y$. If it weren't, you would store a user's data on Server A today and never be able to locate it tomorrow.
6. Transitioning to System Design
In a single-machine application, a Hash Map works beautifully because the array of buckets resides in a single system's memory.
But what happens when your data is too big for one machine, and you need to distribute it across multiple servers? This is where distributed hashing and Consistent Hashing come into play!
Why Do You Need It?
The Traditional Modulo Hashing Problem 🌀
In standard load balancing or database sharding, we use a simple modulo-based hashing algorithm to distribute keys across $N$ servers:
$$\text{Server ID} = \text{hash}(\text{key}) \pmod N$$
If you have 5 servers ($N=5$):
- `hash("user_A") = 100` $\to$ `100 % 5 = 0` (Server 0)
- `hash("user_B") = 101` $\to$ `101 % 5 = 1` (Server 1)

If you add a 6th server ($N=6$):

- `hash("user_A") = 100` $\to$ `100 % 6 = 4` (was Server 0, now Server 4 ❌)
- `hash("user_B") = 101` $\to$ `101 % 6 = 5` (was Server 1, now Server 5 ❌)
This change causes a Rehash Storm:
- In Cache Clusters (Redis/Memcached): Nearly $80\% - 90\%$ of keys suddenly map to different servers, causing a massive cache miss spike that floods the primary database (Cache Stampede).
- In Database Shards: Moving nearly all data to new servers requires intensive network bandwidth and cluster-wide locks.
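A quick sketch that measures the storm (MD5 stands in for a stable, uniform hash; the key names are arbitrary):

```python
import hashlib

def bucket(key: str, num_servers: int) -> int:
    """hash(key) mod N, using MD5's integer value as a stable hash."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % num_servers

keys = [f"user_{i}" for i in range(100_000)]
moved = sum(bucket(k, 5) != bucket(k, 6) for k in keys)
print(f"{moved / len(keys):.1%} of keys changed servers")  # ~83%
```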
The Consistent Hashing Algorithm
Consistent hashing solves the rehash storm by mapping both Servers (Nodes) and Keys onto a circular virtual ring called the Hash Ring.
```
            0 / 2^32 - 1
             .  -  ~  -  .
          '               '  ◄─── Server A (at 100,000)
        /                   \
       /       Key 1         \
      |   (hash: 150,000)     |  ───► maps clockwise to Server B
      |                       |
       \                     /
        \                   /
          .               .  ◄─── Server B (at 2,000,000)
            ' - _ _ _ - '
```
1. The Hash Ring
The hash space is represented as a circular ring from $0$ to $2^{32} - 1$ (the maximum value of a 32-bit integer).
2. Placing Servers on the Ring
Each physical server is hashed using its IP address or Hostname and placed on a specific coordinate on the ring:
$$\text{Server Position} = \text{hash}(\text{"192.168.1.100"}) \pmod{2^{32}}$$
3. Placing Keys on the Ring
Incoming keys are hashed using the same function and positioned on the same ring space:
$$\text{Key Position} = \text{hash}(\text{"user-123"}) \pmod{2^{32}}$$
4. Routing Keys to Servers (Clockwise Lookup)
To locate the server for a specific key:
- Locate the key's coordinate on the ring.
- Traverse the ring clockwise until you encounter the first server node.
- Assign the key to that server.
- If a key's hash is larger than the highest server hash on the ring, it wraps around back to $0$ and lands on the first server.
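Putting steps 1-4 together, here is a minimal ring sketch: MD5 (mod $2^{32}$) stands in for the hash function, each server gets a single position, and lookup is a binary search that wraps around the ring:

```python
import bisect
import hashlib

def ring_hash(key: str) -> int:
    """Map any string onto the 0 .. 2^32 - 1 ring."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % (2**32)

class ConsistentHashRing:
    def __init__(self, servers: list[str]):
        self._keys: list[int] = []   # sorted server positions on the ring
        self._nodes: list[str] = []  # server owning each position
        for server in servers:
            self.add(server)

    def add(self, server: str) -> None:
        pos = ring_hash(server)
        i = bisect.bisect_left(self._keys, pos)
        self._keys.insert(i, pos)
        self._nodes.insert(i, server)

    def remove(self, server: str) -> None:
        i = self._keys.index(ring_hash(server))
        del self._keys[i]
        del self._nodes[i]

    def lookup(self, key: str) -> str:
        # Clockwise lookup: first server position >= the key's position,
        # wrapping back to index 0 past the top of the ring.
        i = bisect.bisect_left(self._keys, ring_hash(key)) % len(self._keys)
        return self._nodes[i]

ring = ConsistentHashRing(["192.168.1.100", "192.168.1.101", "192.168.1.102"])
print(ring.lookup("user-123"))  # same key always routes to the same server
```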
How Scaling Works
Adding a Server (Scale Up)
Suppose we add a new server, Server D, between Server A and Server B.
- Affected Range: Only the keys whose hashes fall in the range between Server A and Server D are shifted.
- Redistribution: These keys, which were previously owned by Server B, are now assigned to Server D.
- Result: Servers A and C are completely untouched. Only a tiny fraction ($\approx 1/N$) of total keys are migrated.
Removing a Server (Scale Down / Failure)
Suppose Server B crashes.
- Affected Range: Only keys that were assigned to Server B (hashes falling between Server A and Server B) are affected.
- Redistribution: These keys fall through clockwise and land on Server C.
- Result: No other keys in the cluster are affected or moved.
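Reusing the `ConsistentHashRing` sketch above, you can verify that scaling only touches a small slice of keys (server names are placeholders; the exact fraction depends on where the new server's hash happens to land):

```python
keys = [f"user_{i}" for i in range(100_000)]
ring = ConsistentHashRing([f"10.0.0.{i}" for i in range(5)])
before = {k: ring.lookup(k) for k in keys}

ring.add("10.0.0.5")  # scale up by one server
moved = sum(before[k] != ring.lookup(k) for k in keys)
print(f"{moved / len(keys):.1%} of keys moved")  # a sliver, not the ~83% storm
```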
The Hotspot Problem & Virtual Nodes (VNodes)
The Problem: Non-Uniform Distribution
If servers are placed randomly on the ring, they might land very close to one another.
For example, if Server A is at $10,000$, Server B at $20,000$, and Server C at $3,000,000$:

- Server A inherits the huge wrap-around arc from $3,000,000$ all the way back to $10,000$ (over $99\%$ of the $2^{32}$ ring), while Servers B and C own tiny slivers.
- This creates unbalanced segments and hotspot servers.
The Solution: Virtual Nodes (VNodes)
Instead of mapping a physical server to a single coordinate, we map it to multiple virtual nodes, or vnodes (e.g., 100 to 200 per physical server), scattered across the ring:
- Physical Server A $\to$ `A#1`, `A#2`, ..., `A#200`
- Physical Server B $\to$ `B#1`, `B#2`, ..., `B#200`
```
---[ Ring Interleaving Layout ]---
(0) ... A#1 ... B#2 ... C#1 ... A#2 ... B#1 ... C#2 ... (2^32-1)
```
Why VNodes are awesome:
- Balanced Distribution: The larger number of points interleaves uniformly across the ring, keeping partition sizes equal.
- Graceful Failover: If Physical Server A fails, its virtual nodes (`A#1`, `A#2`, etc.) disappear. Because they were scattered, their keys are distributed uniformly to the virtual nodes of multiple other servers, instead of dumping all the load onto a single next-door physical neighbor.
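Extending the ring sketch with vnodes takes one extra loop: hash `"{server}#{v}"` for each replica (the 150-vnode default and the server names are illustrative):

```python
import bisect
import hashlib
from collections import Counter

def ring_hash(key: str) -> int:
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % (2**32)

class VNodeRing:
    """Consistent hash ring with V virtual nodes per physical server."""
    def __init__(self, servers: list[str], vnodes: int = 150):
        self._keys: list[int] = []   # sorted vnode positions
        self._nodes: list[str] = []  # physical server behind each vnode
        for server in servers:
            for v in range(vnodes):
                pos = ring_hash(f"{server}#{v}")  # e.g. "A#0", "A#1", ...
                i = bisect.bisect_left(self._keys, pos)
                self._keys.insert(i, pos)
                self._nodes.insert(i, server)

    def lookup(self, key: str) -> str:
        i = bisect.bisect_left(self._keys, ring_hash(key)) % len(self._keys)
        return self._nodes[i]

ring = VNodeRing(["A", "B", "C"])
load = Counter(ring.lookup(f"user_{i}") for i in range(90_000))
print(load)  # roughly 30,000 keys per server, thanks to interleaved vnodes
```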
🎯 Core Architectural Insights
These two direct points perfectly summarize when and why we choose one hashing scheme over another:
1. Static Clusters ──► Standard Modulo Hashing Wins 🏆
If the number of servers in your cluster is completely static and guaranteed to never change (no scaling, and failures are handled without cluster re-sharding), standard modulo hashing is superior to consistent hashing.
- Why: It distributes the load perfectly uniformly, requires absolute minimum CPU overhead ($O(1)$ lookup), and does not require storing or searching a hash ring in memory.
2. Dynamic Clusters ──► Consistent Hashing + VNodes Wins 🌀
If your cluster size changes frequently (dynamic auto-scaling or frequent node crashes):
- Classic Consistent Hashing fails: Placing each server at only one random point leaves massive uneven gaps on the ring, creating heavily overloaded "hotspot" servers.
- Virtual Nodes (VNodes) are mandatory: By mapping each physical machine to $100$ to $200$ virtual positions, we interleave the ring boundaries, guaranteeing nearly perfect load distribution and smooth, uniform load redistribution during scaling/failure events.
🏢 Real-World Use Cases
- Amazon DynamoDB & Apache Cassandra: Distribute records across database nodes.
- Discord Gateways: Direct WebSocket connections and chat channels to separate server nodes dynamically.
- Akamai CDN: Distribute web assets across global edge-caching networks uniformly to prevent overloading any edge server.
💡 Interview Tips
- Don't jump to Consistent Hashing unless scale modifications are a primary constraint. If the cluster size never changes, standard hashing is faster and simpler.
- Mention Virtual Nodes immediately when discussing Consistent Hashing. It shows you understand real-world systems rather than just the theoretical baseline.
- Analyze Complexity:
  - Lookup Time: $O(\log(N \times V))$, where $N$ is the number of servers and $V$ is the number of vnodes per server (a binary search over the sorted ring positions).
  - Space Complexity: $O(N \times V)$ to store the ring in memory.
🌟 Summary: The Soul of Consistent Hashing
If you need to summarize the entire essence of Consistent Hashing in one single sentence:
"Consistent Hashing was created to achieve an even load distribution and an easy way to scale clusters dynamically without disrupting existing servers, ensuring that the same user load consistently routes to the same server."
It accomplishes this via three key architectural pillars:
- Even Load (The Balance Pillar): Uses Virtual Nodes (VNodes) to statistically guarantee that all servers get an equal share of traffic.
- Easy Scaling (The Elasticity Pillar): Guarantees that adding or removing a server only affects a minimal fraction ($\approx 1/N$) of keys, completely preventing cluster-wide "Rehash Storms."
- Consistent Mapping (The Determinism Pillar): Ensures that a request for the same key always routes to the exact same server, keeping cache hit rates high and data lookups instant.
YouTube Resources
- Gaurav Sen: Consistent Hashing — Essential concept walkthrough.
- ByteByteGo: Consistent Hashing Explained — Superb animations on the hash ring mechanics.