In this article, we'll walk through the technical journey of scaling a viral app: the challenges of explosive growth and a range of approaches to scaling your infrastructure and managing traffic effectively.
By the end of this article, you’ll understand how to solve the complex problem of data distribution and server scaling when dealing with massive, unpredictable traffic. Let’s dive in!
The Calm Before the Chaos
Picture this: after months of relentless effort, your app, a platform for sharing short, funny videos, has finally launched. It’s your dream come true.
The first six months are steady. A few thousand downloads trickle in, users love the app, and everything runs smoothly. The server hums along like a well-oiled machine. Life is good.
Then, the unimaginable happens. A very influential and powerful celebrity, Melon Musk, stumbles upon your app, shares a video, and in a matter of hours, your app becomes the hottest thing on the internet.
Downloads skyrocket. Millions of users sign up overnight, uploading videos, liking, and commenting at a pace you never anticipated. You are on cloud nine. Your servers, though, are not celebrating—they're struggling, then failing. Boom. The app is down.
You’re no longer living the dream. You’re in a nightmare.
The First Fix: Peace Before the Storm
Desperate to bring your app back online, you gather your team. After some digging, the issue is clear: your database server simply cannot handle the flood of requests.
What is the first and fastest solution? Make the server bigger. You decide to scale your server vertically.
You upgrade everything: CPU, memory, storage. Now, the server is more powerful and faster. After a night of frantic work, the app is back. The users are happy. Problem solved. Life is peaceful again? Not quite.
Two days later, traffic doubles again. The server, despite its upgrades, crashes under the relentless load. You realize you’ve hit the limits of vertical scaling.
Why Vertical Scaling Fails
- Hardware Limits: No matter how powerful, a single server has a ceiling.
- Cost: Upgrading to high-end hardware is prohibitively expensive.
- Single Point of Failure: One server means one failure point. If it goes down, your entire app goes offline.
You need a more robust solution.
A Bold Step Forward: Dividing the Load
It’s time to think differently. Your team suggests horizontal scaling: instead of relying on a single powerful server, distribute the load across multiple smaller servers.
Each server will handle a portion of the data, reducing the burden on any one machine.
You start thinking about how to implement this, and for a while, it seems like a breakthrough.
New Problems, New Mysteries
Horizontal scaling is a promising solution, but it introduces new challenges as traffic grows. Now that there are multiple server instances, which server holds a given piece of data? Which one stores a particular user's data?
Imagine a user uploads a video. Which server should store it? Later, when someone wants to watch it, how will your system know where to look?
Your team tries a simple approach.
The Brute Force Method
You decide to use a brute-force approach and scale your servers horizontally. Within hours, however, complaints arrive that the app is very slow. This time you know the cause: on every read, each server is searched in turn until the data is found. The approach is simple, but it doesn't scale. As the number of servers grows, the time required to locate the data becomes unbearably long.
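The brute-force lookup can be sketched as follows. This is only an illustration under assumptions: the three dictionaries stand in for real database servers, and the keys and file names are hypothetical.

```python
from typing import Optional

# Hypothetical in-memory stand-ins for three database servers.
servers = [
    {"video:1": "cat.mp4"},
    {"video:2": "dog.mp4"},
    {"video:3": "owl.mp4"},
]

def brute_force_find(key: str) -> tuple[Optional[str], int]:
    """Return (value, servers queried) by checking every server in turn."""
    for queried, server in enumerate(servers, start=1):
        if key in server:
            return server[key], queried
    # Not found anywhere: every server was queried for nothing.
    return None, len(servers)

print(brute_force_find("video:3"))  # ('owl.mp4', 3): all three servers were queried
```

The number of servers queried grows linearly with the size of the fleet, which is why reads slow down as servers are added.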
Clearly, brute force isn’t the answer. You need a smarter way to decide where data belongs.
Hashing: A Glimmer of Hope
You need a way to store data across multiple servers and still find it without searching every server.
For that, you need to know which server to write the data to and which server to read it from later.
Your team proposes using a mathematical function to organize the data: a simple function that takes a key and outputs the number of the server where the data should be stored.
You come up with a basic one: the modulus.
How it works:
Suppose you have three servers. A key (such as a userId or videoId) is first turned into a number, and that number modulo the server count decides which server stores the data.
Example:
You have three servers: Server0, Server1, and Server2.
A video ID is mapped to a server, e.g., hash(5) = Server2. To decide the server, use 5 % 3 = 2, so the data is stored on Server2.
The table below depicts which data is mapped to which server.
When you need the data, get the server by using the key and modulus function again and go directly to the correct server. No more searching through every server!
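A minimal sketch of modulus-based placement, assuming numeric video IDs. The function name `pick_server` is made up for illustration; the server names come from the example above.

```python
SERVERS = ["Server0", "Server1", "Server2"]

def pick_server(video_id: int, servers: list[str] = SERVERS) -> str:
    """Choose a server by taking the numeric ID modulo the server count."""
    return servers[video_id % len(servers)]

# The same function is used for writes and reads, so no search is needed.
for video_id in (3, 4, 5):
    print(video_id, "->", pick_server(video_id))
# 3 -> Server0
# 4 -> Server1
# 5 -> Server2
```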
This looks promising, so you roll it out to production. After so many emotional roller coasters, you are braced for something to go wrong. This time, however, everything works perfectly.
A month passes and, fortunately, everything keeps running smoothly. Your app continues to grow steadily, with new users joining at a consistent rate. It's also gaining more popularity, and the average time users spend on the app is steadily increasing. Things are looking promising, and the future holds even more potential.
Is this the calm before the storm?
The Growth Dilemma: More Servers, More Problems
As your app continues to grow, you realize you need to add another server. But this brings an unexpected problem.
What Happens When You Add a Server?
The hash function depends on the number of servers. Adding a new server changes that number, which means the result changes for most keys. Data previously stored on Server2 might now belong on Server0.
Example:
Previously, you had three servers: Server0, Server1, and Server2.
A video ID is hashed to pick a server, e.g., hash(5) = Server2.
To decide the server, use 5 % 3 = 2. The data is stored on Server2.
Now you have four servers: Server0, Server1, Server2, and Server3.
The video with ID 5 was previously mapped to 5 % 3 = 2.
But now that same video is mapped to 5 % 4 = 1.
That means the data for video ID 5 needs to move to Server1.
Adding a server leaves data on the wrong servers. In our case, the large majority of the data is now misplaced (about 75% for evenly spread IDs when going from three to four servers), so we have to move it to the correct server. This process is chaotic, slow, and prone to errors.
Every time a server is added or removed, we have to move data to its new server, which is time-consuming and risks data loss.
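To get a feel for how much data is affected, the sketch below counts the keys whose modulus result changes when the server count goes from three to four. It assumes evenly spread integer IDs, so a particular data set can land somewhat higher or lower. The helper name `moved_fraction` is hypothetical.

```python
def moved_fraction(keys: range, old_count: int, new_count: int) -> float:
    """Fraction of keys whose modulo-assigned server changes."""
    moved = sum(1 for k in keys if k % old_count != k % new_count)
    return moved / len(keys)

# Video ID 5 from the example: Server2 before, Server1 after.
print(5 % 3, 5 % 4)  # 2 1

# With IDs 0..1199, three out of every four keys change servers.
print(moved_fraction(range(1200), 3, 4))  # 0.75
```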
So we need a better approach, one whose output stays consistent regardless of the number of servers. We need to replace the modulus function with something more stable, and data movement should be kept to a minimum.
Consistent Hashing: The Turning Point
Now, you need a function that maps an input to an output and consistently produces the same output for the same input. Does that sound familiar? Yes, you’re right—we need a hash function, such as MD5 or SHA.
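As a rough sketch, a key can be turned into a stable number with Python's `hashlib`. SHA-256 is used here as one member of the SHA family. `RING_SIZE` and `ring_position` are illustrative names, and 1024 is an assumption chosen to match the 0 to 1023 range used in this article.

```python
import hashlib

RING_SIZE = 1024  # assumed size of the position space (0 to 1023)

def ring_position(key: str) -> int:
    """Map a key to a stable position in [0, RING_SIZE) using SHA-256."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % RING_SIZE

# The result depends only on the key, never on how many servers exist.
print(ring_position("video:5") == ring_position("video:5"))  # True
```

Python's built-in `hash()` is not a good fit here, because string hashes are randomized per process by default.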
So, the problem of changing outputs with changes in the number of servers is solved. But now, a new question arises: how do we manage the addition or removal of nodes while ensuring minimal data movement?
Your team comes up with an idea: consistent hashing.
Consider a system where the hash ring ranges from 0 to 1023, and four servers—server0, server1, server2, and server3—are already present. Their positions on the hash ring might look like this after applying the hash function:
server0 → Position 100
server1 → Position 300
server2 → Position 600
server3 → Position 900
Data is also hashed and placed on the ring. For example:
dataA → Position 150 → Stored on server1 (next clockwise server)
dataB → Position 700 → Stored on server3
dataC → Position 50 → Stored on server0.
We can represent that as an array.
server0 | server1 | server2 | server3 |
---|---|---|---|
100 | 300 | 600 | 900 |
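With the positions kept in a sorted array, the "next clockwise server" lookup becomes a binary search. The sketch below uses the positions from the example. The function name `lookup` is hypothetical, and treating a key that sits exactly on a server's position as belonging to that server is an assumption.

```python
from bisect import bisect_left

# Server positions from the example, kept sorted by position.
positions = [100, 300, 600, 900]
names = ["server0", "server1", "server2", "server3"]

def lookup(key_position: int) -> str:
    """Return the first server at or after key_position, wrapping past the end."""
    index = bisect_left(positions, key_position) % len(positions)
    return names[index]

print(lookup(150))  # server1 (dataA)
print(lookup(700))  # server3 (dataB)
print(lookup(50))   # server0 (dataC)
print(lookup(950))  # server0 (wraps around the ring)
```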
How to add a new server
To add a new server, we only need its position on the ring and the data that has to move to it. There is no need to move almost all the data.
Suppose a new server, server4, is added:
- Hash the new server, server4. For example, Hash(server4) = 200.
- Move the corresponding data (keys between positions 100 and 200) to the new server.
server0 | server4 | server1 | server2 | server3 |
---|---|---|---|---|
100 | 200 | 300 | 600 | 900 |
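The effect of adding server4 at position 200 can be checked with a small sketch. It reuses the sample data positions from earlier (dataA at 150, dataB at 700, dataC at 50). The `owner` helper is hypothetical.

```python
from bisect import bisect_left, insort

# (position, name) pairs, kept sorted by position.
ring = [(100, "server0"), (300, "server1"), (600, "server2"), (900, "server3")]

def owner(ring: list[tuple[int, str]], key_position: int) -> str:
    """Return the first server at or after key_position, wrapping around."""
    index = bisect_left(ring, (key_position, "")) % len(ring)
    return ring[index][1]

data = {"dataA": 150, "dataB": 700, "dataC": 50}
before = {key: owner(ring, pos) for key, pos in data.items()}

insort(ring, (200, "server4"))  # add the new server at its hashed position
after = {key: owner(ring, pos) for key, pos in data.items()}

moved = {key: (before[key], after[key]) for key in data if before[key] != after[key]}
print(moved)  # {'dataA': ('server1', 'server4')}
```

Only dataA, which sits between positions 100 and 200, changes owner. The other keys stay where they were.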
Can you see a problem with it?
At first glance, everything looks perfect and consistent hashing seems to have no problems. But there is one. We have handed control to the hash function, which is essentially a pseudo-random function. We have no control over where it puts a new server, so it can pick a position that brings no performance improvement at all.
Suppose server3 is overloaded and nearing a red zone where it risks going down. You try to add a new server, but the hash function places it at 1000. That has no impact on server3. The new server should sit between 600 and 900 so that it can take over part of server3's load.
Where there is a problem, there is also a solution.
The problem with the hash function is its randomness. That randomness is needed for data, but for servers we cannot rely on it, because the resulting position can be useless, as the example above shows.
Our server positions are already stored in an array. Can't we modify it directly and add or remove servers as needed? Yes, we can. Now control is back in your hands.
We can add or remove a server exactly where it is needed.
server0 | server4 | server1 | server2 | server5 | server3 |
---|---|---|---|---|---|
100 | 200 | 300 | 600 | 700 | 900 |
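The difference between a hashed position and a hand-picked one can be sketched like this. The key position 650 is a hypothetical "hot" key that currently lands on server3, and the `owner` helper is illustrative.

```python
from bisect import bisect_left

def owner(ring: dict[int, str], key_position: int) -> str:
    """Return the first server at or after key_position, wrapping around."""
    positions = sorted(ring)
    index = bisect_left(positions, key_position) % len(positions)
    return ring[positions[index]]

ring = {100: "server0", 200: "server4", 300: "server1", 600: "server2", 900: "server3"}
hot_key = 650  # hypothetical key served by the overloaded server3

hashed = {**ring, 1000: "server5"}  # position 1000, as the hash function chose
chosen = {**ring, 700: "server5"}   # position 700, picked by hand

print(owner(ring, hot_key))    # server3
print(owner(hashed, hot_key))  # server3 (no relief for server3)
print(owner(chosen, hot_key))  # server5 (load taken off server3)
```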
Scaling Without Tears
Now the whole scaling problem narrows down to a simple hashing and array problem.
Adding or removing servers becomes seamless with consistent hashing.
When a server is added, only the data in its immediate range is moved.
When a server is removed, its data moves to the next server clockwise on the ring.
This makes scaling fast, efficient, and far less error-prone.
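Removal can be sketched the same way. The example below drops server1 from the original four-server ring and reports which keys change owner. The key positions are made up for illustration, and `owner` is a hypothetical helper.

```python
from bisect import bisect_left

def owner(positions: list[int], names: list[str], key_position: int) -> str:
    """Return the first server at or after key_position, wrapping around."""
    index = bisect_left(positions, key_position) % len(positions)
    return names[index]

positions = [100, 300, 600, 900]
names = ["server0", "server1", "server2", "server3"]
keys = [50, 150, 250, 450, 700, 950]  # hypothetical key positions

before = {k: owner(positions, names, k) for k in keys}

# Remove server1 (position 300) from the ring.
i = names.index("server1")
new_positions = positions[:i] + positions[i + 1:]
new_names = names[:i] + names[i + 1:]

after = {k: owner(new_positions, new_names, k) for k in keys}
reassigned = {k: after[k] for k in keys if before[k] != after[k]}
print(reassigned)  # {150: 'server2', 250: 'server2'}
```

Only the keys that server1 used to own are reassigned, and they go to the next server on the ring.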
What’s Next? Comments, Feedback, and Reviews
I'd love to hear your thoughts on this topic. Have you faced similar scaling challenges in your apps? How did you overcome them, and how do you see consistent hashing fitting into your solutions? Feel free to share your experiences in the comments below or leave a review. Your feedback helps me keep improving the content and ensures that I am providing value to the community.
Conclusion: A Scalable Future
Consistent hashing isn’t just an optimization; it’s a mindset. It’s about building systems that adapt gracefully to growth and change.
Your viral app, once brought to its knees by traffic, is now ready for anything the internet throws at it. With consistent hashing, you’ve turned chaos into order and laid the foundation for a truly scalable platform.
And who knows? With these tools, your next big idea might just be the next viral sensation.