Oops! Notice some servers die unexpectedly?
Imagine a celebrity posts your website link. 50,000 people hit your server at the same time, all making requests. Then, in less than 10 minutes, everything goes blank. Server timed out. That can be frustrating. Let's dive into a better solution so this doesn't have to keep happening.
The Door Analogy That Explains Everything
Imagine ten thousand people trying to walk through a single door at the same time. That's what happens to a website when concert tickets go on sale, when iPhone pre-orders open, or when your favorite artist drops surprise merch. The door is your server, and it has a maximum capacity before it breaks.
Now imagine that same crowd, but instead of one door, there are fifty doors, all leading into the same venue. A smart security guard directs people evenly across all fifty doors. Nobody waits long, no door gets overwhelmed, everyone gets in smoothly. That's load balancing.
The problem: servers have limits
Every server can handle a limited number of requests per second. Maybe yours can handle one thousand requests per second before it starts slowing down. At two thousand, it starts timing out. At three thousand, it crashes completely.
For a small blog with a hundred visitors per day, one server is plenty. But what happens when you go viral? What happens when a celebrity tweets your link? What happens when you're selling limited edition sneakers and fifty thousand people hit your site simultaneously at release time?
One server will collapse. It's not a question of if, it's when.
You can buy a bigger server (called vertical scaling), but even the biggest servers have limits, and they're expensive. A better solution is to use multiple smaller servers working together (called horizontal scaling with load balancing).
How load balancing actually works
Instead of having one server handling all traffic, you have multiple servers (five, ten, fifty, whatever you need), all capable of doing the same work. In front of these servers sits a load balancer, which is like a traffic controller. Every incoming request hits the load balancer first, and the load balancer's job is to decide which server should handle that specific request.
Let's say you have five servers behind a load balancer. When a request comes in, the load balancer might:
- Simply pick the server currently handling the fewest requests (called the least connections method)
- Rotate through servers in order (called round-robin)
- Check which server is answering fastest and send the request there (called least response time)
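The first two strategies above can be sketched in a few lines. This is a toy illustration, not a real balancer: server names are made up, and the in-flight request counts are set by hand rather than tracked automatically.

```python
import itertools

class ToyBalancer:
    """Toy balancer showing two common selection strategies."""

    def __init__(self, servers):
        self.servers = list(servers)
        self._ring = itertools.cycle(self.servers)
        # In-flight request count per server (a real balancer
        # would update this itself as requests start and finish).
        self.active = {s: 0 for s in self.servers}

    def round_robin(self):
        # Rotate through servers in a fixed order.
        return next(self._ring)

    def least_connections(self):
        # Pick the server currently handling the fewest requests.
        return min(self.servers, key=self.active.get)

lb = ToyBalancer(["server-1", "server-2", "server-3"])
print([lb.round_robin() for _ in range(4)])  # wraps back to server-1 at the end
lb.active.update({"server-1": 5, "server-2": 1, "server-3": 2})
print(lb.least_connections())  # server-2
```

Round-robin is simplest but ignores how busy each server is; least connections adapts when some requests take much longer than others.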
The beautiful part? From the user's perspective, they have no idea this is happening. They visit yoursite.com and get a response. They don't know whether Server 1, Server 3, or Server 5 actually handled their request. It just works.
Real-world example: How Shoprite checkout works
You've been to Shoprite during the weekend rush. There are fifteen checkout counters. Imagine if they only opened one counter and everyone had to queue there. The line would stretch to the back of the store, people would abandon their shopping, and the whole system would collapse.
Instead, they open multiple counters and put someone near the entrance directing people: "Counter five is free, counter eight has a short line." That person is a human load balancer. They're distributing the workload (customers) across available resources (cashiers) to prevent any single resource from becoming overwhelmed.
When one cashier is faster than others or when one takes a break, the person directing customers adjusts their strategy. They send more people to the faster cashiers and fewer to the slower ones. This is exactly what digital load balancers do with server traffic.
What happens when things go wrong
Remember when everyone tried to register for COVID vaccines on the NCDC portal, or when JAMB registration opened? The sites crashed immediately because they couldn't handle the load. That's a load balancing failure.
Compare that to Amazon during Black Friday sales. Millions of people hit their site simultaneously, and it just works. Amazon has thousands of servers behind sophisticated load balancers. When traffic spikes, their system automatically spins up more servers and the load balancer starts directing traffic to them. When traffic drops, they shut down the extra servers to save costs.
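The "spin up more servers when traffic spikes" decision can be reduced to simple arithmetic. Here is a minimal sketch, assuming each server comfortably handles about 1,000 requests per second (the figure used earlier); real autoscalers also watch CPU, memory, and apply cooldowns before scaling back down.

```python
import math

def servers_needed(current_rps, capacity_per_server=1000, minimum=1):
    """How many servers current traffic requires, given a per-server
    capacity. Always keeps at least `minimum` running."""
    return max(minimum, math.ceil(current_rps / capacity_per_server))

print(servers_needed(50_000))  # sneaker-drop crowd: 50 servers
print(servers_needed(120))     # quiet day: back down to 1
```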
The difference between sites that crash and sites that scale smoothly often comes down to whether they've implemented proper load balancing.
Different types of load balancing
Hardware load balancers are physical devices that sit in data centers and route traffic. They're expensive but extremely fast. Big companies with their own data centers use these.
Software load balancers run as applications on regular servers. Tools like Nginx and HAProxy, and cloud services like AWS Elastic Load Balancer, fall into this category. They're cheaper and more flexible than hardware options.
DNS load balancing uses the Domain Name System itself to distribute traffic. When someone looks up yoursite.com, the DNS can return different IP addresses for different users, sending some to Server A and others to Server B.
Global load balancing distributes traffic across different geographic regions. If you have servers in America, Europe, and Asia, a global load balancer can send American users to American servers and Nigerian users to the closest available server, reducing latency.
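Geo-routing boils down to a lookup: which deployed region is closest to this user? A minimal sketch with a hand-written, hypothetical proximity table; real systems use GeoIP databases and measured latency instead.

```python
# Hypothetical nearest-region table for an example deployment with
# regions in America, Europe, and Asia (illustrative data only).
NEAREST_REGION = {
    "nigeria": "europe",   # closest deployed region in this example
    "germany": "europe",
    "usa": "america",
    "japan": "asia",
}

def geo_route(user_country, default="america"):
    """Send each user to the nearest region; fall back to a default
    when we can't place them."""
    return NEAREST_REGION.get(user_country, default)

print(geo_route("nigeria"))  # europe
print(geo_route("unknown"))  # america
```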
The health check mechanism: the secret to reliability
Here's a critical feature: load balancers constantly check if your servers are actually healthy. Every few seconds, the load balancer sends a small test request to each server (called a health check). If a server doesn't respond or responds with an error, the load balancer marks it as unhealthy and stops sending traffic there.
This is why load-balanced systems are more reliable than single-server setups. If one server crashes, the load balancer notices within seconds and routes all traffic to the remaining healthy servers. Users might not even notice the failure because their requests just get handled by different servers. Meanwhile, engineers can fix or restart the failed server without taking the whole site offline.
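The health-check loop above can be sketched like this. The `probe` callable stands in for the small HTTP request a real balancer would send to each server's health endpoint; here we fake one server failing.

```python
def filter_healthy(servers, probe):
    """Keep only the servers whose probe succeeds; timeouts and
    connection errors mark a server unhealthy."""
    healthy = []
    for server in servers:
        try:
            if probe(server):
                healthy.append(server)
        except OSError:
            # A dead socket raises; skip this server until it recovers.
            pass
    return healthy

# Simulate server-2 crashing: its probe raises like a timed-out socket would.
def fake_probe(server):
    if server == "server-2":
        raise OSError("connection timed out")
    return True

print(filter_healthy(["server-1", "server-2", "server-3"], fake_probe))
# ['server-1', 'server-3']
```

Run on a schedule (every few seconds), this is what lets traffic silently flow around a crashed server.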
Session persistence: the tricky part
Load balancing gets complicated when your application needs to remember things about users. If you log into a shopping site and add items to your cart, that information is stored on whichever server handled your requests. But what if your next request goes to a different server? That server doesn't know about your cart.
This is called the session persistence problem. Solutions include:
Sticky sessions: The load balancer remembers which server you used and always sends your requests there. This works but reduces flexibility.
Shared session storage: All servers store session data in a central database or cache (like Redis) that they all access. This way, any server can handle any user's request because the session data isn't tied to a specific server.
Stateless design: The best solution is making your servers stateless, meaning they don't store user information locally. Instead, authentication tokens or session data get sent with every request. This way, truly any server can handle any request.
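One common way to implement sticky sessions is hashing: the same client identity always maps to the same server, so a cart stored on that server keeps working. A minimal sketch; real balancers often key on a cookie rather than the client IP.

```python
import hashlib

def sticky_pick(client_id, servers):
    """Deterministically map a client to one server by hashing its
    identity (an IP address in this sketch)."""
    digest = hashlib.sha256(client_id.encode()).hexdigest()
    return servers[int(digest, 16) % len(servers)]

servers = ["server-1", "server-2", "server-3"]
first = sticky_pick("203.0.113.7", servers)
# Every later request from this client routes to the same server.
assert all(sticky_pick("203.0.113.7", servers) == first for _ in range(10))
```

The downside the text mentions shows up here too: if that one server dies, the hash still points at it, which is why shared storage or stateless design scales better.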
Load balancing at different layers
You can load balance at different levels of your architecture:
Application load balancing: Distributing web requests across multiple web servers. This is what most people mean when they say load balancing.
Database load balancing: Distributing read queries across multiple database replicas. Your main database handles writes, but multiple read replicas handle read requests, preventing the main database from being overwhelmed.
Microservices load balancing: If your app is split into multiple services (user service, payment service, notification service), each service might have multiple instances behind its own load balancer.
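The database case above (writes to the primary, reads spread across replicas) can be sketched with a toy query router. Names like `primary-db` and `replica-1` are placeholders for real database connections, and classifying queries by their first keyword is a deliberate simplification.

```python
import itertools

class QueryRouter:
    """Toy read/write splitter: writes go to the primary, reads are
    spread round-robin across read replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self._replicas = itertools.cycle(replicas)

    def route(self, sql):
        # Crude classification: only SELECT statements count as reads.
        if sql.lstrip().upper().startswith("SELECT"):
            return next(self._replicas)
        return self.primary

router = QueryRouter("primary-db", ["replica-1", "replica-2"])
print(router.route("SELECT * FROM orders"))        # replica-1
print(router.route("UPDATE orders SET paid = 1"))  # primary-db
```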
Connecting everything we've learned
Day 1 taught us about geographic distribution with Netflix's Open Connect. Day 2 showed us efficiency through proven technology with WhatsApp's Erlang. Day 3 explained caching. Load balancing connects all of these concepts.
Load balancers can use geographic information to route users to the nearest data center (like Netflix does). They can distribute work efficiently across servers (like WhatsApp's philosophy of doing more with less). They work hand-in-hand with caching because cached content can be served from any server without hitting the database.
System design isn't about using one technique. It's about combining multiple strategies to build something reliable and fast.
What this means for your projects
If you're building something small, you probably don't need load balancing yet. But understanding it prepares you for scale. Many cloud providers make it easy to add load balancing when you need it. You can start with one server and add a load balancer plus additional servers when traffic grows.
The mental model matters more than the implementation details. When you're designing any system, ask yourself: What happens if traffic doubles tomorrow? What if it increases ten times? Load balancing is the answer to those questions.
Tomorrow (Day 5): How does Google show you search results in 0.3 seconds when they're searching billions of web pages? The secret isn't speed, it's pre-computation. We're breaking down why fast apps don't work harder; they work smarter.
Join the class and see how load balancing could make your app faster and more reliable: https://ssic.ng
Drop a 🔥 if you finally understand why Amazon never crashes, but your favorite local site always does during sales.