Every Request in Your App Passes Through This (and You Ignore It)
You add more servers
and expect your app to scale
It doesn’t
Because scaling isn’t about servers
it’s about how traffic is distributed
What actually happens
User → request → ???
Without a load balancer:
- Some servers get overloaded
- Others stay idle
- Latency spikes
- Overloaded servers time out or crash
Load balancer = traffic controller
It sits in front of your backend
and decides:
"Which server should handle this request?"
Common strategies
Round Robin
- Sends each request to the next server in turn
- Works if all servers and all requests are roughly equal
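Round robin is small enough to sketch in a few lines. This is a minimal illustration, not how any particular load balancer implements it; the server addresses and the `pick_server` name are made up for the example.

```python
from itertools import cycle

# Hypothetical backend addresses, for illustration only.
servers = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]

# cycle() loops over the list forever: 1, 2, 3, 1, 2, 3, ...
rotation = cycle(servers)

def pick_server() -> str:
    """Return the next server in the rotation, ignoring its current load."""
    return next(rotation)

picks = [pick_server() for _ in range(6)]
print(picks)
```

Each server gets the same number of requests, but nothing here looks at how busy a server actually is.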
Least Connections
- Sends each request to the server with the fewest active connections
- Better when request durations vary, which is most real-world traffic
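A rough sketch of least connections, assuming the balancer keeps a counter per server. The server names and the `acquire` / `release` helpers are hypothetical, chosen only for this example.

```python
# Hypothetical servers mapped to their current number of active connections.
active = {"app-1": 0, "app-2": 0, "app-3": 0}

def acquire() -> str:
    """Pick the server with the fewest active connections and count the new one."""
    server = min(active, key=active.get)  # ties go to the first server listed
    active[server] += 1
    return server

def release(server: str) -> None:
    """Call when a request finishes so the server's count drops again."""
    active[server] -= 1

first = acquire()    # all idle, so the first server is picked
second = acquire()   # app-1 is busy now, so the next idle server is picked
release(first)       # the first request finishes quickly
third = acquire()    # app-1 is idle again and gets the new request
print(first, second, third)
```

The key difference from round robin is `release`: a server that finishes fast becomes eligible again right away.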
IP Hash
- Same client IP → same server (as long as the server pool doesn't change)
- Useful for sticky sessions
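IP hash can be sketched as "hash the client IP, take it modulo the pool size". This is one simple way to do it, under the assumption of a fixed pool; the server names and the example IP are placeholders.

```python
import hashlib

# Hypothetical pool, for illustration only.
servers = ["app-1", "app-2", "app-3"]

def pick_server(client_ip: str) -> str:
    """Map a client IP to a server using a stable hash."""
    # Python's built-in hash() is salted per process for strings,
    # so use hashlib to get the same result across restarts.
    digest = hashlib.sha256(client_ip.encode()).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]

# The same IP always lands on the same server while the pool is unchanged.
print(pick_server("203.0.113.7") == pick_server("203.0.113.7"))
```

With plain modulo hashing, adding or removing a server remaps most clients. Consistent hashing is the usual way to soften that.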
The real issue
Not all requests are equal
One request takes 10 ms
Another takes 2 seconds
If distribution ignores that
one server drowns in slow requests while others sit idle
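To make that concrete, here is a toy simulation comparing round robin with least connections on a mix of 2-second and 10 ms requests. The model is a deliberate simplification and entirely an assumption: requests arrive every 10 ms, and each server handles one request at a time in arrival order. All function names are invented for the sketch.

```python
def simulate(durations_ms, n_servers, choose, gap_ms=10):
    """Return the worst latency (ms) when requests arrive every gap_ms.

    Toy model: each server handles one request at a time, in arrival order.
    """
    free_at = [0] * n_servers                  # when each server's queue drains
    finishes = [[] for _ in range(n_servers)]  # finish times per server
    worst = 0
    for i, duration in enumerate(durations_ms):
        now = i * gap_ms
        # Requests assigned to each server that have not finished yet.
        open_conns = [sum(t > now for t in f) for f in finishes]
        s = choose(i, open_conns)
        free_at[s] = max(now, free_at[s]) + duration
        finishes[s].append(free_at[s])
        worst = max(worst, free_at[s] - now)
    return worst

def round_robin(i, open_conns):
    # Ignores load entirely: request i goes to server i mod N.
    return i % len(open_conns)

def least_connections(i, open_conns):
    # Picks the server with the fewest unfinished requests.
    return open_conns.index(min(open_conns))

# One slow request followed by two fast ones, repeated.
durations = [2000, 10, 10] * 4

rr_worst = simulate(durations, 3, round_robin)
lc_worst = simulate(durations, 3, least_connections)
print(rr_worst, lc_worst)
```

In this setup round robin keeps handing every slow request to the same server, so its worst-case latency comes out higher than with least connections, even though both use the same three servers.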
Types of load balancers
Layer 4
- Fast, because it never inspects the request content
- Routes based on IP address and TCP/UDP port
Layer 7
- Smarter, but does more work per request
- Routes based on URL and headers
Example:
- /api → backend
- /images → CDN
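The routing example above can be sketched as a prefix table. This is only an illustration of the idea; the `ROUTES` table, the pool names, and the fallback pool are assumptions, and real Layer 7 balancers express this in their own config syntax.

```python
# Hypothetical route table: path prefix -> upstream pool name.
ROUTES = [
    ("/api", "backend"),
    ("/images", "cdn"),
]

# Assumption: paths that match nothing fall back to the backend pool.
DEFAULT_POOL = "backend"

def route(path: str) -> str:
    """Return the pool for a request path, matching whole path segments."""
    for prefix, pool in ROUTES:
        # Match "/images" and "/images/...", but not "/imagesx".
        if path == prefix or path.startswith(prefix + "/"):
            return pool
    return DEFAULT_POOL

print(route("/api/users"))        # backend
print(route("/images/logo.png"))  # cdn
```

A Layer 4 balancer could not do this, since it never sees the URL.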
Things most devs miss
- No health checks → dead server still gets traffic
- No failover → the load balancer itself becomes a single point of failure
- Assuming more servers = scale
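The health check point is easy to sketch: track consecutive failed checks and skip servers that cross a threshold. Everything here is hypothetical (the `Backend` class, the threshold of 3, the helper names), and the probe itself, typically an HTTP or TCP check, is left out.

```python
from dataclasses import dataclass

# Assumption: a server is marked unhealthy after 3 consecutive failed checks.
MAX_FAILURES = 3

@dataclass
class Backend:
    name: str
    failures: int = 0  # consecutive failed health checks

    @property
    def healthy(self) -> bool:
        return self.failures < MAX_FAILURES

def record_check(backend: Backend, ok: bool) -> None:
    """Store the result of one health probe (the probe itself is not shown)."""
    backend.failures = 0 if ok else backend.failures + 1

def eligible(backends: list[Backend]) -> list[Backend]:
    """Only healthy backends should be candidates for new requests."""
    return [b for b in backends if b.healthy]

pool = [Backend("app-1"), Backend("app-2")]
for _ in range(MAX_FAILURES):
    record_check(pool[0], ok=False)  # app-1 stops responding

print([b.name for b in eligible(pool)])  # ['app-2']
```

Whatever strategy picks the server, it should pick from `eligible(pool)`, not from the full list.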
Reality
Scaling isn’t adding machines
It’s controlling traffic intelligently
That’s where systems either survive
or collapse