When your application starts getting traffic, one server is never enough.
At scale, distributing traffic correctly becomes one of the most important architectural decisions.
In this article, we’ll break down:
- What a load balancer is
- Why it’s needed
- Types of load balancers
- Algorithms used
- Health checks
- Real-world architecture patterns
The Scaling Problem
Imagine:
- 1 server
- 500,000 users
- Peak traffic
Even if the server is powerful, it will eventually hit limits:
- CPU saturation
- Memory exhaustion
- Network bottlenecks
The solution is horizontal scaling: running multiple servers instead of one ever-bigger machine.
But how do users know which server to hit?
That’s where load balancers come in.
What Is a Load Balancer?
A load balancer is a system that:
- Accepts incoming client requests
- Distributes them across multiple backend servers
- Ensures no single server is overloaded
Basic architecture:
Clients → Load Balancer → Server Pool
Why Use Load Balancers?
1️⃣ Scalability
Add more servers without changing client logic.
2️⃣ High Availability
If one server fails, traffic is redirected.
3️⃣ Fault Tolerance
Unhealthy instances are isolated, so one failing server doesn't drag the whole service down.
4️⃣ Zero-Downtime Deployments
You can remove a server from rotation during updates.
Load Balancing Algorithms
Round Robin
Requests are distributed sequentially, cycling through the server pool.
Simple and effective for uniform workloads.
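A minimal sketch of round robin in Python (the backend names are placeholders, not from any real setup):

```python
from itertools import cycle

servers = ["app-1", "app-2", "app-3"]  # hypothetical backend pool
rotation = cycle(servers)

def next_server():
    """Return the next backend in strict sequential order."""
    return next(rotation)

# Six requests cycle through the pool twice.
assigned = [next_server() for _ in range(6)]
# → ['app-1', 'app-2', 'app-3', 'app-1', 'app-2', 'app-3']
```

Every server receives the same number of requests, which is exactly why this breaks down when requests have very different costs.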
Least Connections
Traffic goes to the server with the fewest active connections.
Good for uneven workloads.
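The idea can be sketched in a few lines; in practice the balancer maintains these counters itself (the counts below are invented for illustration):

```python
# Active connection counts per backend (hypothetical values).
active = {"app-1": 12, "app-2": 3, "app-3": 7}

def pick_least_connections(connections):
    """Choose the backend currently holding the fewest active connections."""
    return min(connections, key=connections.get)

server = pick_least_connections(active)
# → 'app-2'
```

A slow request pins a connection open longer, so its server naturally receives less new traffic — the self-correcting behavior round robin lacks.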
IP Hash
Same client IP → same backend server.
Useful for session stickiness.
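A simple sketch of the mapping, using a stable hash (Python's built-in `hash()` is salted per process, so a digest is used instead; names are placeholders):

```python
import hashlib

servers = ["app-1", "app-2", "app-3"]

def pick_by_ip(client_ip, pool):
    """Map a client IP to a stable index in the pool."""
    digest = hashlib.md5(client_ip.encode()).hexdigest()
    return pool[int(digest, 16) % len(pool)]

# The same client IP always lands on the same backend.
first = pick_by_ip("203.0.113.9", servers)
again = pick_by_ip("203.0.113.9", servers)
assert first == again
```

Note the trade-off: if the pool size changes, most IPs remap to different servers, which is why production systems often use consistent hashing instead of a plain modulo.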
Layer 4 vs Layer 7 Load Balancing
Layer 4 (Transport Layer)
- Works at TCP/UDP level
- Faster
- Doesn’t inspect HTTP content
Layer 7 (Application Layer)
- Works at HTTP level
- Can route based on:
  - URL path
  - Headers
  - Cookies
- Enables smarter routing
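Layer 7 routing boils down to matching request attributes against a routing table. A toy sketch with path-prefix rules (the prefixes and pool names are assumptions for illustration):

```python
# Hypothetical path-prefix routing table for a Layer 7 balancer.
routes = {
    "/api/": ["api-1", "api-2"],
    "/static/": ["cdn-1"],
}
default_pool = ["web-1", "web-2"]

def route(path):
    """Pick the backend pool whose prefix matches the request path."""
    for prefix, pool in routes.items():
        if path.startswith(prefix):
            return pool
    return default_pool

route("/api/users")  # → ['api-1', 'api-2']
route("/about")      # → ['web-1', 'web-2']
```

A Layer 4 balancer cannot do this, because the URL path only exists after the HTTP request has been parsed.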
Health Checks
Load balancers constantly monitor backend servers.
If a server stops responding or starts returning errors, it is removed from rotation.
This prevents sending traffic to unhealthy instances.
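A minimal sketch of an HTTP health check, assuming each backend exposes a health endpoint (the `/healthz` path and hostnames are conventions, not from the article):

```python
import urllib.request

def is_healthy(url, timeout=2.0):
    """Probe a backend's health endpoint; any error or non-2xx marks it unhealthy."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except OSError:  # covers connection refused, DNS failure, timeout, HTTP errors
        return False

# Only servers that pass the probe stay in rotation.
pool = ["http://app-1:8080", "http://app-2:8080"]
healthy = [s for s in pool if is_healthy(s + "/healthz")]
```

Real balancers run this probe on an interval and typically require several consecutive failures before evicting a server, to avoid flapping on a single slow response.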
Real-World Setup
In production systems, architecture often looks like this:
Users
↓
Load Balancer
↓
Web Servers (Auto-scaled)
↓
Database / Cache Layer
Common managed and self-hosted options:
- AWS ELB / ALB
- Google Cloud Load Balancer
- NGINX
- HAProxy
Load Balancer vs Reverse Proxy
A reverse proxy:
- Sits in front of servers
- Forwards requests
A load balancer:
- Specifically distributes load
Many tools (like NGINX) do both.
Key Takeaways
- Load balancers distribute traffic across servers
- Enable horizontal scaling
- Improve availability
- Support multiple routing strategies
- Critical in system design interviews