When your application starts getting traffic, one server is never enough.
At scale, distributing traffic correctly becomes one of the most important architectural decisions.
In this article, we’ll break down:
- What a load balancer is
- Why it’s needed
- Types of load balancers
- Algorithms used
- Health checks
- Real-world architecture patterns
The Scaling Problem
Imagine:
- 1 server
- 500,000 users
- Peak traffic
Even if the server is powerful, it will eventually hit limits:
- CPU saturation
- Memory exhaustion
- Network bottlenecks
The solution is horizontal scaling: running multiple servers instead of one ever-bigger machine.
But how do users know which server to hit?
That’s where load balancers come in.
What Is a Load Balancer?
A load balancer is a system that:
- Accepts incoming client requests
- Distributes them across multiple backend servers
- Ensures no single server is overloaded
Basic architecture:
Clients → Load Balancer → Server Pool
Why Use Load Balancers?
1️⃣ Scalability
Add more servers without changing client logic.
2️⃣ High Availability
If one server fails, traffic is redirected.
3️⃣ Fault Tolerance
Unhealthy instances are isolated, so one failing server doesn't drag the whole service down.
4️⃣ Zero-Downtime Deployments
You can remove a server from rotation during updates.
Load Balancing Algorithms
Round Robin
Requests are distributed sequentially, cycling through the server pool.
Simple and effective for uniform workloads.
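A minimal sketch of round robin in Python (the backend names are placeholders, not from any real setup):

```python
from itertools import cycle

servers = ["app-1", "app-2", "app-3"]  # hypothetical backend pool
rotation = cycle(servers)

def next_server():
    """Return the next backend in strict sequential order."""
    return next(rotation)

# Six requests cycle through the pool twice.
assigned = [next_server() for _ in range(6)]
# → ['app-1', 'app-2', 'app-3', 'app-1', 'app-2', 'app-3']
```

Every server receives the same number of requests, which is exactly why this breaks down when requests have very different costs.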
Least Connections
Traffic goes to the server with the fewest active connections.
Good for uneven workloads.
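The idea can be sketched in a few lines; in practice the balancer maintains these counters itself (the counts below are invented for illustration):

```python
# Active connection counts per backend (hypothetical values).
active = {"app-1": 12, "app-2": 3, "app-3": 7}

def pick_least_connections(connections):
    """Choose the backend currently holding the fewest active connections."""
    return min(connections, key=connections.get)

server = pick_least_connections(active)
# → 'app-2'
```

A slow request pins a connection open longer, so its server naturally receives less new traffic — the self-correcting behavior round robin lacks.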
IP Hash
Same client IP → same backend server.
Useful for session stickiness.
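A simple sketch of the mapping, using a stable hash (Python's built-in `hash()` is salted per process, so a digest is used instead; names are placeholders):

```python
import hashlib

servers = ["app-1", "app-2", "app-3"]

def pick_by_ip(client_ip, pool):
    """Map a client IP to a stable index in the pool."""
    digest = hashlib.md5(client_ip.encode()).hexdigest()
    return pool[int(digest, 16) % len(pool)]

# The same client IP always lands on the same backend.
first = pick_by_ip("203.0.113.9", servers)
again = pick_by_ip("203.0.113.9", servers)
assert first == again
```

Note the trade-off: if the pool size changes, most IPs remap to different servers, which is why production systems often use consistent hashing instead of a plain modulo.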
Layer 4 vs Layer 7 Load Balancing
Layer 4 (Transport Layer)
- Works at TCP/UDP level
- Faster
- Doesn’t inspect HTTP content
Layer 7 (Application Layer)
- Works at HTTP level
- Can route based on:
  - URL path
  - Headers
  - Cookies
- Enables smarter routing
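Layer 7 routing boils down to matching request attributes against a routing table. A toy sketch with path-prefix rules (the prefixes and pool names are assumptions for illustration):

```python
# Hypothetical path-prefix routing table for a Layer 7 balancer.
routes = {
    "/api/": ["api-1", "api-2"],
    "/static/": ["cdn-1"],
}
default_pool = ["web-1", "web-2"]

def route(path):
    """Pick the backend pool whose prefix matches the request path."""
    for prefix, pool in routes.items():
        if path.startswith(prefix):
            return pool
    return default_pool

route("/api/users")  # → ['api-1', 'api-2']
route("/about")      # → ['web-1', 'web-2']
```

A Layer 4 balancer cannot do this, because the URL path only exists after the HTTP request has been parsed.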
Health Checks
Load balancers constantly monitor backend servers.
If a server stops responding or starts returning errors, it is removed from rotation.
This prevents sending traffic to unhealthy instances.
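A minimal sketch of an HTTP health check, assuming each backend exposes a health endpoint (the `/healthz` path and hostnames are conventions, not from the article):

```python
import urllib.request

def is_healthy(url, timeout=2.0):
    """Probe a backend's health endpoint; any error or non-2xx marks it unhealthy."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except OSError:  # covers connection refused, DNS failure, timeout, HTTP errors
        return False

# Only servers that pass the probe stay in rotation.
pool = ["http://app-1:8080", "http://app-2:8080"]
healthy = [s for s in pool if is_healthy(s + "/healthz")]
```

Real balancers run this probe on an interval and typically require several consecutive failures before evicting a server, to avoid flapping on a single slow response.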
Real-World Setup
In production systems, architecture often looks like this:
Users
↓
Load Balancer
↓
Web Servers (Auto-scaled)
↓
Database / Cache Layer
Common managed and self-hosted options:
- AWS ELB / ALB
- Google Cloud Load Balancer
- NGINX
- HAProxy
Load Balancer vs Reverse Proxy
A reverse proxy:
- Sits in front of servers
- Forwards requests
A load balancer:
- Specifically distributes load
Many tools (like NGINX) do both.
Key Takeaways
- Load balancers distribute traffic across servers
- Enable horizontal scaling
- Improve availability
- Support multiple routing strategies
- Critical in system design interviews