Niraj Mourya

Load Balancers Explained

When your application starts getting real traffic, a single server eventually stops being enough.

At scale, distributing traffic correctly becomes one of the most important architectural decisions.

In this article, we’ll break down:

  • What a load balancer is
  • Why it’s needed
  • Types of load balancers
  • Algorithms used
  • Health checks
  • Real-world architecture patterns

The Scaling Problem

Imagine:

  • 1 server
  • 500,000 users
  • Peak traffic

Even if the server is powerful, it will eventually hit limits:

  • CPU saturation
  • Memory exhaustion
  • Network bottlenecks

The solution is horizontal scaling: adding more servers instead of endlessly upgrading one.

But how do users know which server to hit?

That’s where load balancers come in.

What Is a Load Balancer?

A load balancer is a system that:

  • Accepts incoming client requests
  • Distributes them across multiple backend servers
  • Ensures no single server is overloaded

Basic architecture:

Clients → Load Balancer → Server Pool
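To make this concrete, here's a minimal sketch in Python of a toy load balancer that accepts requests and forwards each one to a backend from a small pool. The backend addresses are placeholders, and real systems use NGINX, HAProxy, or a managed cloud load balancer rather than hand-rolled code:

```python
# Toy HTTP load balancer: accepts client requests and forwards each one
# to a backend chosen from a small pool. Illustration only; the backend
# addresses are placeholders, and only GET requests are handled.
import itertools
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

BACKENDS = ["http://127.0.0.1:9001", "http://127.0.0.1:9002"]  # assumed backends
pool = itertools.cycle(BACKENDS)  # simple rotation over the server pool

class LoadBalancerHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        backend = next(pool)  # pick the next backend in the pool
        try:
            # Forward the request to the chosen backend and relay the response.
            with urllib.request.urlopen(backend + self.path) as resp:
                body = resp.read()
                self.send_response(resp.status)
                self.send_header("Content-Length", str(len(body)))
                self.end_headers()
                self.wfile.write(body)
        except OSError:
            # Connection failures (and, in this toy, backend HTTP errors)
            # are reported to the client as a 502.
            self.send_error(502, "Bad gateway: backend unavailable")

if __name__ == "__main__":
    # Clients talk to port 8080; the balancer spreads traffic across BACKENDS.
    ThreadingHTTPServer(("0.0.0.0", 8080), LoadBalancerHandler).serve_forever()
```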

Why Use Load Balancers?

1️⃣ Scalability
Add more servers without changing client logic.

2️⃣ High Availability
If one server fails, traffic is redirected to the remaining healthy servers.

3️⃣ Fault Tolerance
A misbehaving instance can be isolated before it drags the whole application down.

4️⃣ Zero-Downtime Deployments
You can remove a server from rotation during updates.

Load Balancing Algorithms

Round Robin
Requests are distributed sequentially across the server pool.
Simple and effective for uniform workloads.
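A rough sketch of the idea in Python (server names are just placeholders):

```python
# Round robin: hand out servers in a fixed rotation.
import itertools

servers = ["app-1", "app-2", "app-3"]  # placeholder server names
rotation = itertools.cycle(servers)

def pick_server():
    return next(rotation)

# Requests 1..6 land on app-1, app-2, app-3, app-1, app-2, app-3.
print([pick_server() for _ in range(6)])
```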

Least Connections
Traffic goes to the server with the fewest active connections.
Good for uneven workloads.
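A sketch, assuming we track active connection counts per server:

```python
# Least connections: send the next request to the server that is
# currently handling the fewest active connections.
active_connections = {"app-1": 12, "app-2": 4, "app-3": 9}  # example counts

def pick_server(conns):
    return min(conns, key=conns.get)

print(pick_server(active_connections))  # -> "app-2"
```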

IP Hash
Same client IP → same backend server.
Useful for session stickiness.
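A sketch using a hash of the client IP (the IP and server names are illustrative):

```python
# IP hash: hash the client IP so the same client always maps to the
# same backend, which keeps in-memory sessions "sticky".
import hashlib

servers = ["app-1", "app-2", "app-3"]  # placeholder server names

def pick_server(client_ip: str) -> str:
    digest = hashlib.sha256(client_ip.encode()).hexdigest()
    return servers[int(digest, 16) % len(servers)]

print(pick_server("203.0.113.7"))  # same IP -> same server every time
```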

Layer 4 vs Layer 7 Load Balancing

Layer 4 (Transport Layer)

  • Works at TCP/UDP level
  • Faster
  • Doesn’t inspect HTTP content

Layer 7 (Application Layer)

  • Works at HTTP level
  • Can route based on:
    • URL path
    • Headers
    • Cookies
  • Enables smarter routing (see the sketch below)
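As a rough illustration (the paths, header names, and pool names are made up), a Layer 7 balancer can inspect the request before choosing a backend pool, which a Layer 4 balancer cannot do because it never parses HTTP:

```python
# Layer 7 routing sketch: choose a backend pool by inspecting the
# HTTP request. Paths, cookies, and pool names are illustrative.
def route(path: str, headers: dict) -> str:
    if path.startswith("/api/"):
        return "api-pool"        # API traffic goes to API servers
    if path.startswith("/static/"):
        return "static-pool"     # static assets go to a dedicated pool
    if "beta=1" in headers.get("Cookie", ""):
        return "beta-pool"       # cookie-based routing, e.g. beta users
    return "web-pool"            # everything else

print(route("/api/users", {}))               # -> "api-pool"
print(route("/home", {"Cookie": "beta=1"}))  # -> "beta-pool"
```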

Health Checks

Load balancers constantly monitor backend servers.

If a server:

  • Stops responding
  • Returns errors

It is removed from rotation until it recovers and passes health checks again.

This prevents sending traffic to unhealthy instances.
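A simplified health-check loop might look like this (the /health endpoint, timeout, and interval are assumptions; real load balancers also make failure thresholds and recovery behavior configurable):

```python
# Periodic health checks: probe each backend's /health endpoint (an
# assumed convention) and keep only responsive servers in rotation.
import time
import urllib.request

BACKENDS = ["http://127.0.0.1:9001", "http://127.0.0.1:9002"]  # placeholders

def healthy(backend: str) -> bool:
    try:
        with urllib.request.urlopen(backend + "/health", timeout=2) as resp:
            return resp.status == 200
    except OSError:
        return False  # timeout, connection refused, or HTTP error

def check_loop(interval_seconds: int = 10):
    while True:
        in_rotation = [b for b in BACKENDS if healthy(b)]
        print("serving traffic to:", in_rotation)
        time.sleep(interval_seconds)
```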

Real-World Setup

In production systems, architecture often looks like this:

Users
   ↓
Load Balancer
   ↓
Web Servers (Auto-scaled)
   ↓
Database / Cache Layer


Common choices include managed cloud services and self-hosted software:

  • AWS ELB / ALB
  • Google Cloud Load Balancer
  • NGINX
  • HAProxy

Load Balancer vs Reverse Proxy

A reverse proxy:

  • Sits in front of backend servers
  • Accepts client requests and forwards them on the clients' behalf

A load balancer:

  • Is a reverse proxy whose primary job is distributing that traffic across multiple servers

Many tools (like NGINX and HAProxy) do both.

Key Takeaways

  • Load balancers distribute traffic across servers
  • Enable horizontal scaling
  • Improve availability
  • Support multiple routing strategies
  • Critical in system design interviews
