Load balancing is one of those “system design” terms that sounds complex… but the idea is actually very simple.
If a lot of people use your app at the same time, a single server can get overloaded and crash. Load balancing solves this by distributing traffic across multiple servers so the system stays fast and reliable.
This post explains load balancing in a beginner-friendly way — no heavy jargon, no complicated diagrams.
What is Load Balancing?
Load balancing means:
Distributing incoming requests across multiple servers, so no single server becomes a bottleneck.
Instead of sending all users to a single server:
Users → Server 1 (overloaded ❌)
Load balancing routes requests to multiple servers:
Users → Load Balancer → Server 1 ✅
→ Server 2 ✅
→ Server 3 ✅
The load balancer is like a traffic manager that sits in front of your servers.
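To make this concrete, here is a tiny toy “traffic manager” in Python. It listens on one port and forwards each request to one of a few backend servers (the addresses below are made up, and a real load balancer like Nginx or HAProxy does much more, but the core idea is the same):

```python
# A minimal sketch of a load balancer: forward each request to one backend.
# The backend addresses are hypothetical - run your app servers there first.
import random
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

BACKENDS = ["http://127.0.0.1:8001", "http://127.0.0.1:8002", "http://127.0.0.1:8003"]

class TrafficManager(BaseHTTPRequestHandler):
    def do_GET(self):
        backend = random.choice(BACKENDS)            # pick a server (naive strategy)
        with urllib.request.urlopen(backend + self.path) as resp:
            body = resp.read()
        self.send_response(resp.status)              # relay the backend's response
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

HTTPServer(("0.0.0.0", 8000), TrafficManager).serve_forever()
```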
Why Do We Need Load Balancing?
Here are the big reasons:
1) Prevent server overload
If too many requests hit the same server, it can:
- slow down
- return 500 errors
- crash completely
Load balancing spreads traffic out.
2) Improve performance
When requests are distributed evenly:
- servers respond faster
- users get better experience
3) High availability (fault tolerance)
If one server goes down:
✅ load balancer detects it
✅ stops sending traffic to it
✅ routes users to healthy servers
This is one of the most important benefits.
Types of Load Balancers
There are two common categories:
✅ Hardware load balancer
- Physical device used in data centers
- Powerful but expensive
✅ Software load balancer (most common today)
- A service or program that runs on machines/cloud
Examples:
- Nginx
- HAProxy
- AWS Elastic Load Balancer (ELB)
- Google Cloud Load Balancer
- Azure Load Balancer
Most developers use cloud load balancers.
How Does a Load Balancer Choose a Server?
A load balancer uses routing strategies (algorithms) to decide where each request goes.
Here are the most common ones:
1) Round Robin (simplest)
Requests are sent in rotation:
Req1 → Server 1
Req2 → Server 2
Req3 → Server 3
Req4 → Server 1
✅ Great when servers are identical
❌ Not ideal if one server is slower than others
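Here’s a minimal sketch of round robin in Python (the server names are placeholders):

```python
# A minimal round-robin picker: rotate through the server list in order.
from itertools import cycle

servers = cycle(["Server 1", "Server 2", "Server 3"])

for request_id in range(1, 5):
    print(f"Req{request_id} -> {next(servers)}")
# Req1 -> Server 1, Req2 -> Server 2, Req3 -> Server 3, Req4 -> Server 1
```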
2) Least Connections (smarter)
Sends requests to the server with the least active connections.
✅ Great when:
- requests take different durations
- traffic is uneven
Example:
- Server 1 has 120 active users
- Server 2 has 50 active users
➡️ New request goes to Server 2
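A minimal sketch of that decision (the connection counts are the made-up numbers from the example):

```python
# Least connections: send the new request to the server with the fewest
# active connections right now.
active_connections = {"Server 1": 120, "Server 2": 50}

def pick_server(counts):
    return min(counts, key=counts.get)   # fewest active connections wins

chosen = pick_server(active_connections)
active_connections[chosen] += 1          # the new request is now counted too
print(chosen)                            # Server 2
```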
3) IP Hash / Sticky routing
Routes the same user to the same server.
✅ Useful for session-based systems
❌ Not always ideal for scaling
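A minimal sketch of IP-hash routing (a real load balancer hashes more carefully, but the idea is the same):

```python
# IP hash: hash the client's IP and map it to a server index, so the
# same client always lands on the same server.
import hashlib

servers = ["Server 1", "Server 2", "Server 3"]

def pick_server(client_ip):
    digest = hashlib.sha256(client_ip.encode()).hexdigest()
    return servers[int(digest, 16) % len(servers)]

print(pick_server("203.0.113.7"))   # always the same server for this IP
```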
A Real Example (Why Load Balancing Matters)
Let’s say:
- 1 server can handle 1000 users
- your app suddenly gets 10,000 users
One server will die.
With load balancing:
- you run 10 servers
- each handles ~1000 users
✅ app stays stable
✅ users don’t suffer
✅ the system scales
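The back-of-the-envelope math is just a ceiling division:

```python
# How many servers do we need for the example above?
users = 10_000
capacity_per_server = 1_000
servers_needed = -(-users // capacity_per_server)   # ceiling division
print(servers_needed)   # 10
```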
Sessions Problem (Very Important)
Load balancing creates one common issue: sessions.
The issue
If login/session is stored inside server memory:
- User logs in → routed to Server 1 ✅
- Next request → routed to Server 2 ❌ (Server 2 doesn’t know the session)
So the user suddenly feels “logged out”.
Fix options
✅ Option A: Sticky sessions
The load balancer ensures the same user always goes to the same server.
This works, but has drawbacks in large systems.
✅ Option B (best practice): Central session storage
Store sessions in:
- Redis
- database
Now any server can handle the user.
✅ stateless servers
✅ better scaling
✅ easy server replacement
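Here’s a minimal sketch of Option B using the redis-py client (the host, port, and key names are assumptions):

```python
# Central session storage: any app server can read/write the same session.
import json
import redis

r = redis.Redis(host="localhost", port=6379)

def save_session(session_id, data):
    # written by whichever server handled the login
    r.setex(f"session:{session_id}", 3600, json.dumps(data))   # expires in 1 hour

def load_session(session_id):
    # read by whichever server handles the next request
    raw = r.get(f"session:{session_id}")
    return json.loads(raw) if raw else None

save_session("abc123", {"user_id": 42, "logged_in": True})
print(load_session("abc123"))   # {'user_id': 42, 'logged_in': True}
```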
Health Checks (How LB Detects Broken Servers)
Load balancers constantly check server health using an endpoint like:
/health or /status
Example health check:
GET /health
200 OK
If a server stops responding:
❌ Load balancer removes it temporarily
✅ Sends traffic only to healthy servers
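A minimal sketch of the checking side (the backend URLs are made up):

```python
# The load balancer's view: call /health on each backend and keep only
# the ones that answer 200 OK.
import urllib.request

backends = ["http://127.0.0.1:8001", "http://127.0.0.1:8002", "http://127.0.0.1:8003"]

def is_healthy(base_url):
    try:
        with urllib.request.urlopen(base_url + "/health", timeout=2) as resp:
            return resp.status == 200
    except Exception:
        return False   # timeout, connection refused, 5xx... all count as unhealthy

healthy = [b for b in backends if is_healthy(b)]
print(healthy)   # traffic only goes to these servers
```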
Where Load Balancers Fit in System Architecture
Typical setup looks like this:
Users
↓
CDN (optional)
↓
Load Balancer
↓
App Servers (multiple)
↓
Cache (Redis) + Database
This is how modern apps handle scale.
Load Balancer vs API Gateway (Common Confusion)
They sound similar but do different jobs.
✅ Load Balancer
Main job:
- distribute traffic between servers
✅ API Gateway
Main job:
- manage API requests (smart routing + security)
Includes:
- authentication
- rate limiting
- analytics
- routing to microservices (e.g., /users, /payments)
In large systems, both can exist together.
TL;DR
✅ Load balancing distributes traffic across multiple servers.
Benefits:
- prevents overload
- improves performance
- provides fault tolerance
Common algorithms:
- Round Robin
- Least Connections
- IP Hash / Sticky Sessions