Load Balancing Algorithms Explained 🚦⚖️

Anjusha Felix — Thu, 07 May 2026 09:57:27 +0000

If you’re building scalable systems, understanding load balancing is essential.

A Load Balancer distributes incoming traffic across multiple servers to:

improve performance
prevent server overload
increase availability
handle scaling efficiently

Here are the most common load balancing algorithms every backend/system design engineer should know 👇

🔄 Round Robin Load Balancing

Round Robin is one of the simplest and most commonly used load balancing algorithms.

It distributes incoming requests sequentially across servers in a circular order.

Imagine you have 3 backend servers:

Server A
Server B
Server C

A load balancer sits in front of them. When users send requests:

Request 1 → Server A
Request 2 → Server B
Request 3 → Server C
Request 4 → Server A
Request 5 → Server B

The cycle repeats continuously. That’s why it’s called Round Robin.
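The rotation above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular load balancer implements it; the `next_server` helper and the server names are placeholders taken from the example.

```python
from itertools import cycle

servers = ["Server A", "Server B", "Server C"]

# cycle() yields A, B, C, A, B, C, ... forever.
rotation = cycle(servers)

def next_server():
    """Return the server that should handle the next request."""
    return next(rotation)

# Route five requests, as in the example above.
assignments = [next_server() for _ in range(5)]
for number, server in enumerate(assignments, start=1):
    print(f"Request {number} -> {server}")
```

Note that the sketch keeps no information about how busy each server is. It only remembers whose turn is next.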

🔄 Weighted Round Robin Load Balancing

Weighted Round Robin is an improved version of the Round Robin load balancing algorithm where traffic is distributed based on the capacity of each server instead of sending equal requests to all servers.

In simple Round Robin:

A → B → C → repeat

Every server gets the same number of requests.

But in real-world systems, servers are often NOT equal.

Some servers:

have more CPU
more RAM
better processing power
higher network bandwidth

So sending equal traffic becomes inefficient.

That’s where Weighted Round Robin helps.

Each server is assigned a weight based on its capability.

Example:

Server A → Weight 5
Server B → Weight 3
Server C → Weight 1

This means:

Server A receives 5 of every 9 requests
Server B receives 3 of every 9 requests
Server C receives 1 of every 9 requests

Traffic distribution becomes proportional to server strength.
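One simple way to sketch this, assuming the weights 5, 3, and 1 from the example, is to repeat each server in the rotation as many times as its weight. The names `weights`, `schedule`, and `next_server` are illustrative only.

```python
from collections import Counter
from itertools import cycle

weights = {"Server A": 5, "Server B": 3, "Server C": 1}

# Repeat each server name once per unit of weight, then rotate through the list.
schedule = [name for name, weight in weights.items() for _ in range(weight)]
rotation = cycle(schedule)

def next_server():
    """Return the server that should handle the next request."""
    return next(rotation)

# One full cycle is 5 + 3 + 1 = 9 requests.
counts = Counter(next_server() for _ in range(9))
print(counts)
```

This naive version sends Server A five requests in a row before moving on. Some implementations interleave the servers so the same proportions are reached without bursts, but the overall split per cycle is the same.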

🔗 Least Connections Load Balancing

Least Connections is a dynamic load balancing algorithm that sends incoming requests to the server with the fewest active connections.

Unlike Round Robin or Weighted Round Robin, it does NOT distribute traffic in a fixed order.

Instead, it continuously checks:

Which server is currently least busy?

Then routes the next request there.

Suppose you have 3 servers:

Server A → 120 active connections
Server B → 40 active connections
Server C → 15 active connections

The next request goes to:

Server C

Because it currently has the lowest load.

⚙️ How Least Connections Works Internally

The load balancer maintains a real-time connection counter for each server.

Example:

Server A = 80 connections
Server B = 22 connections
Server C = 5 connections

Incoming request:

→ Assign to Server C

After assignment:

Server C = 6 connections

When a connection closes, that server's counter goes back down. This process repeats for every incoming request.
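The counter logic can be sketched as follows, using the numbers from the example. The `acquire` and `release` helpers are hypothetical names, and a real load balancer would also need to make these updates safe under concurrent requests.

```python
# Active connections per server, as in the example above.
connections = {"Server A": 80, "Server B": 22, "Server C": 5}

def acquire():
    """Pick the server with the fewest active connections and count the new one."""
    server = min(connections, key=connections.get)
    connections[server] += 1
    return server

def release(server):
    """Call when a connection closes so the counter stays accurate."""
    connections[server] -= 1

chosen = acquire()
print(chosen, connections[chosen])
```

Calling `release(chosen)` once the request finishes brings the counter back down, which is what keeps the picture of "least busy" up to date.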

🌐 IP Hash Load Balancing

IP Hash is a load balancing algorithm where the client’s IP address is used to decide which server will handle the request.

Instead of distributing requests randomly or sequentially, the load balancer applies a hashing function on the client IP address.

Example:

Hash(192.168.1.10) → Server B
Hash(192.168.1.11) → Server A

This means the same user usually gets routed to the same server every time.

Suppose your infrastructure has:

Server A
Server B
Server C

When User 1 sends a request:

IP = 101.23.45.10

The load balancer calculates:

Hash(IP) % Number of Servers

Result:

→ Server B

Whenever the same user sends another request:

login
add to cart
payment
profile access

they continue reaching:

Server B

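This creates session consistency (often called sticky sessions), as long as the client's IP address and the number of servers stay the same. If a server is added or removed, Hash(IP) % Number of Servers changes for most clients, and they may land on a different server.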
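Here is a minimal sketch of the Hash(IP) % Number of Servers step. The choice of SHA-256 is an assumption made for illustration; the article does not say which hash function is used, so the server this code picks for a given IP will not necessarily match the example above.

```python
import hashlib

servers = ["Server A", "Server B", "Server C"]

def server_for(ip: str) -> str:
    """Map a client IP to a server using hash(ip) % number_of_servers."""
    # A stable hash is used on purpose: Python's built-in hash() for strings
    # is randomized per process, so it would not give repeatable routing.
    digest = hashlib.sha256(ip.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]

# The same IP maps to the same server on every call.
first = server_for("101.23.45.10")
second = server_for("101.23.45.10")
print(first, second)
```

The important property is repeatability: as long as `servers` does not change, every request from the same IP resolves to the same index.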

⚡ Least Response Time Load Balancing

Least Response Time is an intelligent load balancing algorithm that routes incoming requests to the server responding the fastest.

Unlike Round Robin, which distributes requests equally, Least Response Time continuously monitors server performance in real time.

It checks:

response latency
server speed
sometimes active connections

Then forwards traffic to the best-performing server.

Suppose you have 3 servers:

Server	Average Response Time
Server A	250ms
Server B	90ms
Server C	40ms

The next request goes to:

Server C
because it is responding the fastest.
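A rough sketch of this selection is shown below, starting from the averages in the table. How the average is maintained is an assumption: this version uses an exponentially weighted moving average with a made-up smoothing factor `ALPHA`, and the `record` and `pick` helpers are hypothetical names.

```python
# Average response time per server in milliseconds, from the table above.
avg_response_ms = {"Server A": 250.0, "Server B": 90.0, "Server C": 40.0}

ALPHA = 0.2  # assumed smoothing factor: higher reacts faster to new measurements

def record(server, elapsed_ms):
    """Fold a new measurement into the server's running average."""
    previous = avg_response_ms[server]
    avg_response_ms[server] = (1 - ALPHA) * previous + ALPHA * elapsed_ms

def pick():
    """Return the server with the lowest average response time."""
    return min(avg_response_ms, key=avg_response_ms.get)

first_choice = pick()

# Server C suddenly takes 1000 ms: 0.8 * 40 + 0.2 * 1000 = 232 ms average.
record("Server C", 1000.0)
second_choice = pick()
print(first_choice, second_choice)
```

After the slow response, Server C's average rises above Server B's 90 ms, so the next request shifts to Server B. That is the "continuously monitors" part in action.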

🧠 Adaptive Load Balancing

Adaptive Load Balancing is an advanced and intelligent load balancing technique where traffic distribution changes dynamically based on real-time server conditions.

Unlike traditional algorithms such as:

Round Robin
Weighted Round Robin
Least Connections

Adaptive Load Balancing continuously monitors several health and performance signals at once before routing requests.

It considers factors like:

CPU usage
memory consumption
response latency
active connections
server health
network traffic
request failure rate

Then automatically decides:

Which server can currently handle traffic most efficiently?

Suppose you have 3 servers:

Server	CPU Usage	Response Time	Health
Server A	90%	300ms	Healthy
Server B	45%	70ms	Healthy
Server C	20%	40ms	Healthy

Even if Server A is very powerful, it is currently overloaded.

An adaptive load balancer automatically routes new traffic to:

Server C

because it has:

low CPU usage
faster response time
better availability
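One possible way to combine these signals is a single score per server, sketched below with the values from the table. The scoring formula and its 50/50 weighting are purely illustrative assumptions; real adaptive balancers differ in which metrics they use and how they weigh them. `ServerStats`, `score`, and `pick` are hypothetical names.

```python
from dataclasses import dataclass

@dataclass
class ServerStats:
    name: str
    cpu_percent: float
    response_ms: float
    healthy: bool

def score(stats: ServerStats) -> float:
    """Lower is better. Equal weighting of CPU and latency is an assumption."""
    return 0.5 * (stats.cpu_percent / 100) + 0.5 * (stats.response_ms / 1000)

def pick(servers):
    """Skip unhealthy servers, then choose the one with the lowest score."""
    candidates = [s for s in servers if s.healthy]
    if not candidates:
        raise RuntimeError("no healthy servers available")
    return min(candidates, key=score)

fleet = [
    ServerStats("Server A", cpu_percent=90, response_ms=300, healthy=True),
    ServerStats("Server B", cpu_percent=45, response_ms=70, healthy=True),
    ServerStats("Server C", cpu_percent=20, response_ms=40, healthy=True),
]

best = pick(fleet)
print(best.name)
```

If Server C later fails its health check, it drops out of the candidate list and the same function falls back to Server B without any configuration change.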

DEV Community: Anjusha Felix
