Anjusha Felix
Load Balancing Algorithms Explained 🚦⚖️

If you're building scalable systems, understanding load balancing is essential.

A Load Balancer distributes incoming traffic across multiple servers to:

  • improve performance
  • prevent server overload
  • increase availability
  • handle scaling efficiently

Here are the most common load balancing algorithms every backend/system design engineer should know 👇

🔄 Round Robin Load Balancing

Round Robin is the simplest and most commonly used load balancing algorithm.

It distributes incoming requests sequentially across servers in a circular order.

Imagine you have 3 backend servers:

Server A
Server B
Server C

A load balancer sits in front of them. When users send requests:

Request 1 → Server A
Request 2 → Server B
Request 3 → Server C
Request 4 → Server A
Request 5 → Server B

The cycle repeats continuously. That's why it's called Round Robin.
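The rotation above can be sketched in a few lines of Python (the server names are just placeholders):

```python
from itertools import cycle

# Hypothetical pool of three backend servers
servers = ["Server A", "Server B", "Server C"]
rotation = cycle(servers)  # repeats A -> B -> C -> A -> ...

# Assign five incoming requests in circular order
assignments = [next(rotation) for _ in range(5)]
for i, server in enumerate(assignments, start=1):
    print(f"Request {i} -> {server}")
```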

🔄 Weighted Round Robin Load Balancing

Weighted Round Robin is an improved version of the Round Robin algorithm in which traffic is distributed according to each server's capacity, instead of sending an equal share of requests to every server.

In simple Round Robin:

A → B → C → repeat

Every server gets the same number of requests.

But in real-world systems, servers are often NOT equal.

Some servers have:

  • more CPU cores
  • more RAM
  • better processing power
  • higher network bandwidth

So sending equal traffic becomes inefficient.

That's where Weighted Round Robin helps.

Each server is assigned a weight based on its capability.

Example:

Server A → Weight 5
Server B → Weight 3
Server C → Weight 1

This means:

  • Server A receives most requests
  • Server B receives moderate requests
  • Server C receives fewer requests

Traffic distribution becomes proportional to server strength.
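One simple way to sketch this in Python is to repeat each server in the rotation as many times as its weight. (Production balancers such as NGINX use a smoother interleaving, but the resulting proportions are the same.)

```python
from itertools import cycle, islice

# Weights from the example above
weights = {"Server A": 5, "Server B": 3, "Server C": 1}

# Naive weighted rotation: each server appears once per unit of weight
pool = [server for server, weight in weights.items() for _ in range(weight)]
rotation = cycle(pool)

# Over one full cycle of 9 requests, A gets 5, B gets 3, C gets 1
first_cycle = list(islice(rotation, sum(weights.values())))
print({server: first_cycle.count(server) for server in weights})
```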

🔗 Least Connections Load Balancing

Least Connections is a dynamic load balancing algorithm that sends incoming requests to the server with the fewest active connections.

Unlike Round Robin or Weighted Round Robin, it does NOT distribute traffic in fixed order.

Instead, it continuously checks:

Which server is currently least busy?

Then routes the next request there.

Suppose you have 3 servers:

Server A → 120 active connections
Server B → 40 active connections
Server C → 15 active connections

The next request goes to:

Server C

Because it currently has the lowest load.

โš™๏ธ How the Algorithm Works Internally

The load balancer maintains a real-time connection counter.

Example:

Server A = 80 connections
Server B = 22 connections
Server C = 5 connections

Incoming request:

→ Assign to Server C

After assignment:

Server C = 6 connections

This process repeats continuously.
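A minimal sketch of that counter logic in Python (the counters are hypothetical, and a real balancer would also decrement them when connections close):

```python
# Hypothetical live connection counters
connections = {"Server A": 80, "Server B": 22, "Server C": 5}

def assign_request(counters):
    """Pick the server with the fewest active connections and bump its counter."""
    target = min(counters, key=counters.get)
    counters[target] += 1
    return target

print(assign_request(connections))   # Server C had the fewest connections
print(connections["Server C"])       # its counter is now 6
```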

๐ŸŒ IP Hash Load Balancing

IP Hash is a load balancing algorithm where the client's IP address is used to decide which server will handle the request.

Instead of distributing requests randomly or sequentially, the load balancer applies a hashing function on the client IP address.

Example:

Hash(192.168.1.10) → Server B
Hash(192.168.1.11) → Server A

This means the same user usually gets routed to the same server every time.

Suppose your infrastructure has:

Server A
Server B
Server C

When User 1 sends a request:

IP = 101.23.45.10

The load balancer calculates:

Hash(IP) % Number of Servers

Result:

→ Server B

Whenever the same user sends another request:

  • login
  • add to cart
  • payment
  • profile access

they continue reaching:

Server B

This creates session consistency, also known as sticky sessions or session affinity.
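A rough Python sketch of the `Hash(IP) % Number of Servers` idea (MD5 is just one possible hash function here; real balancers vary):

```python
import hashlib

servers = ["Server A", "Server B", "Server C"]

def pick_server(client_ip: str) -> str:
    """Hash the client IP and map it onto the server list."""
    digest = hashlib.md5(client_ip.encode()).hexdigest()
    return servers[int(digest, 16) % len(servers)]

# The same IP is always routed to the same server
assert pick_server("101.23.45.10") == pick_server("101.23.45.10")
print(pick_server("101.23.45.10"))
```

One caveat: with plain modulo hashing, adding or removing a server changes `len(servers)` and reshuffles most mappings; consistent hashing is the usual fix for that.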

⚡ Least Response Time Load Balancing

Least Response Time is an intelligent load balancing algorithm that routes incoming requests to the server responding the fastest.

Unlike Round Robin, which distributes requests equally, Least Response Time continuously monitors server performance in real time.

It checks:

  • response latency
  • server speed
  • sometimes active connections

Then forwards traffic to the best-performing server.

Suppose you have 3 servers:

| Server   | Average Response Time |
|----------|-----------------------|
| Server A | 250 ms                |
| Server B | 90 ms                 |
| Server C | 40 ms                 |

The next request goes to:

Server C

because it is responding the fastest.
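As a sketch, assuming the balancer keeps a rolling average of latencies (the numbers are the hypothetical ones from the table):

```python
# Hypothetical rolling-average response times, in milliseconds
avg_response_ms = {"Server A": 250, "Server B": 90, "Server C": 40}

def fastest_server(latencies):
    """Route the next request to the server with the lowest measured latency."""
    return min(latencies, key=latencies.get)

print(fastest_server(avg_response_ms))  # Server C, at 40 ms
```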

🧠 Adaptive Load Balancing

Adaptive Load Balancing is an advanced and intelligent load balancing technique where traffic distribution changes dynamically based on real-time server conditions.

Unlike traditional algorithms such as:

  • Round Robin
  • Weighted Round Robin
  • Least Connections

Adaptive Load Balancing continuously monitors the health and performance of the infrastructure before routing requests.

It considers factors like:

  • CPU usage
  • memory consumption
  • response latency
  • active connections
  • server health
  • network traffic
  • request failure rate

Then automatically decides:

Which server can currently handle traffic most efficiently?

Suppose you have 3 servers:

| Server   | CPU Usage | Response Time | Health  |
|----------|-----------|---------------|---------|
| Server A | 90%       | 300 ms        | Healthy |
| Server B | 45%       | 70 ms         | Healthy |
| Server C | 20%       | 40 ms         | Healthy |

Even though Server A may be the most powerful machine, it is currently overloaded.

The adaptive load balancer automatically routes new traffic to:

Server C

because it has:

  • low CPU usage
  • faster response time
  • better availability
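A toy scoring function shows the idea. Combining CPU and latency into a single score, and the specific weighting used, are illustrative assumptions; real adaptive balancers have their own metrics and tuning.

```python
# Hypothetical real-time metrics per server
metrics = {
    "Server A": {"cpu": 0.90, "latency_ms": 300, "healthy": True},
    "Server B": {"cpu": 0.45, "latency_ms": 70,  "healthy": True},
    "Server C": {"cpu": 0.20, "latency_ms": 40,  "healthy": True},
}

def pick_adaptive(stats):
    """Score each healthy server (lower is better) and pick the best one."""
    scores = {
        server: m["cpu"] * 100 + m["latency_ms"]  # illustrative weighting
        for server, m in stats.items()
        if m["healthy"]
    }
    return min(scores, key=scores.get)

print(pick_adaptive(metrics))  # Server C scores lowest
```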
