If you're building scalable systems, understanding load balancing is essential.
A load balancer distributes incoming traffic across multiple servers to:
- improve performance
- prevent server overload
- increase availability
- handle scaling efficiently
Here are the most common load balancing algorithms every backend/system design engineer should know 👇
📌 Round Robin Load Balancing
Round Robin is the simplest and most commonly used load balancing algorithm.
It distributes incoming requests sequentially across servers in a circular order.
Imagine you have 3 backend servers:
Server A
Server B
Server C
A load balancer sits in front of them. When users send requests:
Request 1 → Server A
Request 2 → Server B
Request 3 → Server C
Request 4 → Server A
Request 5 → Server B
The cycle repeats continuously. That's why it's called Round Robin.
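The rotation above can be sketched in a few lines of Python (a minimal illustration — the class and server names are hypothetical):

```python
from itertools import cycle

class RoundRobinBalancer:
    """Hands out servers in a fixed circular order."""

    def __init__(self, servers):
        self._servers = cycle(servers)  # infinite A, B, C, A, B, C, ...

    def next_server(self):
        return next(self._servers)

lb = RoundRobinBalancer(["Server A", "Server B", "Server C"])
for request_id in range(1, 6):
    print(f"Request {request_id} → {lb.next_server()}")
# Request 4 wraps back around to Server A
```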
📌 Weighted Round Robin Load Balancing
Weighted Round Robin improves on plain Round Robin by distributing traffic in proportion to each server's capacity, instead of sending every server an equal share of requests.
In simple Round Robin:
A → B → C → repeat
Every server gets the same number of requests.
But in real-world systems, servers are often NOT equal.
Some servers have:
- more CPU cores
- more RAM
- better processing power
- higher network bandwidth
So sending equal traffic becomes inefficient.
That's where Weighted Round Robin helps.
Each server is assigned a weight based on its capability.
Example:
Server A → Weight 5
Server B → Weight 3
Server C → Weight 1
This means:
- Server A receives most requests
- Server B receives moderate requests
- Server C receives fewer requests
Traffic distribution becomes proportional to server strength.
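A minimal Python sketch of this idea, using the simplest possible scheme — repeating each server in the schedule according to its weight (server names and weights mirror the example above):

```python
class WeightedRoundRobinBalancer:
    """Each server appears in the rotation as many times as its weight."""

    def __init__(self, weighted_servers):
        # weighted_servers: list of (server, weight) pairs
        self._schedule = [s for s, w in weighted_servers for _ in range(w)]
        self._index = 0

    def next_server(self):
        server = self._schedule[self._index]
        self._index = (self._index + 1) % len(self._schedule)
        return server

lb = WeightedRoundRobinBalancer([("Server A", 5), ("Server B", 3), ("Server C", 1)])
counts = {}
for _ in range(9):  # one full cycle: weights 5 + 3 + 1
    server = lb.next_server()
    counts[server] = counts.get(server, 0) + 1
print(counts)  # {'Server A': 5, 'Server B': 3, 'Server C': 1}
```

Note that this naive schedule sends requests to one server in bursts; production balancers such as NGINX use a "smooth" weighted round robin that interleaves servers while keeping the same proportions.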
📌 Least Connections Load Balancing
Least Connections is a dynamic load balancing algorithm that sends incoming requests to the server with the fewest active connections.
Unlike Round Robin or Weighted Round Robin, it does NOT distribute traffic in fixed order.
Instead, it continuously checks:
Which server is currently least busy?
Then routes the next request there.
Suppose you have 3 servers:
Server A → 120 active connections
Server B → 40 active connections
Server C → 15 active connections
The next request goes to:
Server C
Because it currently has the lowest load.
⚙️ How the Algorithm Works Internally
The load balancer maintains a real-time connection counter.
Example:
Server A = 80 connections
Server B = 22 connections
Server C = 5 connections
Incoming request:
→ Assign to Server C
After assignment:
Server C = 6 connections
This process repeats continuously.
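That counter logic can be sketched in Python (seeded with the numbers from the example above; the `assign`/`release` method names are made up for illustration):

```python
class LeastConnectionsBalancer:
    """Routes each request to the server with the fewest active connections."""

    def __init__(self, servers):
        self.connections = {s: 0 for s in servers}

    def assign(self):
        # Pick the least-busy server and bump its counter.
        server = min(self.connections, key=self.connections.get)
        self.connections[server] += 1
        return server

    def release(self, server):
        # Called when a connection finishes.
        self.connections[server] -= 1

lb = LeastConnectionsBalancer(["Server A", "Server B", "Server C"])
lb.connections.update({"Server A": 80, "Server B": 22, "Server C": 5})
print(lb.assign())                 # Server C
print(lb.connections["Server C"])  # 6
```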
📌 IP Hash Load Balancing
IP Hash is a load balancing algorithm where the client's IP address is used to decide which server will handle the request.
Instead of distributing requests randomly or sequentially, the load balancer applies a hashing function on the client IP address.
Example:
Hash(192.168.1.10) → Server B
Hash(192.168.1.11) → Server A
This means the same user usually gets routed to the same server every time.
Suppose your infrastructure has:
Server A
Server B
Server C
When User 1 sends a request:
IP = 101.23.45.10
The load balancer calculates:
Hash(IP) % Number of Servers
Result:
→ Server B
Whenever the same user sends another request:
- login
- add to cart
- payment
- profile access
they continue reaching:
Server B
This creates session consistency.
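Here is a sketch of the `Hash(IP) % Number of Servers` step in Python. One caveat: Python's built-in `hash()` is randomized per process, so a stable digest (MD5 here) is needed for consistent routing. Which server a given IP lands on depends on the hash function, so the result is not guaranteed to match the example above.

```python
import hashlib

def pick_server(client_ip, servers):
    """Stable hash of the client IP, mapped onto the server list."""
    digest = hashlib.md5(client_ip.encode()).hexdigest()
    return servers[int(digest, 16) % len(servers)]

servers = ["Server A", "Server B", "Server C"]

# The same IP always maps to the same server → session consistency.
first = pick_server("101.23.45.10", servers)
assert all(pick_server("101.23.45.10", servers) == first for _ in range(100))
print(first)
```

One known drawback: with plain modulo hashing, adding or removing a server reshuffles most IP-to-server mappings, which is why large systems often use consistent hashing instead.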
⚡ Least Response Time Load Balancing
Least Response Time is an intelligent load balancing algorithm that routes incoming requests to the server responding the fastest.
Unlike Round Robin, which distributes requests equally, Least Response Time continuously monitors server performance in real time.
It checks:
- response latency
- server speed
- sometimes active connections
Then forwards traffic to the best-performing server.
Suppose you have 3 servers:
| Server | Average Response Time |
|---|---|
| Server A | 250ms |
| Server B | 90ms |
| Server C | 40ms |
The next request goes to:
Server C
because it is responding the fastest.
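A minimal sketch of this in Python: track a moving average of each server's latency and pick the minimum. The EWMA smoothing factor `alpha` is an illustrative choice, not a standard value.

```python
class LeastResponseTimeBalancer:
    """Routes to the server with the lowest observed average latency."""

    def __init__(self, servers):
        self.avg_latency_ms = {s: 0.0 for s in servers}

    def pick(self):
        return min(self.avg_latency_ms, key=self.avg_latency_ms.get)

    def record(self, server, latency_ms, alpha=0.2):
        # Exponentially weighted moving average keeps the estimate fresh.
        old = self.avg_latency_ms[server]
        self.avg_latency_ms[server] = (1 - alpha) * old + alpha * latency_ms

lb = LeastResponseTimeBalancer(["Server A", "Server B", "Server C"])
lb.avg_latency_ms.update({"Server A": 250.0, "Server B": 90.0, "Server C": 40.0})
print(lb.pick())  # Server C — currently the fastest
```

Because the averages are updated on every completed request, a server that starts slowing down naturally stops receiving new traffic.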
🧠 Adaptive Load Balancing
Adaptive Load Balancing is an advanced and intelligent load balancing technique where traffic distribution changes dynamically based on real-time server conditions.
Unlike traditional algorithms such as:
- Round Robin
- Weighted Round Robin
- Least Connections
Adaptive Load Balancing continuously monitors the health and performance of the infrastructure before routing requests.
It considers factors like:
- CPU usage
- memory consumption
- response latency
- active connections
- server health
- network traffic
- request failure rate
Then automatically decides:
Which server can currently handle traffic most efficiently?
Suppose you have 3 servers:
| Server | CPU Usage | Response Time | Health |
|---|---|---|---|
| Server A | 90% | 300ms | Healthy |
| Server B | 45% | 70ms | Healthy |
| Server C | 20% | 40ms | Healthy |
Even if Server A is very powerful, it is currently overloaded.
Adaptive Load Balancer automatically routes new traffic to:
Server C
because it has:
- low CPU usage
- faster response time
- better availability
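The decision above can be sketched as a simple scoring function in Python. The metric names and the 50/50 weighting are illustrative assumptions, not a standard formula — real adaptive balancers tune these weights and feed in many more signals.

```python
def score(m, w_cpu=0.5, w_latency=0.5):
    """Lower score = better candidate; unhealthy servers are excluded."""
    if not m["healthy"]:
        return float("inf")  # never route to an unhealthy server
    return w_cpu * m["cpu_pct"] + w_latency * m["latency_ms"]

def pick_server(all_metrics):
    return min(all_metrics, key=lambda s: score(all_metrics[s]))

metrics = {
    "Server A": {"cpu_pct": 90, "latency_ms": 300, "healthy": True},
    "Server B": {"cpu_pct": 45, "latency_ms": 70, "healthy": True},
    "Server C": {"cpu_pct": 20, "latency_ms": 40, "healthy": True},
}
print(pick_server(metrics))  # Server C — lowest combined score
```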