Jaimin Bariya

Load Balancing Algorithms with Examples

These algorithms are critical in distributing network or application traffic across multiple servers, improving efficiency, reducing latency, and providing fault tolerance. Each algorithm has unique features tailored for specific scenarios, and choosing the right one depends on the application requirements and the server infrastructure. Here’s a detailed breakdown of popular load-balancing algorithms:

1. Round Robin

  • How It Works: Requests are distributed to each server in a circular sequence. If you have three servers, the first request goes to Server 1, the second to Server 2, the third to Server 3, and then it starts over with Server 1.
  • Advantages:
    • Simple and Fair: Ensures all servers receive an equal share of requests over time.
    • Easy to Implement: Very basic to configure.
  • Disadvantages:
    • Doesn’t consider the current load on each server. If one server is faster than others or has fewer resources, it might not handle its share as efficiently.

2. Least Connections

  • How It Works: The load balancer sends requests to the server with the fewest active connections. This approach assumes that fewer active connections correlate to lower load.
  • Advantages:
    • Great for servers with varying capacities or where each request consumes a different amount of resources.
    • Keeps the servers more balanced in terms of actual load, rather than just counting requests.
  • Disadvantages:
    • Requires the load balancer to track active connections on every server, which adds a small amount of overhead.

3. IP Hashing

  • How It Works: Uses a hash function based on the client’s IP address to assign requests to a specific server. This way, requests from a particular client will always go to the same server.
  • Advantages:
    • Sticky Sessions: Helps ensure that a specific client (or session) is consistently handled by the same server, which is useful for applications that maintain user state.
    • Good for consistent routing where data locality is important.
  • Disadvantages:
    • If one server goes down, clients that hashed to it are remapped to a different server and may lose session data; adding or removing servers can also reshuffle many clients unless consistent hashing is used.

4. Weighted Round Robin

  • How It Works: Similar to Round Robin, but weights are assigned to each server based on their capacity. For instance, if Server 1 can handle twice the load of Server 2, Server 1 might get two requests for every one that Server 2 receives.
  • Advantages:
    • More Control Over Load Distribution: Servers with higher weights get more traffic, matching their capacity.
    • Flexible for environments where servers have unequal resources.
  • Disadvantages:
    • Requires careful tuning of weights, and the weights must be revisited whenever server capacities change.

5. Weighted Least Connections

  • How It Works: Combines the Least Connections and Weighted strategies. Requests are sent to the server with the fewest active connections, but servers with higher weights are given preference.
  • Advantages:
    • Balances load more accurately by considering both server capacity and current load.
    • Useful for applications with dynamic resource needs where each connection has different demands.
  • Disadvantages:
    • More complex to implement and can introduce a bit more processing overhead.

6. Random

  • How It Works: Assigns each request to a server at random. This is typically used in scenarios where all servers have equal capacity and there’s no need for complex balancing.
  • Advantages:
    • Simple to implement and, on average, spreads load evenly across all servers.
    • Avoids certain biases that other algorithms might introduce.
  • Disadvantages:
    • Doesn’t consider server load or connections, so some servers might be overloaded, especially in cases of varying server performance.

7. Source IP Affinity (Sticky Sessions)

  • How It Works: Similar to IP Hashing but focuses on ensuring that each client’s session remains with a specific server. Useful when a user session must stay on one server for continuity.
  • Advantages:
    • Maintains session consistency for users, which is crucial for applications like shopping carts or user dashboards.
  • Disadvantages:
    • If one server fails, session data may be lost if not properly managed with session replication across servers.

8. Geographic (Geo-Location Based) Load Balancing

  • How It Works: Routes requests based on the client’s geographic location. Often used for content delivery networks (CDNs) and global applications.
  • Advantages:
    • Improves latency by directing users to the nearest server, reducing data travel time.
    • Useful for regional regulations and data sovereignty.
  • Disadvantages:
    • Requires more complex infrastructure and possibly multiple data centers across regions.

9. Adaptive Algorithms

  • How It Works: Uses real-time metrics like server load, CPU usage, or response time to make decisions. These algorithms can adjust based on the actual performance of each server.
  • Advantages:
    • Highly responsive to changing loads and server states, adapting in real-time to ensure optimal performance.
  • Disadvantages:
    • Resource-intensive to monitor and analyze metrics constantly.
    • Requires robust monitoring and metrics systems.

Choosing the Right Algorithm

  • If traffic is steady and servers are identical, Round Robin or Random can be a simple, effective choice.
  • For more variable or state-sensitive applications, IP Hashing or Source IP Affinity can help ensure session persistence.
  • In environments with unequal servers or fluctuating loads, Weighted Round Robin or Adaptive Load Balancing may be best.
  • Geographic Load Balancing is ideal for global applications where latency and data regulations matter.

Each algorithm comes with trade-offs in performance, complexity, and server utilization. For example, Weighted Least Connections is effective but more complex, while Round Robin is simple but less precise in load handling.


Let's go through examples for each load balancing algorithm to see how they would work in real-world scenarios. 🚀


1. Round Robin

  • Example: Imagine a website with three servers (A, B, and C) serving traffic. In Round Robin:
    • The first request goes to Server A.
    • The second goes to Server B.
    • The third goes to Server C.
    • It then repeats, cycling through A, B, and C, no matter the load each server has.
  • Real-World Use: Used for simple websites or applications where each request has about the same workload, like static content delivery.
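To make the rotation concrete, here is a minimal Python sketch of the cycle above (the server names and request count are placeholders, not tied to any particular load balancer):

```python
from itertools import cycle

servers = ["Server A", "Server B", "Server C"]
rotation = cycle(servers)  # repeats A, B, C, A, B, C, ...

for request_id in range(1, 7):
    # Each incoming request simply takes the next server in the cycle,
    # regardless of how busy that server currently is.
    print(f"Request {request_id} -> {next(rotation)}")
```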

2. Least Connections

  • Example: Suppose a video streaming service has three servers, but some users stream high-resolution videos while others stream lower-resolution ones. Some connections are more resource-intensive.
    • If Server A has 5 active connections, Server B has 3, and Server C has 2, a new user will be directed to Server C (fewest active connections).
  • Real-World Use: Good for video streaming sites or online gaming platforms, where different requests use different amounts of resources.
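As a rough sketch, assuming the balancer keeps an in-memory count of active connections like the one in the example above:

```python
# Hypothetical snapshot of active connections per server.
active_connections = {"Server A": 5, "Server B": 3, "Server C": 2}

def pick_least_connections(connections: dict) -> str:
    # Pick the server currently holding the fewest active connections.
    return min(connections, key=connections.get)

target = pick_least_connections(active_connections)
print(target)                    # Server C
active_connections[target] += 1  # the new stream now counts against Server C
```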

3. IP Hashing

  • Example: A user logs into a shopping website from their home IP address. With IP Hashing:
    • The load balancer hashes the user’s IP to always direct them to Server B.
    • If the user returns to the site later, they’ll still be routed to Server B, maintaining their session data.
  • Real-World Use: Useful in e-commerce or banking, where users need consistent server sessions for things like shopping carts or account data.
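A minimal sketch of the idea, hashing the client IP and taking it modulo the number of servers (real load balancers may use different hash functions or consistent hashing, so treat this as illustrative only):

```python
import hashlib

servers = ["Server A", "Server B", "Server C"]

def pick_by_ip(client_ip: str) -> str:
    # Hash the client's IP and map the digest onto one of the servers.
    digest = hashlib.sha256(client_ip.encode()).hexdigest()
    return servers[int(digest, 16) % len(servers)]

# The same IP always lands on the same server while the server list is unchanged.
print(pick_by_ip("203.0.113.42"))
print(pick_by_ip("203.0.113.42"))  # identical result
```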

4. Weighted Round Robin

  • Example: Say there are three servers: Server A (high capacity), Server B (medium), and Server C (low). Server A gets a weight of 3, B gets 2, and C gets 1. The load balancer will send:
    • Three requests to Server A,
    • Two requests to Server B,
    • One request to Server C,
    • Then it repeats this pattern.
  • Real-World Use: Great for websites with a mix of powerful and less powerful servers, where load should be distributed based on capacity.
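A simple way to sketch this in Python is to repeat each server in the rotation according to its weight; production balancers usually interleave the pattern more smoothly, but the proportions are the same. The 3/2/1 weights are the hypothetical values from the example:

```python
from itertools import cycle

# Hypothetical weights: Server A is roughly three times as capable as Server C.
weights = {"Server A": 3, "Server B": 2, "Server C": 1}

# Build a rotation in which each server appears as many times as its weight.
rotation = cycle([name for name, w in weights.items() for _ in range(w)])

for request_id in range(1, 7):
    print(f"Request {request_id} -> {next(rotation)}")
# -> A, A, A, B, B, C, then the pattern repeats
```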

5. Weighted Least Connections

  • Example: Consider a messaging app with three servers of varying power levels. Server A (high capacity) has a weight of 3, Server B has 2, and Server C has 1.
    • If Server A has 4 connections, B has 3, and C has 2, the balancer compares connections relative to weight (A: 4/3 ≈ 1.33, B: 3/2 = 1.5, C: 2/1 = 2) and sends the new request to Server A, as the sketch below shows.
  • Real-World Use: Perfect for applications where server resources are highly variable, such as large SaaS platforms or real-time gaming backends.
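Here is a minimal sketch of that decision, using the hypothetical weights and connection counts above; the balancer picks the lowest connections-to-weight ratio:

```python
# Hypothetical snapshot: weight reflects capacity, connections reflect current load.
servers = {
    "Server A": {"weight": 3, "connections": 4},
    "Server B": {"weight": 2, "connections": 3},
    "Server C": {"weight": 1, "connections": 2},
}

def pick_weighted_least_connections(state: dict) -> str:
    # Lowest active-connections-per-unit-of-weight wins.
    return min(state, key=lambda s: state[s]["connections"] / state[s]["weight"])

print(pick_weighted_least_connections(servers))
# Server A: 4/3 ≈ 1.33, Server B: 3/2 = 1.5, Server C: 2/1 = 2 -> Server A
```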

6. Random

  • Example: A news website has three identical servers, and the load balancer assigns requests at random.
    • Each request randomly hits Server A, B, or C without considering active load or connections.
  • Real-World Use: Works well for simple applications with identical servers and non-session-dependent traffic, like news or blog sites.
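The corresponding sketch is about as small as load balancing gets:

```python
import random

servers = ["Server A", "Server B", "Server C"]

def pick_random() -> str:
    # No state, no counters: every server is equally likely for every request.
    return random.choice(servers)

print(pick_random())
```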

7. Source IP Affinity (Sticky Sessions)

  • Example: For a social media platform, users need to stay on the same server for the duration of their session. When a user logs in:
    • The load balancer assigns their IP to Server B, and they stay on that server until they log out or their session times out.
  • Real-World Use: Ideal for applications needing session consistency, like social media, banking, or any web app where session continuity is important.
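A minimal sketch of sticky sessions keyed by client IP; real balancers often use cookies or a hash instead of an explicit table, so the mapping dictionary here is purely illustrative:

```python
servers = ["Server A", "Server B", "Server C"]
affinity = {}  # remembers which server each client IP was pinned to

def pick_sticky(client_ip: str) -> str:
    # First request from an IP gets assigned a server; later requests reuse it.
    if client_ip not in affinity:
        affinity[client_ip] = servers[len(affinity) % len(servers)]
    return affinity[client_ip]

print(pick_sticky("198.51.100.7"))  # assigned on first request
print(pick_sticky("198.51.100.7"))  # same server again (sticky)
```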

8. Geographic (Geo-Location Based) Load Balancing

  • Example: A global e-commerce website has data centers in the US, Europe, and Asia. With geo-location:
    • Users in the US are routed to the US data center.
    • European users are sent to the European servers.
    • Asian users connect to the data center in Asia, improving latency and response time.
  • Real-World Use: Essential for global applications, like e-commerce or streaming platforms, where users expect fast responses and compliance with data regulations.
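As a sketch, geo-routing boils down to a lookup from the client's region to the nearest data center. In practice the region would come from a GeoIP database or DNS-based routing, and the hostnames below are placeholders:

```python
# Hypothetical mapping from region to the nearest data center.
region_to_datacenter = {
    "US": "us-east.example.com",
    "EU": "eu-west.example.com",
    "ASIA": "ap-south.example.com",
}

def pick_by_region(client_region: str) -> str:
    # Unknown regions fall back to the US data center.
    return region_to_datacenter.get(client_region, region_to_datacenter["US"])

print(pick_by_region("EU"))  # eu-west.example.com
print(pick_by_region("BR"))  # falls back to us-east.example.com
```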

9. Adaptive Algorithms

  • Example: An online gaming platform uses adaptive load balancing to monitor server health in real-time. Servers with lower CPU and memory usage are given more traffic, while heavily loaded servers receive fewer requests.
    • If Server B’s CPU usage spikes due to an increase in active players, the load balancer will dynamically reduce the load on that server, redistributing to other servers.
  • Real-World Use: Used by complex, dynamic applications like gaming, financial trading platforms, or machine learning services where server performance varies constantly.
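A toy sketch of the idea, scoring servers by a blend of CPU and memory utilization; the metric values and the 0.6/0.4 weighting are arbitrary assumptions, and real systems would pull these numbers from a monitoring pipeline:

```python
# Hypothetical real-time metrics reported by each server (0.0-1.0 utilization).
metrics = {
    "Server A": {"cpu": 0.35, "memory": 0.50},
    "Server B": {"cpu": 0.90, "memory": 0.80},  # spiking under heavy player load
    "Server C": {"cpu": 0.55, "memory": 0.40},
}

def pick_adaptive(stats: dict) -> str:
    # Score each server by blended utilization; the least-loaded server wins.
    score = lambda s: 0.6 * stats[s]["cpu"] + 0.4 * stats[s]["memory"]
    return min(stats, key=score)

print(pick_adaptive(metrics))  # Server A (lowest combined utilization)
```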

Summary Table

| Algorithm | Real-World Use Case | Key Feature |
| --- | --- | --- |
| Round Robin | Simple websites, static content delivery | Cyclical, equal distribution |
| Least Connections | Streaming, gaming | Balances by active connections |
| IP Hashing | E-commerce, banking | Session consistency by IP |
| Weighted Round Robin | Sites with mixed server capacities | Balances by server capacity |
| Weighted Least Connections | SaaS platforms, gaming backends | Capacity + active connection balancing |
| Random | Blogs, news sites with identical servers | Simple random distribution |
| Source IP Affinity | Social media, user dashboards | Keeps user on same server |
| Geographic | Global services like e-commerce, streaming | Routes by location |
| Adaptive | Gaming, trading platforms, machine learning | Real-time server performance |

Each of these algorithms supports specific needs, helping maintain scalability, reliability, and efficient performance across servers in different environments. 🎉


More About Least Connections

Question: In Least Connections, how is it good for video streaming sites?

If Server A has 5 active connections, that means 5 people are sending requests to Server A, right? Server B has 3 and Server C has 2, but what if all 5 of Server A's users are streaming at 360p while Server C's 2 users are streaming at 1080p?

Let's clarify this: in Least Connections, the number of active connections is the main factor considered, not the intensity of each connection's load. But you’re absolutely right—this can be a limitation for video streaming if we don't account for the quality or bitrate each user is streaming.

Here’s how Least Connections can help and where it might fall short in streaming applications:

How Least Connections Helps in Video Streaming:

  • In a basic scenario, Least Connections is effective because it balances users by distributing them to servers with fewer current users. This prevents overloading any single server.
  • If all users are streaming at about the same bitrate or quality (e.g., all on 480p), this works well, as each server gets a fairly even load.

Challenge with High-Quality Video Streams:

  • As you pointed out, a 1080p stream consumes more resources (CPU, memory, bandwidth) than a 360p or 480p stream.
  • Least Connections doesn’t account for the content of each request; it just sees the number of connections. So if Server C has two users both streaming at 1080p, its resources might be maxed out even if Server A, with five 360p users, still has capacity.

Solutions to This Challenge:

  1. Least Load (Resource-Aware) Balancing:

    • This is an enhanced version of Least Connections where the load balancer checks not only the number of active connections but also CPU and memory usage.
    • If Server C’s two users are consuming more resources (due to 1080p), the load balancer would be able to direct new users elsewhere, balancing based on server capacity rather than just connection count.
  2. Adaptive Algorithms:

    • Many streaming services use adaptive load balancers that monitor server health in real-time, adjusting traffic dynamically based on current CPU, memory, and bandwidth usage.
    • These load balancers would identify that Server C’s resource usage is high and route new users to a less-loaded server.
  3. Bitrate-Aware Load Balancing:

    • In some advanced streaming setups, the load balancer even assesses the bitrate or quality of each video stream. This way, the load balancer knows that two 1080p streams use more resources than five 360p streams and can adjust accordingly.

Example Scenario with Resource-Aware Load Balancer:

  • Suppose Server A has five 360p streams, Server B has three 480p streams, and Server C has two 1080p streams.
  • Instead of simply counting connections, the load balancer observes resource metrics: Server A’s load might still be light despite five connections, while Server C is near capacity.
  • New users would then be directed to Server A or B, depending on which has more available resources.
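Here is a minimal sketch of that resource-aware decision. The connection counts mirror the scenario above; the CPU numbers, the max_connections normalizer, and the 50/50 blend are made-up assumptions, included only to show the shape of the logic:

```python
# Hypothetical snapshot: connection counts plus a measured CPU load per server.
servers = {
    "Server A": {"connections": 5, "cpu": 0.30},  # five 360p streams, light load
    "Server B": {"connections": 3, "cpu": 0.45},  # three 480p streams
    "Server C": {"connections": 2, "cpu": 0.90},  # two 1080p streams, near capacity
}

def pick_resource_aware(state: dict, max_connections: int = 20) -> str:
    # Blend normalized connection count with CPU usage; lowest combined load wins.
    load = lambda s: 0.5 * (state[s]["connections"] / max_connections) + 0.5 * state[s]["cpu"]
    return min(state, key=load)

print(pick_resource_aware(servers))  # Server A, despite having the most connections
```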

In short, Least Connections is a good start for basic load balancing but can be limiting for resource-heavy services like video streaming at variable bitrates. That’s why adaptive, resource-aware algorithms are often better for streaming platforms—they balance the actual load rather than just the connection count! 🚀


Examples of Weighted Least Connections

Let’s break down Weighted Least Connections with a few examples to show how it works in load balancing for applications where servers have different capacities.


What is Weighted Least Connections?

Weighted Least Connections is an extension of the Least Connections algorithm, where each server is assigned a weight based on its capacity. Servers with higher weights are more powerful (able to handle more connections), while servers with lower weights are less powerful.

The load balancer considers both the number of active connections and the weight of each server to decide where to send a new connection. This way, powerful servers take on more connections, while smaller servers handle fewer.

Example 1: Video Streaming Service

Imagine a video streaming service with three servers:

  1. Server A: High capacity, weight of 3
  2. Server B: Medium capacity, weight of 2
  3. Server C: Low capacity, weight of 1

Current Load:

  • Server A has 6 active connections
  • Server B has 4 active connections
  • Server C has 2 active connections

How Weighted Least Connections Distributes the Next Request:

  • The load balancer calculates effective connections by dividing the active connections by each server’s weight:

    • Server A: 6 connections / weight 3 = 2 effective connections
    • Server B: 4 connections / weight 2 = 2 effective connections
    • Server C: 2 connections / weight 1 = 2 effective connections
  • Here, all servers appear equally loaded based on effective connections.

  • Now suppose a new connection comes in. Since the effective-connection ratios are tied, the algorithm will typically choose Server A or B (the higher-capacity servers) to avoid overwhelming Server C, which has the lowest weight and capacity.


Example 2: E-commerce Site with Variable Traffic

Now imagine an e-commerce website during a sale event, with three servers:

  1. Server X: Very high capacity, weight 4 (like a high-performance server)
  2. Server Y: Moderate capacity, weight 2
  3. Server Z: Low capacity, weight 1

Current Load:

  • Server X has 12 active connections
  • Server Y has 5 active connections
  • Server Z has 3 active connections

Effective Connections Calculation:

  • Server X: 12 connections / weight 4 = 3 effective connections
  • Server Y: 5 connections / weight 2 = 2.5 effective connections
  • Server Z: 3 connections / weight 1 = 3 effective connections

In this case:

  • Server Y has the fewest effective connections, so it will receive the next incoming connection.
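The same calculation takes only a few lines of Python, using the numbers from Example 2:

```python
# Snapshot from Example 2: weight and active connections per server.
servers = {
    "Server X": {"weight": 4, "connections": 12},
    "Server Y": {"weight": 2, "connections": 5},
    "Server Z": {"weight": 1, "connections": 3},
}

# Effective connections = active connections / weight.
effective = {name: s["connections"] / s["weight"] for name, s in servers.items()}
print(effective)                          # {'Server X': 3.0, 'Server Y': 2.5, 'Server Z': 3.0}
print(min(effective, key=effective.get))  # Server Y receives the next connection
```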

Why Weighted Least Connections is Effective

This method is beneficial because it considers both the number of connections and the server’s capacity, making it ideal when:

  • Different servers have varying performance capabilities.
  • Applications (like e-commerce or streaming) need to balance load proportionally to server power.

Summary Table

| Server | Weight | Active Connections | Effective Connections (Active / Weight) |
| --- | --- | --- | --- |
| Server A | 3 | 6 | 2 |
| Server B | 2 | 4 | 2 |
| Server C | 1 | 2 | 2 |

In scenarios with complex, dynamic workloads and varying server capacities, Weighted Least Connections provides an effective way to keep everything running smoothly. 🚀
