Samuel Ajisafe

High Performance Load Balancing Using NGINX

In this article, we will look at how to configure NGINX for load balancing, along with some of the health checks that NGINX provides. I assume you already have some knowledge of NGINX and its basic configuration.

Today, internet users demand high performance and high availability from the applications they use. To achieve this, multiple server instances can be spun up and the traffic, or load, distributed among them. Simply put, this is what we call load balancing, and the architectural technique is also known as horizontal scaling. A properly load-balanced architecture goes a long way toward a positive end-user experience.

But wait ✋ What if it's stateful? 🤔
Most modern applications follow a stateless approach, storing state in shared memory or a database. But that is not true of all applications. For some types of applications, session state is genuinely useful and plays a crucial role in performance. The state of these stateful applications is often stored locally for a number of reasons. Suppose an application receives a huge load of traffic and has to access shared state over the network: the network overhead quickly adds up, hurting both the application's performance and the user experience 😫. When a user's state is saved locally on an application server, it is critical for the user experience that subsequent requests are sent to the same server. Another aspect of this stateful architecture is that servers should not be released until the user's session is over. NGINX is an intelligent load balancer that offers multiple ways to balance the load in stateful applications, for example by tracking cookies or routing.
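
As an illustration, here is a minimal sketch of cookie-based session persistence using the sticky cookie directive (available in NGINX Plus only). The upstream name, cookie name, and expiry here are placeholder assumptions:

upstream loadbalancer {
    # NGINX Plus only: issue a cookie named "srv_id" so that subsequent
    # requests from the same client are routed to the same server
    sticky cookie srv_id expires=1h path=/;
    server app1.example.com:8080;
    server app2.example.com:8080;
}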

Upstream

upstream loadbalancer {
    server app1.example.com:8080;
    server app2.example.com:8080;
    server 172.31.58.179:8080;
}

Simply put, an upstream is a pool of server instances. NGINX uses this pool to distribute the load among those servers. In this example, we have defined an upstream called loadbalancer, and inside it a server pool consisting of three servers. Each upstream destination is defined in the pool using the server directive. These destinations can be IP addresses, DNS records, Unix sockets, or even a mix of all of them.
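
For instance, a single pool can mix all three destination types. A sketch, where the hostnames and socket path are placeholders:

upstream loadbalancer {
    server app1.example.com:8080;      # DNS record
    server 172.31.58.179:8080;         # IP address
    server unix:/var/run/app3.sock;    # Unix domain socket
}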

NGINX Open Source and NGINX Plus support a number of load-balancing algorithms, including Round Robin, Least Connections, Generic Hash, IP Hash, and Random; Least Time is available in NGINX Plus only.

Round Robin

upstream loadbalancer {
    server app1.example.com;
    server app2.example.com;
    server app3.example.com;
}

server {
    listen 80;
    location / {
        proxy_pass http://loadbalancer;
    }
}

Round Robin is the default algorithm and the simplest of them all. As you can see, we have defined our upstream at the same level as the server context. The server listens on port 80 and forwards all requests to our upstream called loadbalancer. By default, each new client request is proxied to the next server destination in a rotating, sequential manner, so traffic is distributed equally among all three servers in the upstream.

We can also use some optional parameters to take more control over request routing.

upstream loadbalancer {
    server app1.example.com weight=1;
    server app2.example.com weight=2;
    server app3.example.com backup;
}

Here, with the weight parameter, we have instructed NGINX to forward twice as much traffic to the second server as to the first: roughly one third of requests go to app1 and two thirds to app2. The default value of the weight parameter is 1, in which case traffic is distributed equally among all the servers in the upstream. In Round Robin, the greater the weight, the more the server is favored. 😃

We have configured the third server as our backup destination using the backup parameter. The backup (third) server will only receive traffic when the two primary servers are unavailable.
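
Another server parameter in the same family is down, which takes a server out of rotation entirely, for example during maintenance. A brief sketch:

upstream loadbalancer {
    server app1.example.com weight=1;
    server app2.example.com weight=2;
    server app3.example.com down;    # temporarily removed from rotation
}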

Least Connections

upstream loadbalancer {
    least_conn;
    server app1.example.com;
    server app2.example.com;
    server app3.example.com;
}

For every algorithm except Round Robin, we need to specify the algorithm's directive inside the upstream context; Round Robin needs no directive because it is the default. Here, the directive name is least_conn. With this algorithm, client requests are proxied to the server with the least number of active connections. The weight parameter can be taken into consideration with this algorithm as well, as shown in the sketch below.
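
A sketch combining least_conn with weights (the weights here are illustrative):

upstream loadbalancer {
    least_conn;
    server app1.example.com weight=1;
    # NGINX picks the server with the fewest active connections
    # relative to its weight, so app2 is favored overall
    server app2.example.com weight=2;
    server app3.example.com weight=1;
}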

Least Time

upstream loadbalancer {
    least_time header;
    server app1.example.com;
    server app2.example.com;
    server app3.example.com;
}

This is one of the most sophisticated load-balancing algorithms, and it is only available in NGINX Plus. The directive name is least_time. With this algorithm, NGINX Plus selects the server from the pool with the lowest average response time and the least number of active connections. The response time can be measured in two ways, using the parameters header and last_byte; one of them must be specified after the directive name. When header is specified, the time to receive the first byte of the response is used. When last_byte is specified, the time to receive the full response is used.

Generic Hash

upstream loadbalancer {
    hash $request_uri consistent;
    server app1.example.com;
    server app2.example.com;
    server app3.example.com;
}

Here, the directive name is hash, and it has an optional parameter called consistent. With this algorithm, a user-defined key determines the server to which a request is forwarded. The key can be literal text, runtime request variables, or a combination of both. In this example, we have used the request URI as our key. This approach comes in handy when we need more control over where requests are sent, or when we want requests to land on the upstream server most likely to have the data cached 😃. Requests are distributed among the upstream servers based on the hashed key value. But what if we add or remove a server from the pool? 🤔 In that case, hashed requests would normally be redistributed. The optional consistent parameter, which enables ketama consistent hashing, minimizes the effect of that redistribution.

IP Hash

upstream loadbalancer {
    ip_hash;
    server app1.example.com;
    server app2.example.com;
    server app3.example.com;
}

Only the HTTP protocol is supported by this algorithm, and its directive name is ip_hash. It differs slightly from Generic Hash in that, instead of a user-defined key, IP Hash uses the client's IP address to calculate the hash value: either the first three octets of an IPv4 address or the entire IPv6 address. This algorithm ensures that a specific client's requests are proxied to the same upstream server throughout the client's session, as long as that server is available. We can therefore apply this algorithm in stateful architectures where session state matters and is not stored in shared memory. The weight parameter can be used with the IP Hash algorithm as well.

Random

upstream loadbalancer {
    random two least_time=last_byte;
    server app1.example.com;
    server app2.example.com;
    server app3.example.com;
}

This algorithm picks a server from the upstream pool at random and forwards client requests to it. The directive name is random. It has an optional parameter called two, which instructs NGINX to randomly pick two servers from the pool and then balance the load between them according to the specified method. In this case that method is least_time (NGINX Plus only); the default is least_conn. The weight parameter can be used with this algorithm as well.

Health Checks
Upstream requests from the load balancer may fail for a variety of reasons: server failures, application failures, network connectivity problems, and so on. A load balancer should be able to detect such upstream failures and stop distributing client requests to those servers; otherwise, the client may get nothing but a request timeout. For this reason, NGINX provides two types of health checks: active health checks (only available in NGINX Plus) and passive health checks.

Active health checks send requests to the upstream servers at a regular interval and use the responses to verify each server's status. With passive health checks, NGINX instead monitors the responses of the upstream servers as clients make requests or connections.
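
As a rough sketch, a passive health check is configured through the max_fails and fail_timeout server parameters, while an active health check uses the health_check directive (NGINX Plus only, and it requires the upstream to be kept in a shared memory zone). The interval, thresholds, and /health URI below are illustrative assumptions:

upstream loadbalancer {
    zone loadbalancer 64k;    # shared memory zone, required for health_check
    # Passive: after 3 failed attempts, mark the server unavailable for 30s
    server app1.example.com max_fails=3 fail_timeout=30s;
    server app2.example.com max_fails=3 fail_timeout=30s;
}

server {
    listen 80;
    location / {
        proxy_pass http://loadbalancer;
        # Active (NGINX Plus only): probe /health every 5s; 3 consecutive
        # failures mark a server unhealthy, 2 passes restore it
        health_check interval=5s fails=3 passes=2 uri=/health;
    }
}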

Now you might think that active health checks increase the load on the upstream servers, so it is better to use passive health checks. Or you might want to use active health checks to detect upstream failures early, before a user gets a failed response. In truth, both active and passive health checks are important in their own way. Active health checks generate predictive data that warns about potential failures in the system, while passive health checks use real-time performance data to show you what the end user actually sees. The best way to monitor the health of your upstream is to use a combination of both. I'm not going to dig deep into health checks in this article; if you are interested, you can learn more from the 👉 official documentation: https://docs.nginx.com/nginx/admin-guide/load-balancer/http-health-check/

Well, in this article we have taken a deep dive into some load-balancing techniques and their basic configurations. You can find more information on this topic in the 👉 NGINX official documentation https://docs.nginx.com/nginx/admin-guide/load-balancer/http-health-check/ as well. I hope you gained something valuable from this article, and thanks for reading to the end.

CHEERS!!!

Reference: https://medium.com/tech-it-out/high-performance-load-balancing-with-nginx-b3a31acd88a3
