
Discussion on: Explaining Load Balancers

Josh Cheek

Hi, thanks for the post! Some Qs, if you don't mind:

If all traffic goes through the load balancer, then can it, itself, get overloaded? Does the load balancer have to scale vertically? What does Google do, for example?

The session thing confused me; I thought the session cookie was based on the host, and I figured all the nodes would have the same host. If session data is then looked up, then I assume it would be on a shared resource. If the request needs to be handled by the same machine, then it seems like that would prohibit swapping out the nodes, e.g., for upgrades or w/e.

Does the load balancer add latency?

Does the load balancer look like any other server externally, or is it sitting somehow outside the normal request cycle? E.g., it feels like maybe it operates at the DNS level.

Is the load balancer a single point of failure?

Slavius • Edited

Hi,

Is the load balancer a single point of failure?

If implemented incorrectly, yes, it can become a SPOF. Please read on to find out how to mitigate this problem.

If all traffic goes through the load balancer, then can it, itself, get overloaded? Does the load balancer have to scale vertically? What does Google do, for example?

Of course it can. However, compared to an application server that runs general-purpose application code, a load balancer has a limited feature set, knows the domain of its responsibilities almost completely, and for this purpose contains acceleration chips to help with individual tasks (network processing, SSL/TLS encryption, data compression).

The session thing confused me; I thought the session cookie was based on the host ...

When you visit a web application's login form, you already get a cookie even though you're not yet authenticated. This is so the application server can match your request with your next form submission, e.g. to match a server-side generated captcha with your response in the next POST request. Session cookies are application generated, and in some cases handled by the application server itself (Tomcat, for example, has a context.xml configuration file dedicated to session context configuration).
If you'd like all nodes to be session aware, you have to back your sessions with a shared store - like a database. This, however, has proved to be very difficult to implement without further problems: it introduces additional latency (every single request sends a cookie, even requests for static resources) and it consumes the DB server's resources. It can also overload the DB server, creating a DoS situation - imagine a botnet creating millions of new sessions by sending requests as small as a few bytes, which is usually:

GET /login HTTP/1.1
Host: servername.tld

Btw, you can try this on your own with telnet. Fire up telnet to a web server's port 80 and type this in, followed by 2 empty lines. Note: the Host: line is not required if the server serves only one site per IP!
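
To illustrate the cookie part with a concrete (made-up) exchange: the server's first response sets the session cookie, and the browser then sends it back with every subsequent request:

HTTP/1.1 200 OK
Set-Cookie: JSESSIONID=8A3F0C2E51B7D4A9; Path=/; HttpOnly
Content-Type: text/html

...and the follow-up request:

GET /login HTTP/1.1
Host: servername.tld
Cookie: JSESSIONID=8A3F0C2E51B7D4A9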

If the request needs to be handled by the same machine, then it seems like that would prohibit swapping out the nodes, e.g., for upgrades or w/e.

To do maintenance, you set the state of the required nodes on the load balancer to inactive (not accepting new requests), wait until the count of active sessions on those nodes drops to zero, and then you're free to do your maintenance. After you're done, you simply re-enable them on the load balancer. Of course, if a node suddenly dies, it will take a while for the load balancer to realize this and disable new connections to it, and all of its active sessions will be handled by a different node, requiring clients to re-login (if the session is not kept in a database).
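
As a concrete sketch of draining, assuming an HAProxy balancer with its runtime admin socket enabled at /var/run/haproxy.sock, a backend named app and a node named web1 (all names made up):

echo "set server app/web1 state drain" | socat stdio /var/run/haproxy.sock
echo "show servers state app" | socat stdio /var/run/haproxy.sock
echo "set server app/web1 state ready" | socat stdio /var/run/haproxy.sock

The first command stops new sessions from reaching the node, the second lets you watch the active session count drop, and the third puts the node back into rotation after maintenance.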

Does the load balancer add latency?

Yes and no. It adds latency as an additional point of processing; however, if it manages to offload tasks from the application servers on the backend nodes, serves some items from cache, accelerates SSL/TLS session initialization, or simply sends a request to the node that will process it quickest because it has the lowest load, then in the end the load balancer improves overall latency - given you have done everything right. A misconfigured load balancer usually does the opposite.

Does the load balancer look like any other server externally, or is it sitting somehow outside the normal request cycle? E.g., it feels like maybe it operates at the DNS level.

A load balancer usually sits in the perimeter of the application servers, in its own isolated network (DMZ), separated from them by a firewall.
It can also be placed in the same network as all the nodes, but it should then use a dedicated network card to communicate with them (for performance reasons, one NIC is used for data from outside and the other to communicate with the nodes).
A load balancer can work on multiple ISO/OSI layers. The simplest is TCP/IP (layer 4), where it has access to IP + port + connection state information only. This is a very dumb kind of balancing, as you don't understand the higher-level protocols and your only way of finding out whether a node is up is to do a TCP three-way handshake. It is mostly used with the SMTP/POP/IMAP protocols, but it is very fast. It is often implemented in haproxy.
Then you may balance at the HTTP/HTTPS level (layer 7). Here you understand what's going on, and if you also terminate the HTTPS, you can read the contents of individual streams. This allows you to compress responses or send cached items. It also allows you to do routing, rate limiting, and session awareness.
These are often called (web application) proxies rather than load balancers.
Examples may be: Nginx, IIS with ARR, Apache. A minimal sketch of an HTTP-level balancer follows below.
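
To make the HTTP-level case concrete, here is a minimal round-robin layer-7 balancer sketched in Go (the backend addresses are made up, and a real deployment would add health checks, timeouts and so on):

package main

import (
	"log"
	"net/http"
	"net/http/httputil"
	"net/url"
	"sync/atomic"
)

func mustParse(raw string) *url.URL {
	u, err := url.Parse(raw)
	if err != nil {
		log.Fatal(err)
	}
	return u
}

func main() {
	// Hypothetical backend nodes.
	backends := []*url.URL{
		mustParse("http://10.0.0.11:8080"),
		mustParse("http://10.0.0.12:8080"),
	}
	var counter uint64
	proxy := &httputil.ReverseProxy{
		// Rewrite each incoming request to target the next backend
		// in round-robin order.
		Director: func(r *http.Request) {
			b := backends[atomic.AddUint64(&counter, 1)%uint64(len(backends))]
			r.URL.Scheme = b.Scheme
			r.URL.Host = b.Host
		},
	}
	log.Fatal(http.ListenAndServe(":8080", proxy))
}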

Is the load balancer a single point of failure?

Sure it is when operated alone. A usual highly available setup includes 2 or more load balancers running in a cluster in either active/active or active/passive configuration. To further increase availability, you can have 2 different Internet Service Providers (or geo-distributed datacenters), each running a pair of clustered load balancers. Then you configure a DNS A record resolving to 2 distinct public IP addresses, which gives you round-robin resolution, splitting DNS requests roughly evenly (CloudFlare is very fast and reliable at this). There's also the possibility of returning the IP address of the datacenter closest to the client's geo location by using something like PowerDNS dnsdist.
This is what big players do to make their services highly available.
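
As a sketch, the two A records in a BIND-style zone file could look like this (names and addresses are made up):

www.example.com.  300  IN  A  198.51.100.10
www.example.com.  300  IN  A  203.0.113.20

Resolvers then rotate the order of the returned addresses between queries, which spreads clients across the two datacenters.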

Josh Cheek

Of course it can. However, compared to an application server that runs general-purpose application code, a load balancer has a limited feature set, knows the domain of its responsibilities almost completely, and for this purpose contains acceleration chips to help with individual tasks (network processing, SSL/TLS encryption, data compression).

Nice.

To do maintenance, you set the state of the required nodes on the load balancer to inactive (not accepting new requests), wait until the count of active sessions on those nodes drops to zero, and then you're free to do your maintenance.

I guess it feels like it's at odds with the bullet point that begins "keeps track of sessions"

Then you configure a DNS A record resolving to 2 distinct public IP addresses, which gives you round-robin resolution, splitting DNS requests roughly evenly (CloudFlare is very fast and reliable at this). There's also the possibility of returning the IP address of the datacenter closest to the client's geo location by using something like PowerDNS dnsdist.
This is what big players do to make their services highly available.

Ahh, nice, that's what I was missing!


Follow-up Q: Does the load balancer somehow pass the socket on to the node it's chosen to handle the request (some IO syscall, presumably), or does it return a redirect to tell the client which node to talk to?

Slavius

Q: Does the load balancer somehow pass the socket on to the node it's chosen to handle the request (some IO syscall, presumably), or does it return a redirect to tell the client which node to talk to?

The load balancer establishes a full session towards the client and, at the same time, a session towards the node. So basically it has to maintain 2 sockets for each connection. It has to when it wants to alter the connection, e.g. to handle SSL/TLS towards the client and HTTP towards the nodes, or HTTP/2 towards clients and HTTP/1.1 towards nodes, etc.
For this reason, a load balancer can return HTTP 502 or 504 error codes to the client when a node does not respond within a preconfigured interval, or it can just show a custom error page ("Sorry for the inconvenience, try again later").
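
Continuing the Go sketch from earlier, this is roughly how a proxy answers on behalf of a dead node (the message text is just an example):

// Called when the backend is unreachable or times out; instead of
// leaving the client hanging, respond with a 502 ourselves.
proxy.ErrorHandler = func(w http.ResponseWriter, r *http.Request, err error) {
	w.WriteHeader(http.StatusBadGateway)
	w.Write([]byte("Sorry for the inconvenience, try again later."))
}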

Nawinkmr

Hi Slavius,
Nice explanation, and of course re-explanation. I am a bit confused about how it forms an HTTP request to the nodes. In this case, I assume that the load balancer receives the HTTPS request from the client, terminates the SSL/TLS, and then sends the HTTP request to port 80. In this HTTP packet, what does it send as the source IP and port to the node(s)? Does it propagate the IP + port of the client to the nodes, or hide them at its own level?
If it hides them, is there any way to let the nodes know the identity of the original requester?
~Nawin

Slavius

Hi Nawinkmr,

There is no official HTTP protocol extension for sending this information to the nodes; however, a very common way is to add extra HTTP headers like X-Forwarded-For, X-Forwarded-Host, X-Forwarded-Proto, X-Real-IP and X-Client-IP, as this information is very often vital on the nodes. The nodes then have to understand these headers at the application level. More in the Nginx resources here: nginx.com/resources/wiki/start/top...
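
So a request arriving at a node might look like this (addresses are made up; at the TCP level the node sees the balancer's IP as the source, and only these headers carry the original client):

GET /login HTTP/1.1
Host: servername.tld
X-Forwarded-For: 198.51.100.23
X-Forwarded-Proto: https
X-Real-IP: 198.51.100.23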