<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Anjusha Felix</title>
    <description>The latest articles on DEV Community by Anjusha Felix (@buildwithanjusha).</description>
    <link>https://dev.to/buildwithanjusha</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3917509%2F03222939-91dc-4a9f-bd2c-117d80fd17d9.png</url>
      <title>DEV Community: Anjusha Felix</title>
      <link>https://dev.to/buildwithanjusha</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/buildwithanjusha"/>
    <language>en</language>
    <item>
      <title>Load Balancing Algorithms Explained 🚦⚖️</title>
      <dc:creator>Anjusha Felix</dc:creator>
      <pubDate>Thu, 07 May 2026 09:57:27 +0000</pubDate>
      <link>https://dev.to/buildwithanjusha/load-balancing-algorithms-explained-3okc</link>
      <guid>https://dev.to/buildwithanjusha/load-balancing-algorithms-explained-3okc</guid>
      <description>&lt;p&gt;If you’re building scalable systems, understanding load balancing is essential.&lt;/p&gt;

&lt;p&gt;A load balancer distributes incoming traffic across multiple servers to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;improve performance&lt;/li&gt;
&lt;li&gt;prevent server overload&lt;/li&gt;
&lt;li&gt;increase availability&lt;/li&gt;
&lt;li&gt;handle scaling efficiently&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Here are the most common load balancing algorithms every backend/system design engineer should know 👇&lt;/p&gt;

&lt;h2&gt;🔄 Round Robin Load Balancing&lt;/h2&gt;

&lt;p&gt;Round Robin is one of the simplest and most widely used load balancing algorithms.&lt;/p&gt;

&lt;p&gt;It distributes incoming requests &lt;strong&gt;sequentially&lt;/strong&gt; across servers in a circular order.&lt;/p&gt;

&lt;p&gt;Imagine you have 3 backend servers:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Server A
Server B
Server C
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A load balancer sits in front of them. When users send requests:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Request 1 → Server A
Request 2 → Server B
Request 3 → Server C
Request 4 → Server A
Request 5 → Server B
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The cycle repeats continuously. That’s why it’s called Round Robin.&lt;/p&gt;
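
&lt;p&gt;The rotation above can be sketched in a few lines of Python. This is only a minimal illustration: the helper name &lt;code&gt;round_robin&lt;/code&gt; is made up for this example, and a real load balancer would also need to handle servers joining or leaving the pool.&lt;/p&gt;

```python
from itertools import cycle

# Server names taken from the example above.
servers = ["Server A", "Server B", "Server C"]


def round_robin(pool):
    """Return an iterator that yields servers in a repeating circular order."""
    return cycle(pool)


picker = round_robin(servers)

# Assign the first five requests, one server per request.
assignments = [next(picker) for _ in range(5)]

for number, server in enumerate(assignments, start=1):
    print(f"Request {number} → {server}")
```

&lt;p&gt;After the third request the iterator wraps around to the first server again, which is the circular order described above.&lt;/p&gt;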

&lt;h2&gt;🔄 Weighted Round Robin Load Balancing&lt;/h2&gt;

&lt;p&gt;Weighted Round Robin is a variant of the Round Robin algorithm that distributes traffic according to the capacity of each server instead of sending an equal share of requests to every server.&lt;/p&gt;

&lt;p&gt;In simple Round Robin:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;A → B → C → repeat
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every server gets the same number of requests. &lt;/p&gt;

&lt;p&gt;But in real-world systems, servers are often NOT equal.&lt;/p&gt;

&lt;p&gt;Some servers have:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;more CPU cores&lt;/li&gt;
&lt;li&gt;more RAM&lt;/li&gt;
&lt;li&gt;better processing power&lt;/li&gt;
&lt;li&gt;higher network bandwidth&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So sending every server the same amount of traffic is inefficient: weaker servers get overloaded while stronger ones sit partly idle.&lt;/p&gt;

&lt;p&gt;That’s where Weighted Round Robin helps.&lt;/p&gt;

&lt;p&gt;Each server is assigned a weight based on its capability.&lt;/p&gt;

&lt;p&gt;Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Server A → Weight 5
Server B → Weight 3
Server C → Weight 1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This means that out of every 9 requests:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Server A receives 5 requests&lt;/li&gt;
&lt;li&gt;Server B receives 3 requests&lt;/li&gt;
&lt;li&gt;Server C receives 1 request&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Traffic distribution becomes proportional to server strength.&lt;/p&gt;
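
&lt;p&gt;One simple way to sketch this is to repeat each server in the rotation as many times as its weight. The function name &lt;code&gt;weighted_round_robin&lt;/code&gt; is hypothetical, and this naive version sends requests to the same server in bursts; production load balancers often interleave the picks more smoothly.&lt;/p&gt;

```python
from collections import Counter
from itertools import cycle, islice

# Weights taken from the example above.
weights = {"Server A": 5, "Server B": 3, "Server C": 1}


def weighted_round_robin(server_weights):
    """Cycle through a schedule in which each server appears `weight` times."""
    schedule = [
        server
        for server, weight in server_weights.items()
        for _ in range(weight)
    ]
    return cycle(schedule)


picker = weighted_round_robin(weights)

# Count how many of the first 18 requests (two full cycles) each server gets.
counts = Counter(islice(picker, 18))
print(counts)
```

&lt;p&gt;Over two full cycles each server receives exactly twice its weight, so the share of traffic stays proportional to the weights.&lt;/p&gt;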

&lt;h2&gt;🔗 Least Connections Load Balancing&lt;/h2&gt;

&lt;p&gt;Least Connections is a dynamic load balancing algorithm that sends incoming requests to the server with the fewest active connections.&lt;/p&gt;

&lt;p&gt;Unlike Round Robin or Weighted Round Robin, it does NOT distribute traffic in a fixed order.&lt;/p&gt;

&lt;p&gt;Instead, it continuously checks:&lt;/p&gt;

&lt;p&gt;Which server is currently least busy?&lt;/p&gt;

&lt;p&gt;Then routes the next request there.&lt;/p&gt;

&lt;p&gt;Suppose you have 3 servers:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Server A → 120 active connections
Server B → 40 active connections
Server C → 15 active connections
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The next request goes to:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;Server C&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Because it currently has the fewest active connections.&lt;/p&gt;

&lt;h4&gt;⚙️ How the Algorithm Works Internally&lt;/h4&gt;

&lt;p&gt;The load balancer maintains a real-time counter of active connections for each server.&lt;/p&gt;

&lt;p&gt;Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Server A = 80 connections
Server B = 22 connections
Server C = 5 connections
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Incoming request:&lt;/p&gt;

&lt;p&gt;→ Assign to Server C&lt;/p&gt;

&lt;p&gt;After assignment:&lt;/p&gt;

&lt;p&gt;Server C = 6 connections&lt;/p&gt;

&lt;p&gt;This process repeats for every new request, and a server’s counter goes back down whenever one of its connections closes.&lt;/p&gt;
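
&lt;p&gt;The counter logic can be sketched as a small class. The class and method names (&lt;code&gt;LeastConnectionsBalancer&lt;/code&gt;, &lt;code&gt;acquire&lt;/code&gt;, &lt;code&gt;release&lt;/code&gt;) are assumptions made for this illustration, and the sketch ignores concurrency, which a real load balancer has to deal with.&lt;/p&gt;

```python
class LeastConnectionsBalancer:
    """Track active connections per server and pick the least busy one."""

    def __init__(self, active_connections):
        # Copy the starting counters so the caller's dict is not modified.
        self.active = dict(active_connections)

    def acquire(self):
        """Pick the server with the fewest active connections and count the new one."""
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        """Call this when a connection to `server` closes."""
        self.active[server] -= 1


# Counters taken from the example above.
balancer = LeastConnectionsBalancer(
    {"Server A": 80, "Server B": 22, "Server C": 5}
)

chosen = balancer.acquire()
print(chosen, balancer.active[chosen])
```

&lt;p&gt;Calling &lt;code&gt;release&lt;/code&gt; when a connection ends keeps the counters accurate, so long-lived connections keep counting against a server until they actually close.&lt;/p&gt;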

&lt;h2&gt;🌐 IP Hash Load Balancing&lt;/h2&gt;

&lt;p&gt;IP Hash is a load balancing algorithm where the client’s IP address is used to decide which server will handle the request.&lt;/p&gt;

&lt;p&gt;Instead of distributing requests randomly or sequentially, the load balancer applies a hash function to the client IP address.&lt;/p&gt;

&lt;p&gt;Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Hash(192.168.1.10) → Server B
Hash(192.168.1.11) → Server A
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This means the same client is routed to the same server every time, as long as its IP address and the set of servers do not change.&lt;/p&gt;

&lt;p&gt;Suppose your infrastructure has:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Server A
Server B
Server C
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When User 1 sends a request:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;IP = 101.23.45.10&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;The load balancer calculates:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;Hash(IP) % Number of Servers&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Result:&lt;/p&gt;

&lt;p&gt;→ Server B&lt;/p&gt;

&lt;p&gt;Whenever the same user sends another request:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;login&lt;/li&gt;
&lt;li&gt;add to cart&lt;/li&gt;
&lt;li&gt;payment&lt;/li&gt;
&lt;li&gt;profile access&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;they continue reaching:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;Server B&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;This provides session persistence, often called sticky sessions.&lt;/p&gt;
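
&lt;p&gt;The &lt;code&gt;Hash(IP) % Number of Servers&lt;/code&gt; step might look like the sketch below. The choice of SHA-256 and the function name &lt;code&gt;pick_server&lt;/code&gt; are assumptions for illustration only. Which server a given IP lands on depends entirely on the hash function, so the result will not necessarily be Server B as in the walkthrough above.&lt;/p&gt;

```python
import hashlib

servers = ["Server A", "Server B", "Server C"]


def pick_server(client_ip, pool):
    """Map a client IP to a server using Hash(IP) % number of servers."""
    # A stable hash is used because Python's built-in hash() for strings
    # changes between interpreter runs.
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(pool)
    return pool[index]


# The same IP always maps to the same server while the pool is unchanged.
first = pick_server("101.23.45.10", servers)
second = pick_server("101.23.45.10", servers)
print(first, second)
```

&lt;p&gt;Because the result depends on the number of servers, adding or removing a server changes the mapping for many clients. That is the main trade-off of this simple modulo approach.&lt;/p&gt;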

&lt;h2&gt;⚡ Least Response Time Load Balancing&lt;/h2&gt;

&lt;p&gt;Least Response Time is an intelligent load balancing algorithm that routes incoming requests to the server responding the fastest.&lt;/p&gt;

&lt;p&gt;Unlike Round Robin, which distributes requests equally, Least Response Time continuously monitors server performance in real time.&lt;/p&gt;

&lt;p&gt;It checks:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;response latency&lt;/li&gt;
&lt;li&gt;server speed&lt;/li&gt;
&lt;li&gt;sometimes active connections&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Then forwards traffic to the best-performing server.&lt;/p&gt;

&lt;p&gt;Suppose you have 3 servers:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Server&lt;/th&gt;
&lt;th&gt;Average Response Time&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Server A&lt;/td&gt;
&lt;td&gt;250ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Server B&lt;/td&gt;
&lt;td&gt;90ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Server C&lt;/td&gt;
&lt;td&gt;40ms&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The next request goes to:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;Server C&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;because it is responding the fastest.&lt;/p&gt;
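
&lt;p&gt;A rough sketch of this selection is shown below. It assumes the load balancer keeps a short window of recent response times per server; the class name, the window size and the rule for servers with no samples are all illustrative choices, not a description of any specific product.&lt;/p&gt;

```python
from collections import deque
from statistics import mean


class LeastResponseTimeBalancer:
    """Pick the server with the lowest average of recent response times."""

    def __init__(self, servers, window=10):
        # Keep only the most recent `window` samples for each server.
        self.samples = {server: deque(maxlen=window) for server in servers}

    def record(self, server, millis):
        """Store one observed response time in milliseconds."""
        self.samples[server].append(millis)

    def average(self, server):
        history = self.samples[server]
        # Assumption: a server with no samples yet counts as 0 ms,
        # so it is tried first and starts collecting measurements.
        return mean(history) if history else 0.0

    def pick(self):
        return min(self.samples, key=self.average)


balancer = LeastResponseTimeBalancer(["Server A", "Server B", "Server C"])

# Averages taken from the table above.
balancer.record("Server A", 250)
balancer.record("Server B", 90)
balancer.record("Server C", 40)

fastest = balancer.pick()
print(fastest)
```

&lt;p&gt;Using a sliding window means a server that slows down loses traffic after a few slow responses and gets it back once it recovers.&lt;/p&gt;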

&lt;h2&gt;🧠 Adaptive Load Balancing&lt;/h2&gt;

&lt;p&gt;Adaptive Load Balancing is an advanced and intelligent load balancing technique where traffic distribution changes dynamically based on real-time server conditions.&lt;/p&gt;

&lt;p&gt;Unlike traditional algorithms such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Round Robin&lt;/li&gt;
&lt;li&gt;Weighted Round Robin&lt;/li&gt;
&lt;li&gt;Least Connections&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Adaptive Load Balancing continuously monitors several health and performance signals across the infrastructure, rather than a fixed order or a single metric, before routing requests.&lt;/p&gt;

&lt;p&gt;It considers factors like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;CPU usage&lt;/li&gt;
&lt;li&gt;memory consumption&lt;/li&gt;
&lt;li&gt;response latency&lt;/li&gt;
&lt;li&gt;active connections&lt;/li&gt;
&lt;li&gt;server health&lt;/li&gt;
&lt;li&gt;network traffic&lt;/li&gt;
&lt;li&gt;request failure rate&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Then automatically decides:&lt;/p&gt;

&lt;p&gt;Which server can currently handle traffic most efficiently?&lt;/p&gt;

&lt;p&gt;Suppose you have 3 servers:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Server&lt;/th&gt;
&lt;th&gt;CPU Usage&lt;/th&gt;
&lt;th&gt;Response Time&lt;/th&gt;
&lt;th&gt;Health&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Server A&lt;/td&gt;
&lt;td&gt;90%&lt;/td&gt;
&lt;td&gt;300ms&lt;/td&gt;
&lt;td&gt;Healthy&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Server B&lt;/td&gt;
&lt;td&gt;45%&lt;/td&gt;
&lt;td&gt;70ms&lt;/td&gt;
&lt;td&gt;Healthy&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Server C&lt;/td&gt;
&lt;td&gt;20%&lt;/td&gt;
&lt;td&gt;40ms&lt;/td&gt;
&lt;td&gt;Healthy&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Even if Server A is very powerful, it is currently overloaded.&lt;/p&gt;

&lt;p&gt;An adaptive load balancer automatically routes new traffic to:&lt;/p&gt;

&lt;p&gt;Server C&lt;/p&gt;

&lt;p&gt;because it has:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the lowest CPU usage&lt;/li&gt;
&lt;li&gt;the fastest response time&lt;/li&gt;
&lt;li&gt;a healthy status&lt;/li&gt;
&lt;/ul&gt;
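
&lt;p&gt;One possible way to combine these signals is a weighted score, sketched below. Everything about the scoring is an assumption made for illustration: the equal weights, the 500 ms normalisation cap, and the names &lt;code&gt;ServerStats&lt;/code&gt;, &lt;code&gt;score&lt;/code&gt; and &lt;code&gt;pick&lt;/code&gt;. Real adaptive balancers use their own, usually more elaborate, formulas.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass
class ServerStats:
    name: str
    cpu_percent: float
    response_ms: float
    healthy: bool


def score(stats, cpu_weight=0.5, latency_weight=0.5, max_latency_ms=500.0):
    """Return a load score between 0 and 1; lower is better.

    The weights and the latency cap are illustrative, not standard values.
    """
    cpu_part = stats.cpu_percent / 100.0
    latency_part = min(stats.response_ms / max_latency_ms, 1.0)
    return cpu_weight * cpu_part + latency_weight * latency_part


def pick(pool):
    """Choose the healthy server with the lowest score."""
    candidates = [stats for stats in pool if stats.healthy]
    if not candidates:
        raise RuntimeError("no healthy servers available")
    return min(candidates, key=score)


# Values taken from the table above.
pool = [
    ServerStats("Server A", 90, 300, True),
    ServerStats("Server B", 45, 70, True),
    ServerStats("Server C", 20, 40, True),
]

best = pick(pool)
print(best.name)
```

&lt;p&gt;Unhealthy servers are filtered out before scoring, so if the best server fails its health check, traffic moves to the next best one.&lt;/p&gt;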

</description>
      <category>systemdesign</category>
      <category>architecture</category>
      <category>learning</category>
      <category>microservices</category>
    </item>
  </channel>
</rss>
