<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: JosephAkayesi</title>
    <description>The latest articles on DEV Community by JosephAkayesi (@josephakayesi).</description>
    <link>https://dev.to/josephakayesi</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F135512%2F9f1a5b8a-5601-4012-8690-bb7435e51a43.jpeg</url>
      <title>DEV Community: JosephAkayesi</title>
      <link>https://dev.to/josephakayesi</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/josephakayesi"/>
    <language>en</language>
    <item>
      <title>Design Consistent Hashing</title>
      <dc:creator>JosephAkayesi</dc:creator>
      <pubDate>Wed, 18 Mar 2026 22:09:44 +0000</pubDate>
      <link>https://dev.to/josephakayesi/design-consistent-hashing-3ap4</link>
      <guid>https://dev.to/josephakayesi/design-consistent-hashing-3ap4</guid>
      <description>&lt;h1&gt;
  
  
  Consistent Hashing — System Design Deep Dive
&lt;/h1&gt;

&lt;p&gt;A &lt;strong&gt;consistent hashing&lt;/strong&gt; algorithm is a technique that allows distributed systems to evenly distribute requests and data across a cluster of servers.&lt;/p&gt;

&lt;p&gt;At scale, consistent hashing becomes a fundamental building block of reliable, horizontally scalable systems.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why Do We Need Consistent Hashing?
&lt;/h2&gt;

&lt;p&gt;Consistent hashing provides several important benefits:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;✅ Minimizes data redistribution when servers join or leave&lt;/li&gt;
&lt;li&gt;✅ Promotes even distribution of load across servers&lt;/li&gt;
&lt;li&gt;✅ Protects systems from cascading failures during topology changes&lt;/li&gt;
&lt;li&gt;✅ Reduces unnecessary cache misses at scale&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Real-World Examples
&lt;/h2&gt;

&lt;p&gt;Consistent hashing appears everywhere in modern distributed systems:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Amazon DynamoDB&lt;/strong&gt; uses it to partition and replicate data&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Apache Cassandra&lt;/strong&gt; uses it to distribute data across nodes&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Akamai CDN&lt;/strong&gt; uses it to route requests to edge servers&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In reality, the exact implementation depends entirely on your system's access patterns and infrastructure requirements.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Rehashing Problem
&lt;/h2&gt;

&lt;p&gt;Before understanding consistent hashing, we need to understand the problem it solves.&lt;/p&gt;

&lt;p&gt;Suppose there are &lt;code&gt;n&lt;/code&gt; servers in your cluster. The server index for any key is computed as:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;server_idx&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;hash&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;

&lt;span class="nb"&gt;hash&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Hash&lt;/span&gt; &lt;span class="n"&gt;function&lt;/span&gt; &lt;span class="n"&gt;to&lt;/span&gt; &lt;span class="nb"&gt;hash&lt;/span&gt; &lt;span class="nb"&gt;all&lt;/span&gt; &lt;span class="n"&gt;keys&lt;/span&gt;
&lt;span class="n"&gt;n&lt;/span&gt;    &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Size&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;your&lt;/span&gt; &lt;span class="n"&gt;server&lt;/span&gt; &lt;span class="n"&gt;pool&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpu8rxzjcv5hen4tvakz1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpu8rxzjcv5hen4tvakz1.png" alt=" " width="683" height="472"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;To find which server stores a key in a pool of four servers, we compute:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;hash(key) % 4
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The output is the server on which that key is stored.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5l9nz387weg9nnevy6nd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5l9nz387weg9nnevy6nd.png" alt=" " width="681" height="344"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This approach works fine when the server pool is fixed and data is distributed evenly.&lt;/p&gt;

&lt;h3&gt;
  
  
  Problem: Server Changes
&lt;/h3&gt;

&lt;p&gt;When a server is added or removed, nearly all keys must be remapped.&lt;/p&gt;

&lt;p&gt;Below is how server indexes are affected when we remove a server from the pool:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2fjkdq8d0582pxs77ea6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2fjkdq8d0582pxs77ea6.png" alt=" " width="681" height="447"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The new distribution of keys after the server is removed:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fonk0u8jdycsa9ny1iwy8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fonk0u8jdycsa9ny1iwy8.png" alt=" " width="676" height="388"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Most keys are redistributed — not only those that were previously stored on the server that went offline.&lt;/p&gt;
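&lt;p&gt;The remapping cost is easy to demonstrate. Below is a minimal sketch that counts how many of 1,000 hypothetical keys change servers when the pool shrinks from four servers to three under modulo hashing (the key names and the choice of MD5 are illustrative):&lt;/p&gt;

```python
# Sketch: measure how many keys move when the pool shrinks from 4 to 3
# servers under modulo hashing. MD5 is used here only as a stable,
# reproducible hash; a real system might use MurmurHash or similar.
import hashlib

def server_idx(key, n):
    # Hash the key, then take the index modulo the pool size.
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % n

keys = [f"key{i}" for i in range(1000)]
before = {k: server_idx(k, 4) for k in keys}
after = {k: server_idx(k, 3) for k in keys}

moved = sum(1 for k in keys if before[k] != after[k])
print(f"{moved} of {len(keys)} keys remapped")  # typically ~75%
```

&lt;p&gt;Roughly three quarters of all keys land on a different server, even though only one server left the pool.&lt;/p&gt;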




&lt;h2&gt;
  
  
  Core Consistent Hashing Concepts
&lt;/h2&gt;

&lt;p&gt;Most consistent hashing implementations are built on a few foundational ideas:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Hash Space and Hash Ring&lt;/li&gt;
&lt;li&gt;Hash Servers&lt;/li&gt;
&lt;li&gt;Hash Keys&lt;/li&gt;
&lt;li&gt;Server Lookup&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Let's walk through each one.&lt;/p&gt;




&lt;h2&gt;
  
  
  Hash Space and Hash Ring
&lt;/h2&gt;

&lt;p&gt;Imagine a hash function whose output ranges over the values &lt;code&gt;x0, x1, ..., xn&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;For example, if our hash function is &lt;code&gt;hash(key) % 100&lt;/code&gt;, the hash space runs from &lt;code&gt;0&lt;/code&gt; to &lt;code&gt;99&lt;/code&gt;. In real systems, the space is far larger — SHA-1 runs from &lt;code&gt;0&lt;/code&gt; to &lt;code&gt;2^160 - 1&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;We then connect the two ends of the hash space to form a &lt;strong&gt;hash ring&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuxet91lwzthki73cbz9y.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuxet91lwzthki73cbz9y.png" alt=" " width="684" height="451"&gt;&lt;/a&gt;&lt;/p&gt;
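&lt;p&gt;The size of a real hash space is easy to see in code. A minimal sketch, assuming SHA-1 and an illustrative key name:&lt;/p&gt;

```python
# Sketch: SHA-1 places any key somewhere in the space 0 to 2**160 - 1,
# which is the hash space we then bend into a ring.
import hashlib

def sha1_position(key):
    # Interpret the 160-bit digest as an integer position.
    return int(hashlib.sha1(key.encode()).hexdigest(), 16)

pos = sha1_position("server0")
print(pos % 2**160 == pos)  # True: the position fits in the 160-bit space
```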




&lt;h2&gt;
  
  
  Hash Servers
&lt;/h2&gt;

&lt;p&gt;We use our hashing function to map servers onto the ring using either the server name or IP address.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4qpg2g3qge47ynwgeoiz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4qpg2g3qge47ynwgeoiz.png" alt=" " width="693" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Some real-world systems use a different hash function for servers than for keys, though many simply reuse the same function for both.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Hash Keys
&lt;/h2&gt;

&lt;p&gt;Keys are hashed and placed onto the same ring.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkoks0nw7jnxopdusmcpn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkoks0nw7jnxopdusmcpn.png" alt=" " width="697" height="404"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Server Lookup
&lt;/h2&gt;

&lt;p&gt;To determine which server a key belongs to, we move &lt;strong&gt;clockwise&lt;/strong&gt; from the key's position on the ring until we hit the nearest server.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2vhcnhqbs1rf75kc9wac.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2vhcnhqbs1rf75kc9wac.png" alt=" " width="639" height="337"&gt;&lt;/a&gt;&lt;/p&gt;
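&lt;p&gt;The clockwise lookup above can be sketched with a sorted list and binary search (server names, the key, and the choice of MD5 are illustrative):&lt;/p&gt;

```python
# Sketch of clockwise lookup on a hash ring using binary search.
import bisect
import hashlib

def ring_hash(name):
    # Position on the ring; real systems often use SHA-1 or MurmurHash.
    return int(hashlib.md5(name.encode()).hexdigest(), 16)

servers = ["server0", "server1", "server2", "server3"]
ring = sorted((ring_hash(s), s) for s in servers)
positions = [pos for pos, _ in ring]

def lookup(key):
    # First server at or clockwise of the key's position;
    # wrap around to the start of the ring if we fall off the end.
    idx = bisect.bisect_left(positions, ring_hash(key))
    return ring[idx % len(ring)][1]

print(lookup("user:42"))  # one of the four servers
```

&lt;p&gt;Keeping the positions sorted makes each lookup an O(log n) binary search rather than a full scan of the ring.&lt;/p&gt;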




&lt;h2&gt;
  
  
  Adding and Removing Servers
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Add a Server
&lt;/h3&gt;

&lt;p&gt;When a new server is added, only a subset of keys needs to be redistributed: those whose clockwise walk now reaches the new server first.&lt;/p&gt;

&lt;p&gt;In the example below, only &lt;code&gt;key0&lt;/code&gt; was redistributed. All other keys remain intact.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi2m1p60zrtnhlmzecqi9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi2m1p60zrtnhlmzecqi9.png" alt=" " width="709" height="491"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Remove a Server
&lt;/h3&gt;

&lt;p&gt;When a server is removed, only the keys that were mapped to it are redistributed to the next server clockwise.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnrmg6oqwtkesgsiur9lm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnrmg6oqwtkesgsiur9lm.png" alt=" " width="743" height="515"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Two Issues in the Basic Approach
&lt;/h2&gt;

&lt;p&gt;The basic consistent hashing approach introduces two problems:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Uneven partition sizes&lt;/strong&gt; — When servers are added or removed, partitions can become imbalanced, leading to uneven load.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1eyvggl4y1y5j5rzrn99.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1eyvggl4y1y5j5rzrn99.png" alt=" " width="714" height="377"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Non-uniform key distribution&lt;/strong&gt; — Keys may cluster on certain servers, leaving others underutilized.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdbes7xfs02ran0fmps3e.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdbes7xfs02ran0fmps3e.png" alt=" " width="709" height="414"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;A technique called &lt;strong&gt;virtual nodes&lt;/strong&gt; is used to solve both of these problems.&lt;/p&gt;




&lt;h2&gt;
  
  
  Virtual Nodes
&lt;/h2&gt;

&lt;p&gt;Instead of mapping each server onto the ring once, we map it onto the ring at many positions; each of these positions is a &lt;strong&gt;virtual node&lt;/strong&gt;. Each server is then responsible for multiple partitions of the ring.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flyxi9zb10eyk6m25as6l.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flyxi9zb10eyk6m25as6l.png" alt=" " width="723" height="474"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;To find a key's server, we locate the nearest virtual node clockwise from the key's position.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2qssbwdgmebu6vtu69px.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2qssbwdgmebu6vtu69px.png" alt=" " width="717" height="427"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As the number of virtual nodes increases, key distribution becomes more balanced.&lt;/p&gt;
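&lt;p&gt;A minimal sketch of virtual nodes, extending the lookup idea above; the replica count, server names, and hash choice are illustrative:&lt;/p&gt;

```python
# Sketch: virtual nodes smooth out load. Each physical server is hashed
# onto the ring many times ("server0#0", "server0#1", ...).
import bisect
import hashlib
from collections import Counter

def ring_hash(name):
    return int(hashlib.md5(name.encode()).hexdigest(), 16)

VNODES = 100  # more virtual nodes yields a more even spread
servers = ["server0", "server1", "server2"]
ring = sorted(
    (ring_hash(f"{s}#{v}"), s) for s in servers for v in range(VNODES)
)
positions = [pos for pos, _ in ring]

def lookup(key):
    # Nearest virtual node clockwise of the key; map it back to its server.
    idx = bisect.bisect_left(positions, ring_hash(key))
    return ring[idx % len(ring)][1]

load = Counter(lookup(f"key{i}") for i in range(30000))
print(load)  # counts should be roughly equal across the three servers
```

&lt;p&gt;Raising &lt;code&gt;VNODES&lt;/code&gt; trades memory and lookup-table size for a tighter balance across servers.&lt;/p&gt;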




&lt;h2&gt;
  
  
  Finding Affected Keys
&lt;/h2&gt;

&lt;h3&gt;
  
  
  When a server is added:
&lt;/h3&gt;

&lt;p&gt;Move &lt;strong&gt;anticlockwise&lt;/strong&gt; from the new server's position to the nearest existing server. All keys in that range are redistributed onto the new server.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flk8xskqalq6hc8tr8jel.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flk8xskqalq6hc8tr8jel.png" alt=" " width="722" height="493"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  When a server is removed:
&lt;/h3&gt;

&lt;p&gt;The affected range starts at the removed server's position and moves &lt;strong&gt;anticlockwise&lt;/strong&gt; until the next server is found. Keys in that range are redistributed to the next server clockwise of the removed one.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff9ypono8lgjkr6eds0lv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff9ypono8lgjkr6eds0lv.png" alt=" " width="727" height="463"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Algorithm Comparison
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Algorithm&lt;/th&gt;
&lt;th&gt;Redistribution on Change&lt;/th&gt;
&lt;th&gt;Handles Hotspots&lt;/th&gt;
&lt;th&gt;Complexity&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Simple Modulo Hashing&lt;/td&gt;
&lt;td&gt;All keys remapped&lt;/td&gt;
&lt;td&gt;❌ No&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Basic Consistent Hashing&lt;/td&gt;
&lt;td&gt;Subset of keys&lt;/td&gt;
&lt;td&gt;⚠️ Partial&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Consistent Hashing + Virtual Nodes&lt;/td&gt;
&lt;td&gt;Subset of keys&lt;/td&gt;
&lt;td&gt;✅ Yes&lt;/td&gt;
&lt;td&gt;Medium-High&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Request Flow
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Client sends a request with a key&lt;/li&gt;
&lt;li&gt;Key is hashed onto the ring&lt;/li&gt;
&lt;li&gt;System moves clockwise to find the nearest server (or virtual node)&lt;/li&gt;
&lt;li&gt;Request is forwarded to that server&lt;/li&gt;
&lt;li&gt;On topology change, only affected keys are redistributed&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  Consistent Hashing in Distributed Systems
&lt;/h2&gt;

&lt;p&gt;Scaling introduces new challenges even with consistent hashing.&lt;/p&gt;

&lt;h3&gt;
  
  
  Hotspot Problem
&lt;/h3&gt;

&lt;p&gt;Even with virtual nodes, certain keys (e.g. celebrity data in social networks) can generate disproportionate traffic to one server.&lt;/p&gt;

&lt;p&gt;Solutions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Further subdivide hotspot partitions&lt;/li&gt;
&lt;li&gt;Add dedicated servers for high-traffic keys&lt;/li&gt;
&lt;li&gt;Apply application-level caching in front of the ring&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Replication
&lt;/h3&gt;

&lt;p&gt;For fault tolerance, keys are typically replicated to the next &lt;code&gt;N&lt;/code&gt; servers clockwise on the ring. This ensures data survives individual node failures without a full redistribution.&lt;/p&gt;
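&lt;p&gt;Replica placement can be sketched as a clockwise walk that collects distinct servers (the names and the replication factor are illustrative):&lt;/p&gt;

```python
# Sketch: replicate each key to the next N distinct servers clockwise
# on the ring, starting from the key's primary server.
import bisect
import hashlib

def ring_hash(name):
    return int(hashlib.md5(name.encode()).hexdigest(), 16)

servers = ["server0", "server1", "server2", "server3", "server4"]
ring = sorted((ring_hash(s), s) for s in servers)
positions = [pos for pos, _ in ring]

def replicas(key, n=3):
    # Walk clockwise from the key, collecting n distinct servers. The
    # distinctness check matters once virtual nodes are in play, since
    # adjacent ring positions can belong to the same physical server.
    idx = bisect.bisect_left(positions, ring_hash(key))
    chosen = []
    while len(chosen) != n:
        server = ring[idx % len(ring)][1]
        if server not in chosen:
            chosen.append(server)
        idx += 1
    return chosen

print(replicas("user:42"))  # three distinct servers, primary first
```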




&lt;h2&gt;
  
  
  Monitoring and Observability
&lt;/h2&gt;

&lt;p&gt;After deployment, monitoring consistent hashing behavior is critical.&lt;/p&gt;

&lt;p&gt;Track:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Load distribution across nodes&lt;/li&gt;
&lt;li&gt;Redistribution events on topology changes&lt;/li&gt;
&lt;li&gt;Virtual node count and balance metrics&lt;/li&gt;
&lt;li&gt;Replication lag across replicas&lt;/li&gt;
&lt;li&gt;Hotspot frequency and severity&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Consistent hashing is not a "set and forget" mechanism — it requires continuous tuning of virtual node counts and replication factors as your cluster evolves.&lt;/p&gt;




&lt;h2&gt;
  
  
  Final Thoughts
&lt;/h2&gt;

&lt;p&gt;Consistent hashing is more than just a smarter way to distribute keys. It is a core scalability mechanism that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;stabilizes systems during cluster topology changes,&lt;/li&gt;
&lt;li&gt;ensures fair distribution of load across nodes,&lt;/li&gt;
&lt;li&gt;and enables systems to scale horizontally with confidence.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Choosing the right configuration of virtual nodes and replication strategy depends heavily on your traffic patterns, data size, and fault tolerance requirements.&lt;/p&gt;

&lt;p&gt;Design it carefully — because at scale, &lt;strong&gt;consistent hashing becomes part of your system's scalability strategy&lt;/strong&gt;.&lt;/p&gt;

</description>
      <category>systemdesign</category>
      <category>distributedsystems</category>
      <category>backend</category>
      <category>programming</category>
    </item>
    <item>
      <title>Design a Rate Limiter</title>
      <dc:creator>JosephAkayesi</dc:creator>
      <pubDate>Mon, 02 Mar 2026 20:21:29 +0000</pubDate>
      <link>https://dev.to/josephakayesi/design-a-rate-limiter-4bkf</link>
      <guid>https://dev.to/josephakayesi/design-a-rate-limiter-4bkf</guid>
      <description>&lt;h1&gt;
  
  
  Rate Limiting — System Design Deep Dive
&lt;/h1&gt;

&lt;p&gt;A &lt;strong&gt;rate limiter&lt;/strong&gt; is a piece of software that regulates how much traffic a client can send to a server within a given period of time.&lt;/p&gt;

&lt;p&gt;At scale, rate limiting becomes a fundamental building block of reliable and cost-efficient systems.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why Do We Need Rate Limiting?
&lt;/h2&gt;

&lt;p&gt;Rate limiters provide several important benefits:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;✅ Prevent denial-of-service (DoS) attacks
&lt;/li&gt;
&lt;li&gt;✅ Promote fair usage of shared resources
&lt;/li&gt;
&lt;li&gt;✅ Protect backend services from overload
&lt;/li&gt;
&lt;li&gt;✅ Reduce infrastructure and operational costs
&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Real-World Examples
&lt;/h2&gt;

&lt;p&gt;Rate limiting appears everywhere in modern applications:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Users can share &lt;strong&gt;up to 150 posts per day&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Users can post &lt;strong&gt;300 tweets within 2 hours&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Users can make &lt;strong&gt;two withdrawal transactions within 15 seconds&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In reality, rate limits depend entirely on your application's access patterns and business rules.&lt;/p&gt;




&lt;h2&gt;
  
  
  Where Can Rate Limiters Live?
&lt;/h2&gt;

&lt;p&gt;Rate limiters can be deployed in different parts of the system:&lt;/p&gt;

&lt;h3&gt;
  
  
  1️⃣ Client-Side Rate Limiting
&lt;/h3&gt;

&lt;p&gt;Client-side rate limiting happens within the application itself.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pros&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Reduces unnecessary requests early&lt;/li&gt;
&lt;li&gt;Improves perceived responsiveness&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Less secure&lt;/li&gt;
&lt;li&gt;Clients can tamper with requests and bypass restrictions&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  2️⃣ Server-Side Rate Limiting
&lt;/h3&gt;

&lt;p&gt;Server-side rate limiting enforces rules centrally.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pros&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Strong enforcement&lt;/li&gt;
&lt;li&gt;Cannot be bypassed easily&lt;/li&gt;
&lt;li&gt;Reliable tracking of usage&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw2s87hjubkvkgxeyncl2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw2s87hjubkvkgxeyncl2.png" alt="Server-side limiter 1" width="649" height="122"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h3&gt;
  
  
  3️⃣ API Gateway / Middleware Layer
&lt;/h3&gt;

&lt;p&gt;A very common approach is placing the rate limiter at the &lt;strong&gt;API gateway&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;This allows all incoming traffic to be evaluated before reaching backend services.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2w8bqu1acxbwq4ojpoym.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2w8bqu1acxbwq4ojpoym.png" alt="Server-side limiter 2" width="648" height="201"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F49v1qykgi65bxcd9k03a.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F49v1qykgi65bxcd9k03a.png" alt="Gateway limiter" width="648" height="186"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Core Rate Limiting Algorithms
&lt;/h2&gt;

&lt;p&gt;Most industry rate limiters are based on a few well-known algorithms.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Fixed Window Counter&lt;/li&gt;
&lt;li&gt;Sliding Window Counter&lt;/li&gt;
&lt;li&gt;Token Bucket&lt;/li&gt;
&lt;li&gt;Leaky Bucket&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Let’s walk through each one.&lt;/p&gt;




&lt;h2&gt;
  
  
  Fixed Window Counter
&lt;/h2&gt;

&lt;p&gt;In a fixed window counter, clients can make a specific number of requests within a fixed time interval.&lt;/p&gt;

&lt;p&gt;Example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;100 requests per minute&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Problem: Burstiness
&lt;/h3&gt;

&lt;p&gt;A client could send:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;100 requests at &lt;code&gt;00:59&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Another 100 requests at &lt;code&gt;01:00&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Result: &lt;strong&gt;200 requests within seconds&lt;/strong&gt;, even though the limit is 100 per minute.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffunho0u4qgx0xupq5oou.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffunho0u4qgx0xupq5oou.png" alt="Fixed window" width="703" height="329"&gt;&lt;/a&gt;&lt;/p&gt;
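&lt;p&gt;A minimal fixed window counter, together with the boundary burst described above (the client name, limit, and window size are illustrative):&lt;/p&gt;

```python
# Minimal fixed window counter sketch: 100 requests per 60-second window,
# with window boundaries aligned to the epoch.
import time
from collections import defaultdict

LIMIT = 100
WINDOW = 60  # seconds

counters = defaultdict(int)

def allow(client_id, now=None):
    now = time.time() if now is None else now
    # All requests in the same fixed window share one counter.
    window_start = int(now // WINDOW) * WINDOW
    key = (client_id, window_start)
    if counters[key] >= LIMIT:
        return False
    counters[key] += 1
    return True

# The burst problem: 100 requests just before a window boundary and 100
# just after are all allowed, i.e. 200 requests within seconds.
late = sum(allow("alice", now=59.9) for _ in range(100))
early = sum(allow("alice", now=60.1) for _ in range(100))
print(late + early)  # 200
```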




&lt;h2&gt;
  
  
  Sliding Window Counter
&lt;/h2&gt;

&lt;p&gt;Instead of resetting counters at fixed intervals, the sliding window evaluates requests relative to the &lt;strong&gt;current time&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;When a request arrives:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The system counts how many requests occurred in the window of time ending at the current moment.&lt;/li&gt;
&lt;li&gt;If admitting the request would exceed the limit, it is rejected.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This significantly reduces burst traffic compared to fixed windows.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwzh622jel9kgmc9shqkw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwzh622jel9kgmc9shqkw.png" alt="Sliding window" width="671" height="279"&gt;&lt;/a&gt;&lt;/p&gt;
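&lt;p&gt;A minimal sketch of the sliding window idea, using the log variant that stores per-client timestamps (the names and limits are illustrative):&lt;/p&gt;

```python
# Sketch of the sliding window log variant: keep per-client request
# timestamps and count only those inside the last WINDOW seconds.
import time
from collections import defaultdict, deque

LIMIT = 100
WINDOW = 60  # seconds

logs = defaultdict(deque)

def allow(client_id, now=None):
    now = time.time() if now is None else now
    log = logs[client_id]
    # Evict timestamps that have slid out of the window.
    while log and now - log[0] >= WINDOW:
        log.popleft()
    if len(log) >= LIMIT:
        return False
    log.append(now)
    return True

# Unlike the fixed window, a burst across a boundary is rejected:
allowed = sum(allow("alice", now=59.9) for _ in range(100))
allowed += sum(allow("alice", now=60.1) for _ in range(100))
print(allowed)  # 100
```

&lt;p&gt;The log variant is exact but stores one timestamp per request; the counter variant approximates it with far less memory.&lt;/p&gt;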




&lt;h2&gt;
  
  
  Token Bucket
&lt;/h2&gt;

&lt;p&gt;The token bucket is one of the most widely used rate limiting algorithms.&lt;/p&gt;

&lt;h3&gt;
  
  
  How it works
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;A bucket contains tokens.&lt;/li&gt;
&lt;li&gt;Each token allows one request.&lt;/li&gt;
&lt;li&gt;Tokens refill at a constant rate.&lt;/li&gt;
&lt;li&gt;Requests consume tokens.&lt;/li&gt;
&lt;li&gt;If no tokens remain → request is rejected.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Burst traffic is allowed as long as tokens are available.&lt;/p&gt;

&lt;p&gt;More expensive operations can consume multiple tokens.&lt;/p&gt;

&lt;p&gt;This makes token buckets ideal for high-traffic APIs.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feyeajvlcjsaxfg1ilpuy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feyeajvlcjsaxfg1ilpuy.png" alt="Token bucket" width="569" height="359"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2wd3i752awlpkzr05p0f.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2wd3i752awlpkzr05p0f.png" alt="Token bucket flow" width="569" height="466"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqe7siqkmjed2m48ym9as.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqe7siqkmjed2m48ym9as.png" alt="Token bucket example" width="551" height="527"&gt;&lt;/a&gt;&lt;/p&gt;
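&lt;p&gt;A minimal token bucket sketch (the capacity, refill rate, and the explicit &lt;code&gt;now&lt;/code&gt; parameter used for testability are illustrative choices):&lt;/p&gt;

```python
# Token bucket sketch: capacity 10 tokens, refilled at 5 tokens/second.
# Expensive operations may consume more than one token.
import time

class TokenBucket:
    def __init__(self, capacity=10, refill_rate=5.0, start=0.0):
        self.capacity = capacity
        self.refill_rate = refill_rate  # tokens per second
        self.tokens = float(capacity)   # bucket starts full
        self.last = start

    def allow(self, cost=1, now=None):
        now = time.monotonic() if now is None else now
        # Refill based on elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False

bucket = TokenBucket()
burst = sum(bucket.allow(now=0.0) for _ in range(12))
print(burst)  # 10: bursts are allowed until the bucket is empty
```

&lt;p&gt;Computing the refill lazily on each request avoids a background timer per bucket.&lt;/p&gt;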




&lt;h2&gt;
  
  
  Leaky Bucket
&lt;/h2&gt;

&lt;p&gt;The leaky bucket processes requests at a constant rate.&lt;/p&gt;

&lt;p&gt;Think of it as a FIFO queue:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Requests enter the queue&lt;/li&gt;
&lt;li&gt;Requests are processed steadily&lt;/li&gt;
&lt;li&gt;When the queue is full, new requests are dropped&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This smooths traffic spikes and ensures consistent processing.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnq9yo335lkt5qn88pq7s.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnq9yo335lkt5qn88pq7s.png" alt="Leaky bucket" width="701" height="205"&gt;&lt;/a&gt;&lt;/p&gt;
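&lt;p&gt;A minimal leaky bucket sketch as a bounded FIFO queue with a fixed outflow rate (the queue size and rate are illustrative):&lt;/p&gt;

```python
# Leaky bucket sketch: requests join a bounded FIFO queue if there is
# room, and a worker drains them at a constant rate per tick.
from collections import deque

class LeakyBucket:
    def __init__(self, queue_size=4, leak_rate=2):
        self.queue = deque()
        self.queue_size = queue_size
        self.leak_rate = leak_rate  # requests processed per tick

    def submit(self, request):
        # Drop the request when the queue is full.
        if len(self.queue) >= self.queue_size:
            return False
        self.queue.append(request)
        return True

    def leak(self):
        # Process at a constant rate regardless of arrival bursts.
        processed = []
        for _ in range(min(self.leak_rate, len(self.queue))):
            processed.append(self.queue.popleft())
        return processed

bucket = LeakyBucket()
accepted = sum(bucket.submit(f"req{i}") for i in range(6))
print(accepted)       # 4: the burst overflows the queue
print(bucket.leak())  # ['req0', 'req1'] drained steadily
```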




&lt;h2&gt;
  
  
  High-Level Architecture
&lt;/h2&gt;

&lt;p&gt;A rate limiter typically acts as middleware between clients and servers.&lt;/p&gt;

&lt;p&gt;Every incoming request is evaluated before reaching the API.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyk8ye2webeinqq6ikcdz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyk8ye2webeinqq6ikcdz.png" alt="Architecture" width="697" height="242"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If a request exceeds the limit, the server responds with:&lt;br&gt;
HTTP 429 — Too Many Requests&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuadbcv4plnqyzobas075.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuadbcv4plnqyzobas075.png" alt="429 response" width="648" height="186"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Helpful Rate Limit Headers
&lt;/h2&gt;

&lt;p&gt;Servers often return headers to help clients behave correctly:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Header&lt;/th&gt;
&lt;th&gt;Meaning&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;X-RateLimit-Remaining&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Remaining allowed requests&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;X-RateLimit-Limit&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Maximum allowed requests&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;X-RateLimit-Retry-After&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Seconds before retrying&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
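&lt;p&gt;As a rough sketch, a server might populate these headers like this. The function name and arguments are illustrative; real services vary in exact header names and semantics.&lt;/p&gt;

```python
def rate_limit_headers(limit, remaining, retry_after_seconds):
    """Build the response headers from the table above.
    All names here are illustrative, not a specific API."""
    remaining = max(0, remaining)
    headers = {
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": str(remaining),
    }
    if remaining == 0:
        # Throttled: tell the client when it is safe to retry.
        headers["X-RateLimit-Retry-After"] = str(retry_after_seconds)
    return headers

# A throttled request: 0 of 100 calls remaining, retry in 30 seconds.
print(rate_limit_headers(100, 0, 30))
```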

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxaqhb90mxj9053q826d7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxaqhb90mxj9053q826d7.png" alt="Headers" width="697" height="242"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Rule Configuration
&lt;/h2&gt;

&lt;p&gt;Rate limiting rules define what is allowed.&lt;/p&gt;

&lt;p&gt;Example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Maximum &lt;strong&gt;5 marketing messages per day&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Maximum &lt;strong&gt;5 login attempts per minute&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Rules are typically:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Stored on disk or configuration services&lt;/li&gt;
&lt;li&gt;Loaded into cache by workers&lt;/li&gt;
&lt;li&gt;Evaluated in middleware during requests&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Request Flow
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Client sends request&lt;/li&gt;
&lt;li&gt;Request reaches rate limiter middleware&lt;/li&gt;
&lt;li&gt;Rules are loaded from cache&lt;/li&gt;
&lt;li&gt;Counters and timestamps are checked&lt;/li&gt;
&lt;li&gt;Request is either forwarded or throttled&lt;/li&gt;
&lt;/ol&gt;
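&lt;p&gt;The five steps above can be sketched as a fixed-window middleware function. Everything here (the rule name, the in-process &lt;code&gt;counters&lt;/code&gt; dict) is illustrative; a production limiter would keep this state in a shared store such as Redis.&lt;/p&gt;

```python
import time

# Step 3: rules loaded from cache (hard-coded here for illustration).
RULES = {"login": {"limit": 5, "window_seconds": 60}}
counters = {}  # Step 4: counter and window start per (client, rule)

def handle(client_id, rule_name, now=None):
    """Steps 2-5 of the flow above, as a fixed-window sketch."""
    now = now if now is not None else time.time()
    rule = RULES[rule_name]
    count, window_start = counters.get((client_id, rule_name), (0, now))
    if now - window_start >= rule["window_seconds"]:
        count, window_start = 0, now  # window expired: reset counter
    if count >= rule["limit"]:
        return "429 Too Many Requests"  # Step 5: throttled
    counters[(client_id, rule_name)] = (count + 1, window_start)
    return "forwarded"  # Step 5: passed on to the API

for _ in range(6):
    print(handle("alice", "login", now=1000.0))
# The first five requests are forwarded; the sixth is throttled.
```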




&lt;h2&gt;
  
  
  Rate Limiting in Distributed Systems
&lt;/h2&gt;

&lt;p&gt;Scaling introduces new challenges.&lt;/p&gt;

&lt;h3&gt;
  
  
  Race Conditions
&lt;/h3&gt;

&lt;p&gt;Multiple concurrent requests may update counters simultaneously.&lt;/p&gt;

&lt;p&gt;Example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Limit = 3 requests/sec&lt;/li&gt;
&lt;li&gt;Two threads read counter value = 2&lt;/li&gt;
&lt;li&gt;Both allow requests → limit exceeded&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Solutions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Atomic operations&lt;/li&gt;
&lt;li&gt;Redis sorted sets&lt;/li&gt;
&lt;li&gt;Distributed locks (with performance tradeoffs)&lt;/li&gt;
&lt;/ul&gt;
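&lt;p&gt;To see why atomicity fixes the race, here is a minimal sketch that performs the check and the increment as one critical section. A thread lock stands in for what Redis provides with its atomic &lt;code&gt;INCR&lt;/code&gt; command; the class name is illustrative.&lt;/p&gt;

```python
import threading

class AtomicLimiter:
    """Check-and-increment under a lock, so two concurrent requests can
    never both observe the same counter value (the race described above)."""

    def __init__(self, limit):
        self.limit = limit
        self.count = 0
        self.lock = threading.Lock()

    def allow(self):
        with self.lock:  # read and update happen as one atomic step
            if self.count >= self.limit:
                return False
            self.count += 1
            return True

limiter = AtomicLimiter(limit=3)
results = []
threads = [threading.Thread(target=lambda: results.append(limiter.allow()))
           for _ in range(10)]
for t in threads: t.start()
for t in threads: t.join()
print(results.count(True))  # exactly 3 requests allowed, never more
```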




&lt;h3&gt;
  
  
  Synchronization Problems
&lt;/h3&gt;

&lt;p&gt;In distributed systems:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Requests may hit different servers&lt;/li&gt;
&lt;li&gt;Replication lag causes stale counters&lt;/li&gt;
&lt;li&gt;Limits become inconsistent&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Sticky sessions can help but are usually avoided due to operational complexity.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq8xb8wxaq8yvmiqn5uo4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq8xb8wxaq8yvmiqn5uo4.png" alt="Sticky sessions" width="630" height="221"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Centralized Rate Limiting (Global Cache)
&lt;/h2&gt;

&lt;p&gt;A common solution is using a centralized datastore like Redis.&lt;/p&gt;

&lt;p&gt;All nodes read and update shared counters.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9bwpbp82oikol8z4hjgu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9bwpbp82oikol8z4hjgu.png" alt="Global cache" width="630" height="221"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Tradeoffs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Potential single point of failure&lt;/li&gt;
&lt;li&gt;Increased latency for global users&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Performance Optimization
&lt;/h2&gt;

&lt;p&gt;A better large-scale solution is a &lt;strong&gt;multi-data-center architecture&lt;/strong&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Deploy rate limiter nodes close to users&lt;/li&gt;
&lt;li&gt;Maintain regional counters&lt;/li&gt;
&lt;li&gt;Synchronize data using eventual consistency&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Benefits:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Reduced latency&lt;/li&gt;
&lt;li&gt;Improved user experience&lt;/li&gt;
&lt;li&gt;Better global scalability&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Monitoring and Observability
&lt;/h2&gt;

&lt;p&gt;After deployment, monitoring is critical.&lt;/p&gt;

&lt;p&gt;Track:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Rate limit hit frequency&lt;/li&gt;
&lt;li&gt;False positives&lt;/li&gt;
&lt;li&gt;Traffic patterns&lt;/li&gt;
&lt;li&gt;Algorithm effectiveness&lt;/li&gt;
&lt;li&gt;User impact&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Rate limiting is not a “set and forget” system — it requires continuous tuning.&lt;/p&gt;




&lt;h2&gt;
  
  
  Final Thoughts
&lt;/h2&gt;

&lt;p&gt;Rate limiting is more than just protecting APIs from abuse. It is a core reliability mechanism that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;stabilizes systems under load,&lt;/li&gt;
&lt;li&gt;ensures fairness,&lt;/li&gt;
&lt;li&gt;and controls operational costs.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Choosing the right algorithm and architecture depends heavily on your traffic patterns, scale, and consistency requirements.&lt;/p&gt;

&lt;p&gt;Design it carefully — because at scale, &lt;strong&gt;rate limiting becomes part of your system’s resilience strategy&lt;/strong&gt;.&lt;/p&gt;

</description>
      <category>architecture</category>
      <category>backend</category>
      <category>security</category>
      <category>systemdesign</category>
    </item>
    <item>
      <title>AWS Secrets Manager Agent</title>
      <dc:creator>JosephAkayesi</dc:creator>
      <pubDate>Thu, 12 Feb 2026 10:51:36 +0000</pubDate>
      <link>https://dev.to/josephakayesi/aws-secrets-manager-agent-2hpg</link>
      <guid>https://dev.to/josephakayesi/aws-secrets-manager-agent-2hpg</guid>
      <description>&lt;h2&gt;
  
  
  AWS Secrets Manager Agent
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpvete0fpd69nqod2559k.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpvete0fpd69nqod2559k.png" alt="AWS Secrets Manager Agent Simple Architecture" width="800" height="298"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What is the AWS Secrets Manager Agent?
&lt;/h2&gt;

&lt;p&gt;When building applications, you often need to provide developers with access to sensitive information such as database credentials, API keys, or authentication tokens. However, you do not want these secrets shared insecurely (for example, via email or hardcoded in source code).&lt;/p&gt;

&lt;p&gt;AWS Secrets Manager allows you to securely store and manage secrets. Your application can then retrieve these secrets at runtime to connect to services such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Amazon RDS
&lt;/li&gt;
&lt;li&gt;Amazon DocumentDB
&lt;/li&gt;
&lt;li&gt;Third-party APIs
&lt;/li&gt;
&lt;li&gt;Internal services
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Typically, the application calls AWS Secrets Manager directly to retrieve the secret whenever it needs it.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Scaling Problem
&lt;/h2&gt;

&lt;p&gt;For applications with a small user base, retrieving secrets directly from AWS Secrets Manager works well.&lt;/p&gt;

&lt;p&gt;However, as your system scales, this approach can become inefficient:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;If your application retrieves a secret on every request
&lt;/li&gt;
&lt;li&gt;And your system handles a large number of requests (for example, hundreds of thousands or millions)
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You may end up making an extremely high number of API calls to Secrets Manager.&lt;/p&gt;

&lt;p&gt;This can lead to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Increased latency
&lt;/li&gt;
&lt;li&gt;Higher costs
&lt;/li&gt;
&lt;li&gt;API rate limiting
&lt;/li&gt;
&lt;li&gt;Potential throttling
&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  How the AWS Secrets Manager Agent Solves This
&lt;/h2&gt;

&lt;p&gt;The AWS Secrets Manager Agent addresses this issue by acting as a local caching layer.&lt;/p&gt;

&lt;p&gt;According to the official documentation:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“The AWS Secrets Manager Agent is a local HTTP service that you can install and use in your compute environments to read secrets from Secrets Manager and cache them in memory.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Instead of your application calling AWS Secrets Manager directly:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The application makes a request to a local HTTP endpoint (for example, &lt;code&gt;localhost&lt;/code&gt;).
&lt;/li&gt;
&lt;li&gt;The agent retrieves the secret from AWS Secrets Manager.
&lt;/li&gt;
&lt;li&gt;The agent caches the secret in memory.
&lt;/li&gt;
&lt;li&gt;Subsequent requests are served from the in-memory cache.
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This significantly reduces the number of direct calls to AWS Secrets Manager and helps prevent rate limiting.&lt;/p&gt;
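&lt;p&gt;The caching idea can be sketched in a few lines. This is an in-process illustration of the pattern, not the agent itself; here &lt;code&gt;fetch&lt;/code&gt; stands in for the real call to AWS Secrets Manager, and all names are illustrative.&lt;/p&gt;

```python
import time

class SecretCache:
    """Sketch of the caching behavior the agent provides: fetch a secret
    once, then serve it from memory until the TTL expires."""

    def __init__(self, fetch, ttl_seconds=300):
        self.fetch = fetch          # stand-in for the Secrets Manager call
        self.ttl = ttl_seconds
        self.cache = {}             # secret_id -) (value, fetched_at)

    def get(self, secret_id, now=None):
        now = now if now is not None else time.time()
        entry = self.cache.get(secret_id)
        if entry is not None and self.ttl > now - entry[1]:
            return entry[0]             # cache hit: served from memory
        value = self.fetch(secret_id)   # cache miss: one backing-store call
        self.cache[secret_id] = (value, now)
        return value

calls = []
cache = SecretCache(fetch=lambda sid: calls.append(sid) or f"value-of-{sid}")
cache.get("db-password", now=0)
cache.get("db-password", now=10)  # served from memory, no second call
print(len(calls))  # 1: only one call reached the backing store
```

&lt;p&gt;Under high request volume, this is what turns millions of secret reads into a handful of API calls per TTL window.&lt;/p&gt;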




&lt;h2&gt;
  
  
  Where It Can Be Used
&lt;/h2&gt;

&lt;p&gt;The Secrets Manager Agent is a client-side HTTP service that standardizes secret retrieval across compute environments, including:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Amazon EC2
&lt;/li&gt;
&lt;li&gt;Amazon ECS
&lt;/li&gt;
&lt;li&gt;Amazon EKS
&lt;/li&gt;
&lt;li&gt;AWS Lambda
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Because it exposes an HTTP interface, it is language-agnostic and works with any application stack.&lt;/p&gt;




&lt;h2&gt;
  
  
  Configuration Options
&lt;/h2&gt;

&lt;p&gt;The Secrets Manager Agent can be configured with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Maximum number of connections
&lt;/li&gt;
&lt;li&gt;Cache time-to-live (TTL)
&lt;/li&gt;
&lt;li&gt;Localhost HTTP port
&lt;/li&gt;
&lt;li&gt;Cache size
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This allows you to control performance, memory usage, and secret refresh behavior.&lt;/p&gt;




&lt;h2&gt;
  
  
  Benefits of Using the Secrets Manager Agent
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Client-side HTTP caching layer
&lt;/li&gt;
&lt;li&gt;Standardized secret consumption across compute types
&lt;/li&gt;
&lt;li&gt;Works with EC2, ECS, EKS, and Lambda
&lt;/li&gt;
&lt;li&gt;Language-agnostic and open source
&lt;/li&gt;
&lt;li&gt;Can fetch live credentials, reducing the need for container restarts when using static environment variables
&lt;/li&gt;
&lt;li&gt;Built-in protection against server-side request forgery (SSRF)
&lt;/li&gt;
&lt;li&gt;Post-quantum TLS enabled by default
&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  When Should You Use the Secrets Manager Agent?
&lt;/h2&gt;

&lt;p&gt;Use the Secrets Manager Agent when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Your application frequently retrieves secrets
&lt;/li&gt;
&lt;li&gt;You are operating at scale and want to avoid rate limiting
&lt;/li&gt;
&lt;li&gt;You want to reduce latency caused by repeated API calls
&lt;/li&gt;
&lt;li&gt;You want a standardized way to consume secrets across multiple compute environments
&lt;/li&gt;
&lt;li&gt;You want to avoid restarting containers when secrets rotate
&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  When You May Not Need It
&lt;/h2&gt;

&lt;p&gt;You may not need the Secrets Manager Agent if:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Your application retrieves secrets only once at startup
&lt;/li&gt;
&lt;li&gt;Your traffic is low and rate limits are not a concern
&lt;/li&gt;
&lt;li&gt;You are already using another secure caching mechanism
&lt;/li&gt;
&lt;li&gt;You inject secrets at deployment time and do not require runtime retrieval
&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Example Usage Scenario
&lt;/h2&gt;

&lt;p&gt;Imagine an application running on Amazon ECS that connects to an RDS database.&lt;/p&gt;

&lt;h3&gt;
  
  
  Without the Agent
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Each container retrieves database credentials directly from AWS Secrets Manager.
&lt;/li&gt;
&lt;li&gt;Under high load, this can result in excessive API calls.
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  With the Agent
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;The container queries the local Secrets Manager Agent endpoint.
&lt;/li&gt;
&lt;li&gt;The agent retrieves and caches the database credentials.
&lt;/li&gt;
&lt;li&gt;Subsequent requests are served from memory.
&lt;/li&gt;
&lt;li&gt;API calls to AWS Secrets Manager are significantly reduced.
&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;The AWS Secrets Manager Agent is a local HTTP caching service that improves scalability, reduces latency, and helps prevent rate limiting when retrieving secrets from AWS Secrets Manager.&lt;/p&gt;

&lt;p&gt;It is particularly useful for high-traffic, distributed systems where secrets are accessed frequently and must be securely managed without compromising performance.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>security</category>
      <category>cloudnative</category>
    </item>
    <item>
      <title>A Framework for System Design Interviews</title>
      <dc:creator>JosephAkayesi</dc:creator>
      <pubDate>Sat, 07 Feb 2026 09:37:06 +0000</pubDate>
      <link>https://dev.to/josephakayesi/a-framework-for-system-design-interviews-gde</link>
      <guid>https://dev.to/josephakayesi/a-framework-for-system-design-interviews-gde</guid>
      <description>&lt;p&gt;System design interviews often feel vague at a high level. You may be asked to design a large-scale system, one that took years to build, in under an hour. Clearly, it is impossible to design such a system in full detail within that time. So what, then, is the real purpose of a system design interview?&lt;/p&gt;

&lt;p&gt;A system design interview is not about building a production-ready system. It is an exercise designed to evaluate your ability to reason about complex problems, structure a solution, and explain design decisions along with their tradeoffs. Interviewers are far more interested in how you think than in the final architecture you propose.&lt;/p&gt;

&lt;p&gt;Every system design problem is different, and there is no one-size-fits-all solution. That said, there are common steps you can follow to approach most system design interviews effectively.&lt;/p&gt;

&lt;p&gt;Below is a simple and practical template for navigating system design interviews.&lt;/p&gt;

&lt;h2&gt;
  
  
  A 4-step process for effective system design interviews
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Step 1: Understand the problem and establish design scope
&lt;/h3&gt;

&lt;p&gt;Do not rush into designing the system before fully understanding the problem. Start by clarifying the requirements and constraints.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key questions to ask include:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;What does the system do?&lt;/li&gt;
&lt;li&gt;How many users will the system support?&lt;/li&gt;
&lt;li&gt;What guarantees must the system provide?

&lt;ul&gt;
&lt;li&gt;Availability?&lt;/li&gt;
&lt;li&gt;Consistency?&lt;/li&gt;
&lt;li&gt;Scalability?&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;What is the expected growth pattern?&lt;/li&gt;

&lt;li&gt;Are there existing services or components we can leverage?&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;These questions help define the scope of the system and guide your design decisions and tradeoffs.&lt;/p&gt;

&lt;p&gt;Note that these are only sample questions. The exact questions you ask will depend on the system you are asked to design. Sometimes the interviewer will provide direct answers, and other times you will be expected to make reasonable assumptions.&lt;/p&gt;

&lt;p&gt;As you go, clearly document all requirements and assumptions. This ensures you do not miss important details and gives you a solid reference point to design against.&lt;/p&gt;




&lt;p&gt;Example:&lt;br&gt;
Say you are asked to design a news feed system.&lt;/p&gt;

&lt;p&gt;The conversation between you and your interviewer might look something like this:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Interviewer:&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Design a news feed system.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Candidate:&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Before I start designing, I would like to clarify the core requirements. What are the main features we want to support?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Interviewer:&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Users should be able to see a personalized feed of posts from accounts they follow.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Candidate:&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Understood. To size the system properly, how many users do we expect to support?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Interviewer:&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Around 100 million monthly active users, with about 20 percent daily active users.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Candidate:&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
So roughly 20 million daily active users. How frequently do users create posts?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Interviewer:&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Each active user creates about one post per day.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Candidate:&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
How often do users read or refresh their feed compared to posting?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Interviewer:&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Users read their feed much more frequently, about 50 feed reads per post.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Candidate:&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
What are the latency and freshness requirements?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Interviewer:&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Feed loads should be under 200 milliseconds, and new posts should appear in followers’ feeds within a few seconds.&lt;/p&gt;
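&lt;p&gt;The numbers agreed above are already enough for a quick back-of-envelope estimate of read and write throughput:&lt;/p&gt;

```python
# Back-of-envelope estimates from the figures in the conversation above.
dau = 20_000_000           # 20 percent of 100M monthly active users
posts_per_user_per_day = 1
reads_per_post = 50
seconds_per_day = 86_400

write_qps = dau * posts_per_user_per_day / seconds_per_day
read_qps = write_qps * reads_per_post

print(f"write QPS ~ {write_qps:.0f}")  # roughly 231 posts per second
print(f"read QPS  ~ {read_qps:.0f}")   # roughly 11,574 feed reads per second
```

&lt;p&gt;Roughly 230 writes per second versus about 11,500 reads per second: the system is heavily read-dominated, which should shape the design that follows.&lt;/p&gt;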

&lt;h3&gt;
  
  
  Step 2: Propose a high-level design and get buy-in
&lt;/h3&gt;

&lt;p&gt;Once the problem and scope are clear, propose a high-level design. At this stage, treat the interviewer as a teammate and align on the overall approach before going deeper.&lt;/p&gt;

&lt;p&gt;Ask yourself what the main components of the system are and how they interact to solve the problem.&lt;/p&gt;

&lt;p&gt;For the news feed example, the system can be broken down into two main flows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Feed publishing:&lt;/strong&gt; A user creates a post, which is validated and persisted.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Feed building:&lt;/strong&gt; A user’s feed is generated by aggregating posts from followed accounts, typically in reverse chronological order.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This level of abstraction ensures you and the interviewer agree on the system’s structure before diving into details.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm9oodk5gxwsyxc4g4h14.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm9oodk5gxwsyxc4g4h14.png" alt=" " width="479" height="677"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe9vwej5291zmy5x3n4tl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe9vwej5291zmy5x3n4tl.png" alt=" " width="497" height="612"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 3: Deep dive into key components
&lt;/h3&gt;

&lt;p&gt;After agreeing on the high-level design, prioritize the most critical components and dive deeper into them. Focus on areas that are central to the system’s functional requirements.&lt;/p&gt;

&lt;p&gt;This is where you discuss:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Data models&lt;/li&gt;
&lt;li&gt;Read versus write tradeoffs&lt;/li&gt;
&lt;li&gt;Scalability strategies&lt;/li&gt;
&lt;li&gt;Caching&lt;/li&gt;
&lt;li&gt;Bottlenecks and failure modes&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Explain why you choose certain approaches and what tradeoffs they introduce. This step is often the core of the interview.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl1xmkwjoetvtk8g4k12f.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl1xmkwjoetvtk8g4k12f.png" alt=" " width="569" height="585"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk18gsnbd0u6euc0ksukv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk18gsnbd0u6euc0ksukv.png" alt=" " width="585" height="446"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 4: Wrap up and discuss improvements
&lt;/h3&gt;

&lt;p&gt;In the final stage, the interviewer may ask follow-up questions such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;How would you improve the system?&lt;/li&gt;
&lt;li&gt;What happens if the user base grows by 100 times?&lt;/li&gt;
&lt;li&gt;What are the main bottlenecks?&lt;/li&gt;
&lt;li&gt;Where does the system fall short?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;No system is perfect. Being open about limitations and discussing how you would address them leaves a strong impression and demonstrates maturity in system design thinking.&lt;/p&gt;

</description>
      <category>systemdesign</category>
      <category>cloudnative</category>
      <category>distributedsystems</category>
    </item>
    <item>
      <title>Prefix sums and range queries</title>
      <dc:creator>JosephAkayesi</dc:creator>
      <pubDate>Mon, 26 Jan 2026 14:52:23 +0000</pubDate>
      <link>https://dev.to/josephakayesi/prefix-sums-and-range-queries-260f</link>
      <guid>https://dev.to/josephakayesi/prefix-sums-and-range-queries-260f</guid>
      <description>&lt;p&gt;Discovering prefix sums is one of those moments where you go &lt;em&gt;aha!&lt;/em&gt;&lt;br&gt;&lt;br&gt;
It’s staggering how memory-efficient this technique is.&lt;/p&gt;

&lt;p&gt;Prefix sums are used when you want to compute values over a range —&lt;br&gt;&lt;br&gt;
i.e. &lt;strong&gt;range queries&lt;/strong&gt;.&lt;/p&gt;


&lt;h2&gt;
  
  
  Example
&lt;/h2&gt;

&lt;p&gt;Let’s say you want to compute the number of YouTube views over different time periods.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;views&lt;/span&gt;   &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;periods&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;

&lt;span class="n"&gt;output&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;23&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;i&lt;/code&gt; represents the &lt;em&gt;i-th day&lt;/em&gt; in the &lt;code&gt;views&lt;/code&gt; array
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;views[i]&lt;/code&gt; is the number of views on that day
&lt;/li&gt;
&lt;li&gt;Each element in &lt;code&gt;periods&lt;/code&gt; represents a range over which we want the cumulative sum
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For example, &lt;code&gt;periods[0] = [0, 1]&lt;/code&gt; means:&lt;br&gt;&lt;br&gt;
compute the cumulative views from day &lt;code&gt;0&lt;/code&gt; to day &lt;code&gt;1&lt;/code&gt; (inclusive).&lt;/p&gt;


&lt;h2&gt;
  
  
  A naïve approach
&lt;/h2&gt;

&lt;p&gt;A valid approach would be:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;For each period, find the sub-array for that range
&lt;/li&gt;
&lt;li&gt;Iterate through it and sum all the elements
&lt;/li&gt;
&lt;li&gt;Append the result to an output array
&lt;/li&gt;
&lt;/ol&gt;
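&lt;p&gt;The steps above can be sketched as follows (the function name is mine, chosen for illustration):&lt;/p&gt;

```python
def range_sums_naive(views, periods):
    """Steps 1-3 above: sum each range independently."""
    output = []
    for l, r in periods:
        output.append(sum(views[l:r + 1]))  # re-walks the sub-array each time
    return output

views = [7, 1, 2, 5, 2, 6, 8]
periods = [[0, 1], [0, 5], [1, 4], [3, 3]]
print(range_sums_naive(views, periods))  # [8, 23, 10, 5]
```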

&lt;p&gt;The problem with this approach is that we recompute &lt;strong&gt;many overlapping ranges&lt;/strong&gt; without keeping track of previous work.&lt;/p&gt;

&lt;p&gt;Can we do better?&lt;/p&gt;


&lt;h2&gt;
  
  
  Prefix sums to the rescue 🚀
&lt;/h2&gt;

&lt;p&gt;Yes. An optimal approach is to use a &lt;strong&gt;prefix sum&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The idea is simple:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Compute a cumulative sum array once
&lt;/li&gt;
&lt;li&gt;Each element at index &lt;code&gt;i&lt;/code&gt; stores the sum of values from index &lt;code&gt;0&lt;/code&gt; to &lt;code&gt;i&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Let’s call this array &lt;code&gt;prefix&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;prefix[0] = views[0]
prefix[1] = views[0] + views[1]
prefix[2] = views[0] + views[1] + views[2]
prefix[3] = views[0] + views[1] + views[2] + views[3]
...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For our example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;prefix&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;15&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;17&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;23&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;31&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
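&lt;p&gt;Building the prefix array takes a single O(n) pass, for example with &lt;code&gt;itertools.accumulate&lt;/code&gt; from the standard library:&lt;/p&gt;

```python
from itertools import accumulate

views = [7, 1, 2, 5, 2, 6, 8]
prefix = list(accumulate(views))  # one running-sum pass over the array
print(prefix)  # [7, 8, 10, 15, 17, 23, 31]
```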






&lt;h2&gt;
  
  
  Querying ranges with prefix sums
&lt;/h2&gt;

&lt;p&gt;For any period &lt;code&gt;[l, r]&lt;/code&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;If the range starts at day &lt;code&gt;0&lt;/code&gt;, the answer is simply:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nb"&gt;sum&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;prefix&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Because each &lt;code&gt;prefix[r]&lt;/code&gt; already represents the cumulative sum from day &lt;code&gt;0&lt;/code&gt; to day &lt;code&gt;r&lt;/code&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  What if the range does &lt;strong&gt;not&lt;/strong&gt; start at 0?
&lt;/h2&gt;

&lt;p&gt;Let’s say the period is &lt;code&gt;[1, 4]&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;We know:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;prefix[4] = views[0] + views[1] + views[2] + views[3] + views[4]
prefix[0] = views[0]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;So to get the sum from day &lt;code&gt;1&lt;/code&gt; to day &lt;code&gt;4&lt;/code&gt;, we subtract what came before the range:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nb"&gt;sum&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;prefix&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;prefix&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;l&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In general:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nb"&gt;sum&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;prefix&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;prefix&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;l&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This works because:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;prefix[r]&lt;/code&gt; gives the total up to the end of the range
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;prefix[l - 1]&lt;/code&gt; gives everything before the range
&lt;/li&gt;
&lt;li&gt;Subtracting removes what we don’t care about
&lt;/li&gt;
&lt;/ul&gt;
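
&lt;p&gt;The subtraction above can be sketched in a few lines of Python (the &lt;code&gt;views&lt;/code&gt; values here are made up for illustration); note the &lt;code&gt;l == 0&lt;/code&gt; edge case, where there is nothing before the range to subtract:&lt;/p&gt;

```python
# Build a prefix-sum array once, then answer range-sum queries in O(1).
views = [5, 3, 2, 5, 8]  # hypothetical daily view counts

prefix = []
running = 0
for v in views:
    running += v
    prefix.append(running)

def range_sum(l, r):
    """Sum of views[l..r] inclusive, using the prefix array."""
    if l == 0:
        return prefix[r]  # nothing before the range to subtract
    return prefix[r] - prefix[l - 1]

print(range_sum(1, 4))  # views[1] + views[2] + views[3] + views[4] → 18
```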




&lt;h2&gt;
  
  
  Intuition (what’s really happening)
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;prefix[4] = views[0] + views[1] + views[2] + views[3] + views[4]
prefix[0] = views[0]
-----------------------------------------------
result    =          views[1] + views[2] + views[3] + views[4]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Everything before the range is discarded. We only keep what’s inside &lt;code&gt;[l, r]&lt;/code&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Single-day ranges
&lt;/h2&gt;

&lt;p&gt;For a period like &lt;code&gt;[3, 3]&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;prefix[3] = views[0] + views[1] + views[2] + views[3]
prefix[2] = views[0] + views[1] + views[2]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nb"&gt;sum&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;prefix&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;prefix&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="nb"&gt;sum&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;15&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;
&lt;span class="nb"&gt;sum&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Which is exactly &lt;code&gt;views[3]&lt;/code&gt;.&lt;/p&gt;




&lt;p&gt;That’s the magic of prefix sums:&lt;br&gt;&lt;br&gt;
&lt;strong&gt;precompute once, answer every range query in O(1).&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>algorithms</category>
      <category>prefixsum</category>
      <category>python</category>
      <category>datastructures</category>
    </item>
    <item>
      <title>DAGs &amp; Topological Sorting</title>
      <dc:creator>JosephAkayesi</dc:creator>
      <pubDate>Sat, 24 Jan 2026 13:26:19 +0000</pubDate>
      <link>https://dev.to/josephakayesi/dags-topological-sorting-1o9k</link>
      <guid>https://dev.to/josephakayesi/dags-topological-sorting-1o9k</guid>
      <description>&lt;h2&gt;
  
  
  Directed acyclic graph (DAG)
&lt;/h2&gt;

&lt;p&gt;A directed acyclic graph is a directed graph with no cycles. &lt;br&gt;
Because there are no cycles, it can be topologically sorted. &lt;br&gt;
When the graph is laid out from its starting nodes, every edge points in one direction. &lt;br&gt;
Nodes never lead back to their predecessors; edges only point to successors. &lt;/p&gt;

&lt;h2&gt;
  
  
  Topological sorting
&lt;/h2&gt;

&lt;p&gt;DAGs come up a lot when we speak about topological sorting. &lt;/p&gt;

&lt;p&gt;Topological sorting is a really cool technique for solving a myriad of problems. &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Prerequisites for courses in college&lt;/li&gt;
&lt;li&gt;CI/CD pipelines where one stage needs to run before another&lt;/li&gt;
&lt;li&gt;Data engineering pipelines&lt;/li&gt;
&lt;li&gt;Git commit log graph&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The pattern that comes up for DAGs is that all nodes depend forward. &lt;/p&gt;

&lt;p&gt;Any problem that has this pattern can be solved using topological sorting. &lt;/p&gt;
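
&lt;p&gt;A minimal sketch of one such solution is Kahn's algorithm, shown here on a hypothetical course-prerequisite graph (the course names are made up): repeatedly take a node with no remaining prerequisites, then release its successors.&lt;/p&gt;

```python
from collections import deque

# A hypothetical prerequisite DAG: an edge a -> b means "a comes before b".
edges = {
    "intro": ["data structures"],
    "data structures": ["algorithms"],
    "discrete math": ["algorithms"],
    "algorithms": [],
}

# Count incoming edges for every node.
indegree = {node: 0 for node in edges}
for node in edges:
    for successor in edges[node]:
        indegree[successor] += 1

# Start from the nodes that depend on nothing.
queue = deque(node for node in edges if indegree[node] == 0)
order = []
while queue:
    node = queue.popleft()
    order.append(node)
    for successor in edges[node]:
        indegree[successor] -= 1
        if indegree[successor] == 0:
            queue.append(successor)

# If len(order) != len(edges), the graph has a cycle and is not a DAG.
print(order)
```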

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgvdnxxkaetkqm7dzw48k.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgvdnxxkaetkqm7dzw48k.png" alt=" " width="800" height="241"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Notice that all arrows move in one direction. &lt;br&gt;
None of the arrows points backwards. &lt;br&gt;
All nodes depend forward. &lt;/p&gt;

</description>
      <category>algorithms</category>
      <category>computerscience</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>🌲 Finding the Longest Aligned Chain in a Binary Tree</title>
      <dc:creator>JosephAkayesi</dc:creator>
      <pubDate>Sat, 24 Jan 2026 09:56:32 +0000</pubDate>
      <link>https://dev.to/josephakayesi/finding-the-longest-aligned-chain-in-a-binary-tree-h28</link>
      <guid>https://dev.to/josephakayesi/finding-the-longest-aligned-chain-in-a-binary-tree-h28</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp6ilu00y1645snqb0tbv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp6ilu00y1645snqb0tbv.png" alt=" " width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Challenge
&lt;/h2&gt;

&lt;p&gt;Given a binary tree, we say a node is &lt;strong&gt;aligned&lt;/strong&gt; if its value is the same as its depth (where the root is at depth 0). Our goal is to return the length of the longest descendant chain of aligned nodes.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The chain must follow a parent-to-child path, but it does not need to start at the root.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Thought Process
&lt;/h2&gt;

&lt;p&gt;To find the longest chain, we need to traverse the tree while tracking our current depth. A &lt;strong&gt;Bottom-Up Depth First Search (DFS)&lt;/strong&gt; works best here:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;State Tracking:&lt;/strong&gt; Pass the current &lt;code&gt;depth&lt;/code&gt; down as we move through the tree.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Recursive Step:&lt;/strong&gt; For any node, the longest aligned chain starting at that node depends on the results of its left and right children.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;The Logic:&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;If &lt;code&gt;node.value == depth&lt;/code&gt;, this node is aligned. Its chain length is &lt;code&gt;1 + max(left_child_chain, right_child_chain)&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;&lt;p&gt;If &lt;code&gt;node.value != depth&lt;/code&gt;, the chain breaks. We return &lt;code&gt;0&lt;/code&gt; to the parent; the best chain seen so far is already recorded in the global maximum.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Global Max:&lt;/strong&gt; Use a member variable to track the maximum chain length encountered during the entire traversal.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;br&gt;
Input Array: &lt;code&gt;[7, 1, 3, 2, 8, 2, None, 4, 3, None, None, 3, 3]&lt;/code&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Depth 1: Node(1) is aligned.&lt;/li&gt;
&lt;li&gt;Depth 2: Node(2) is aligned.&lt;/li&gt;
&lt;li&gt;Depth 3: Node(3) is aligned.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Result:&lt;/strong&gt; 3 (Chain: 1 → 2 → 3)&lt;/li&gt;
&lt;/ul&gt;
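
&lt;p&gt;The steps above can be sanity-checked with a compact standalone sketch. The &lt;code&gt;build_tree&lt;/code&gt; helper and the LeetCode-style level-order layout (where &lt;code&gt;None&lt;/code&gt; marks a missing child) are assumptions for illustration, not part of the problem statement:&lt;/p&gt;

```python
from typing import Optional

class TreeNode:
    def __init__(self, value: int):
        self.value = value
        self.left: Optional["TreeNode"] = None
        self.right: Optional["TreeNode"] = None

def build_tree(values):
    """Build a tree from a level-order list with None gaps (assumed layout)."""
    if not values or values[0] is None:
        return None
    root = TreeNode(values[0])
    queue = [root]
    i = 1
    while queue:
        node = queue.pop(0)
        for side in ("left", "right"):
            if i >= len(values):
                return root
            if values[i] is not None:
                child = TreeNode(values[i])
                setattr(node, side, child)
                queue.append(child)
            i += 1
    return root

def aligned_chain(root) -> int:
    """Length of the longest parent-to-child chain where value == depth."""
    best = [0]
    def dfs(node, depth):
        if node is None:
            return 0
        left = dfs(node.left, depth + 1)
        right = dfs(node.right, depth + 1)
        if node.value == depth:
            chain = 1 + max(left, right)
            best[0] = max(best[0], chain)
            return chain
        return 0  # chain breaks at this node
    dfs(root, 0)
    return best[0]

root = build_tree([7, 1, 3, 2, 8, 2, None, 4, 3, None, None, 3, 3])
print(aligned_chain(root))  # → 3, the chain 1 → 2 → 3
```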




&lt;h2&gt;
  
  
  Final Solution
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;collections&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;deque&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;TreeNode&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;left&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;TreeNode&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;right&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;TreeNode&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Solution&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;res&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;alignedChained&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;root&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;TreeNode&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;dfs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;TreeNode&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;depth&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;

            &lt;span class="c1"&gt;# Post-order traversal: Get values from children first
&lt;/span&gt;            &lt;span class="n"&gt;left&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;dfs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;left&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;depth&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;right&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;dfs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;right&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;depth&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;depth&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="c1"&gt;# Node matches depth, extend the chain
&lt;/span&gt;                &lt;span class="n"&gt;current_chain&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;left&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;right&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;res&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;res&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;current_chain&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;current_chain&lt;/span&gt;

            &lt;span class="c1"&gt;# Chain breaks here
&lt;/span&gt;            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;

        &lt;span class="nf"&gt;dfs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;root&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;res&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  Key Takeaways
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;In tree problems, always identify if you need information from parents (Top-down) or children (Bottom-up). Here, we needed both (depth from parent, chain length from children).&lt;/li&gt;
&lt;li&gt;The "chain" logic is very similar to the "Longest Path" or "Diameter" problems on LeetCode.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Related Problems
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Binary Tree Maximum Path Sum&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Longest Univalue Path&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Diameter of Binary Tree&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Tags
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;#python&lt;/code&gt; &lt;code&gt;#algorithms&lt;/code&gt; &lt;code&gt;#trees&lt;/code&gt; &lt;code&gt;#dfs&lt;/code&gt; &lt;code&gt;#recursion&lt;/code&gt;&lt;/p&gt;

</description>
      <category>algorithms</category>
      <category>coding</category>
      <category>computerscience</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Scale from Zero to Millions of Users</title>
      <dc:creator>JosephAkayesi</dc:creator>
      <pubDate>Mon, 19 Jan 2026 09:23:43 +0000</pubDate>
      <link>https://dev.to/josephakayesi/scale-from-zero-to-millions-of-users-3d5m</link>
      <guid>https://dev.to/josephakayesi/scale-from-zero-to-millions-of-users-3d5m</guid>
      <description>&lt;h1&gt;
  
  
  Scale from Zero to Millions of Users
&lt;/h1&gt;

&lt;p&gt;&lt;em&gt;By Joseph Akayesi&lt;/em&gt;&lt;br&gt;&lt;br&gt;
&lt;em&gt;Originally published on Medium:&lt;/em&gt; &lt;a href="https://medium.com/@josephakayesi/scale-from-zero-to-millions-of-users-3e91daa771d9" rel="noopener noreferrer"&gt;https://medium.com/@josephakayesi/scale-from-zero-to-millions-of-users-3e91daa771d9&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Designing systems to support millions of users can be challenging. It requires refinement and continuous improvement. In this article, we’ll walk through how to design a system that scales from &lt;strong&gt;zero users to millions of users&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Single Server Setup
&lt;/h2&gt;

&lt;p&gt;At the beginning, a &lt;strong&gt;single-server setup&lt;/strong&gt; is enough to serve a small number of users. Every component required to service requests exists on this single server, including:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Web server
&lt;/li&gt;
&lt;li&gt;Database
&lt;/li&gt;
&lt;li&gt;Cache
&lt;/li&gt;
&lt;li&gt;Other supporting services
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Request Flow in a Single Server Setup
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;A user makes a request to your website using a &lt;strong&gt;URL&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;DNS resolution&lt;/strong&gt; occurs and translates the URL to an IP address.&lt;/li&gt;
&lt;li&gt;The user sends a request to the web server using that IP address.&lt;/li&gt;
&lt;li&gt;The web server processes the request and responds with an &lt;strong&gt;HTML page&lt;/strong&gt;.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This setup works well initially but quickly becomes insufficient as traffic grows.&lt;/p&gt;




&lt;h2&gt;
  
  
  Separating the Database Layer
&lt;/h2&gt;

&lt;p&gt;A single-server setup cannot handle increasing traffic efficiently. As the user base grows, the &lt;strong&gt;web application layer must be separated from the database layer&lt;/strong&gt; so both can scale independently.&lt;/p&gt;




&lt;h2&gt;
  
  
  Which Databases Should You Use?
&lt;/h2&gt;

&lt;p&gt;There are generally two main types of databases:&lt;/p&gt;

&lt;h3&gt;
  
  
  Relational Databases
&lt;/h3&gt;

&lt;p&gt;Relational databases maintain &lt;strong&gt;strict referential integrity&lt;/strong&gt;. They enforce relationships between tables and organize data into rows and columns.&lt;/p&gt;

&lt;p&gt;Popular relational databases include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;MySQL
&lt;/li&gt;
&lt;li&gt;PostgreSQL
&lt;/li&gt;
&lt;li&gt;OracleDB
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Non-Relational (NoSQL) Databases
&lt;/h3&gt;

&lt;p&gt;Non-relational databases provide &lt;strong&gt;looser referential integrity&lt;/strong&gt; and are designed to store unstructured or semi-structured data, often as documents or key-value pairs.&lt;/p&gt;

&lt;p&gt;Popular non-relational databases include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;MongoDB
&lt;/li&gt;
&lt;li&gt;DynamoDB
&lt;/li&gt;
&lt;li&gt;CouchDB
&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Vertical Scaling vs Horizontal Scaling
&lt;/h2&gt;

&lt;p&gt;As your user base grows, your system must handle increased traffic. There are two primary scaling strategies:&lt;/p&gt;

&lt;h3&gt;
  
  
  Vertical Scaling
&lt;/h3&gt;

&lt;p&gt;Vertical scaling increases the capacity of a single server by adding more resources such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;CPU
&lt;/li&gt;
&lt;li&gt;RAM
&lt;/li&gt;
&lt;li&gt;Storage
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Vertical scaling is simple and works well for &lt;strong&gt;low to moderate traffic&lt;/strong&gt;, but it has a hard upper limit.&lt;/p&gt;

&lt;h3&gt;
  
  
  Horizontal Scaling
&lt;/h3&gt;

&lt;p&gt;Horizontal scaling adds &lt;strong&gt;more servers&lt;/strong&gt; to form a cluster. This approach allows the system to scale beyond the limits of a single machine.&lt;/p&gt;




&lt;h2&gt;
  
  
  Load Balancer
&lt;/h2&gt;

&lt;p&gt;When many users access your system concurrently, servers can become overloaded. A &lt;strong&gt;load balancer&lt;/strong&gt; solves this problem.&lt;/p&gt;

&lt;p&gt;A load balancer sits between clients and servers and distributes incoming requests to the &lt;strong&gt;next available server&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Types of Load Balancers
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Application Load Balancer (ALB)&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Network Load Balancer (NLB)&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;With a load balancer, the web tier becomes &lt;strong&gt;highly available&lt;/strong&gt;, as requests can be routed across multiple servers.&lt;/p&gt;




&lt;h2&gt;
  
  
  What About the Data Tier?
&lt;/h2&gt;

&lt;p&gt;While the web tier may now be redundant, the data tier often still consists of a &lt;strong&gt;single database&lt;/strong&gt;, making it a single point of failure.&lt;/p&gt;

&lt;p&gt;This problem is addressed using &lt;strong&gt;database replication&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Database Replication
&lt;/h2&gt;

&lt;p&gt;Database replication distributes copies of data across multiple machines.&lt;/p&gt;

&lt;h3&gt;
  
  
  Advantages of Database Replication
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Better performance
&lt;/li&gt;
&lt;li&gt;High availability
&lt;/li&gt;
&lt;li&gt;Improved reliability
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Typical Request Flow with Replication
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;A user gets the load balancer IP from DNS.&lt;/li&gt;
&lt;li&gt;The user connects to the load balancer.&lt;/li&gt;
&lt;li&gt;The request is routed to one of the web servers.&lt;/li&gt;
&lt;li&gt;Read operations go to &lt;strong&gt;replica (slave) databases&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Write, update, and delete operations go to the &lt;strong&gt;master database&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;
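
&lt;p&gt;The read/write split above can be sketched as a tiny routing function (the hostnames here are hypothetical placeholders for real connection strings):&lt;/p&gt;

```python
import random

# Route writes to the master and spread reads across replicas.
# Hostnames are illustrative, not real infrastructure.
MASTER = "db-master:5432"
REPLICAS = ["db-replica-1:5432", "db-replica-2:5432"]

def pick_host(operation: str) -> str:
    """Writes (insert/update/delete) go to the master; reads go to a replica."""
    if operation in ("insert", "update", "delete"):
        return MASTER
    return random.choice(REPLICAS)

print(pick_host("insert"))
print(pick_host("select"))
```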




&lt;h2&gt;
  
  
  Cache
&lt;/h2&gt;

&lt;p&gt;A typical web application consists of a &lt;strong&gt;web tier&lt;/strong&gt; and a &lt;strong&gt;data tier&lt;/strong&gt;. To improve performance and reduce latency, we introduce a &lt;strong&gt;cache&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;A cache stores frequently accessed or expensive-to-compute data in memory for fast retrieval.&lt;/p&gt;

&lt;p&gt;Popular caching systems include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Redis
&lt;/li&gt;
&lt;li&gt;Memcached
&lt;/li&gt;
&lt;li&gt;Valkey
&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Cache Tier
&lt;/h2&gt;

&lt;p&gt;The cache tier sits between the web servers and the database.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Frequently accessed data is stored in the cache&lt;/li&gt;
&lt;li&gt;Subsequent requests are served directly from the cache&lt;/li&gt;
&lt;li&gt;Database load is significantly reduced&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Common Caching Strategies
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Read-through
&lt;/li&gt;
&lt;li&gt;Write-around
&lt;/li&gt;
&lt;li&gt;Write-back
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Caches typically store data as &lt;strong&gt;key-value pairs&lt;/strong&gt; and support expiration times.&lt;/p&gt;
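
&lt;p&gt;As a minimal sketch of the key-value-with-expiration idea, here is the common cache-aside pattern. The dict-based cache and &lt;code&gt;fake_db&lt;/code&gt; below are stand-ins for a real cache like Redis or Memcached and a real database:&lt;/p&gt;

```python
import time

# Cache-aside: check the cache first, fall back to the database on a
# miss, and store the result with an expiration time.
cache = {}  # key -> (value, expires_at)
fake_db = {"user:42": {"name": "Ada"}}  # stand-in for the data tier

TTL_SECONDS = 60

def get_user(key):
    entry = cache.get(key)
    if entry is not None and entry[1] > time.time():
        return entry[0]  # cache hit: no database round trip
    value = fake_db.get(key)  # cache miss: read the database
    cache[key] = (value, time.time() + TTL_SECONDS)
    return value

print(get_user("user:42"))  # first call hits the database
print(get_user("user:42"))  # second call is served from the cache
```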




&lt;h2&gt;
  
  
  Cache Considerations
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Use cache only for &lt;strong&gt;temporary data&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Maintain consistency between cache and database&lt;/li&gt;
&lt;li&gt;Use expiration policies to avoid stale data&lt;/li&gt;
&lt;li&gt;Avoid single points of failure by using cache clusters&lt;/li&gt;
&lt;li&gt;Choose appropriate eviction policies (LRU, LFU, MRU)&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Content Delivery Network (CDN)
&lt;/h2&gt;

&lt;p&gt;A &lt;strong&gt;Content Delivery Network (CDN)&lt;/strong&gt; is a globally distributed network of servers that serve &lt;strong&gt;static content&lt;/strong&gt;, such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;HTML files
&lt;/li&gt;
&lt;li&gt;Images and videos
&lt;/li&gt;
&lt;li&gt;JavaScript and CSS files
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;CDNs store copies of static content closer to users, reducing latency and improving performance.&lt;/p&gt;




&lt;h2&gt;
  
  
  CDN Considerations
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;CDNs cost money due to data replication&lt;/li&gt;
&lt;li&gt;Cache expiration must be carefully configured&lt;/li&gt;
&lt;li&gt;Always provide a fallback in case the CDN fails&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Stateless Web Tier
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Stateful Architecture
&lt;/h3&gt;

&lt;p&gt;In a stateful architecture, user session data is stored on the server. This forces users to connect to the &lt;strong&gt;same server&lt;/strong&gt; throughout their session, which limits scalability.&lt;/p&gt;

&lt;h3&gt;
  
  
  Stateless Architecture
&lt;/h3&gt;

&lt;p&gt;In a stateless architecture:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;User session data is stored externally (database or cache)&lt;/li&gt;
&lt;li&gt;Any server in the cluster can handle any request&lt;/li&gt;
&lt;li&gt;Scalability and fault tolerance improve significantly&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Data Centers
&lt;/h2&gt;

&lt;p&gt;Large-scale systems often operate across &lt;strong&gt;multiple data centers&lt;/strong&gt; to enable automatic failover and global availability.&lt;/p&gt;




&lt;h2&gt;
  
  
  Message Queues
&lt;/h2&gt;

&lt;p&gt;Message queues enable &lt;strong&gt;asynchronous processing&lt;/strong&gt; and further scalability.&lt;/p&gt;

&lt;p&gt;They support &lt;strong&gt;event-driven architectures&lt;/strong&gt;, where:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Producers publish events&lt;/li&gt;
&lt;li&gt;Consumers process events asynchronously&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This decouples services and improves reliability, especially for long-running or heavy operations.&lt;/p&gt;
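
&lt;p&gt;A minimal in-process sketch of that producer/consumer decoupling, using Python's standard library; a real deployment would use a broker such as RabbitMQ, Kafka, or SQS rather than an in-memory queue:&lt;/p&gt;

```python
import queue
import threading

tasks = queue.Queue()
processed = []

def worker():
    """Consumer: pull events off the queue and process them asynchronously."""
    while True:
        event = tasks.get()
        if event is None:        # sentinel: shut the consumer down
            break
        processed.append(event)  # stand-in for the heavy operation
        tasks.task_done()

consumer = threading.Thread(target=worker)
consumer.start()

# Producers publish events without waiting for them to be processed.
for event in ("resize-image", "send-email", "generate-report"):
    tasks.put(event)

tasks.put(None)
consumer.join()
print(processed)
```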




&lt;h2&gt;
  
  
  Logs, Metrics, and Automation
&lt;/h2&gt;

&lt;p&gt;As systems grow, observability becomes critical.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Logs&lt;/strong&gt; help debug issues&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Metrics&lt;/strong&gt; provide insight into performance and capacity&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Automation&lt;/strong&gt; ensures reliability and consistency&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These are essential in large-scale systems with many moving parts.&lt;/p&gt;




&lt;h2&gt;
  
  
  Database Scaling
&lt;/h2&gt;

&lt;p&gt;As data volume grows, the database layer must also scale.&lt;/p&gt;

&lt;h3&gt;
  
  
  Vertical Database Scaling
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Add more CPU, memory, and storage&lt;/li&gt;
&lt;li&gt;Simple but limited by hardware constraints&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Horizontal Database Scaling
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Add more database instances&lt;/li&gt;
&lt;li&gt;Distribute data across instances&lt;/li&gt;
&lt;li&gt;Scales well beyond the limits of any single machine&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Sharding
&lt;/h2&gt;

&lt;p&gt;Horizontal database scaling is commonly achieved through &lt;strong&gt;sharding&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Sharding breaks data into smaller partitions distributed across multiple database instances.&lt;/p&gt;

&lt;p&gt;Requests are routed to the correct shard using techniques such as &lt;strong&gt;consistent hashing&lt;/strong&gt;.&lt;/p&gt;
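
&lt;p&gt;As a toy sketch of that routing step, here is a bare-bones hash ring: each shard owns a position on the ring, and a key is routed to the first shard clockwise from its hash. The shard names and the MD5-based hash are illustrative choices; production rings also add virtual nodes for better balance:&lt;/p&gt;

```python
import hashlib
from bisect import bisect

def ring_hash(value: str) -> int:
    """Map a string to a position on the hash ring."""
    return int(hashlib.md5(value.encode()).hexdigest(), 16)

# Place each shard on the ring, sorted by its hash position.
shards = ["shard-a", "shard-b", "shard-c"]
ring = sorted((ring_hash(name), name) for name in shards)

def route(key: str) -> str:
    """Walk clockwise from the key's hash to the first shard on the ring."""
    position = bisect(ring, (ring_hash(key),))
    if position == len(ring):  # wrap around past the last shard
        position = 0
    return ring[position][1]

print(route("user:42"))
```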




&lt;h2&gt;
  
  
  Drawbacks of Sharding
&lt;/h2&gt;

&lt;p&gt;While powerful, sharding introduces complexity:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Resharding data is difficult&lt;/li&gt;
&lt;li&gt;Celebrity (hot key) problem&lt;/li&gt;
&lt;li&gt;Joins and normalization become harder&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Careful planning is required before adopting sharding.&lt;/p&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Scaling from zero to millions of users is an &lt;strong&gt;incremental journey&lt;/strong&gt;. By evolving your architecture step by step and applying the right techniques at the right time, you can build systems that are scalable, resilient, and performant.&lt;/p&gt;

</description>
      <category>systemdesign</category>
    </item>
  </channel>
</rss>
