<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Poojan</title>
    <description>The latest articles on DEV Community by Poojan (@poojang).</description>
    <link>https://dev.to/poojang</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3986930%2Fcc62937c-0f36-4ae3-bc93-44c620e36da9.png</url>
      <title>DEV Community: Poojan</title>
      <link>https://dev.to/poojang</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/poojang"/>
    <language>en</language>
    <item>
      <title>Scaling PostgreSQL in Production: Indexing, RPC Functions, Views, and Query Optimization</title>
      <dc:creator>Poojan</dc:creator>
      <pubDate>Fri, 05 Jun 2026 00:00:00 +0000</pubDate>
      <link>https://dev.to/poojang/scaling-postgresql-in-production-indexing-rpc-functions-views-and-query-optimization-3j05</link>
      <guid>https://dev.to/poojang/scaling-postgresql-in-production-indexing-rpc-functions-views-and-query-optimization-3j05</guid>
      <description>&lt;p&gt;As applications grow, database performance becomes one of the biggest factors affecting user experience. Features can be implemented quickly, but poorly optimized queries, inefficient data access patterns, and increasing data volume eventually create bottlenecks.&lt;/p&gt;

&lt;p&gt;Recently, I worked on optimizing a multi-tenant PostgreSQL system built on Supabase. The system consisted of:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;99 database tables&lt;/li&gt;
&lt;li&gt;Nearly 1 million records in the largest table&lt;/li&gt;
&lt;li&gt;More than 120,000 database requests per day&lt;/li&gt;
&lt;li&gt;Complex reporting and dashboard requirements&lt;/li&gt;
&lt;li&gt;Transactional workloads with strict consistency requirements&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;While the application continued to function correctly, several reporting and aggregation queries had started taking multiple seconds to execute. As the dataset grew, it became clear that improvements were needed to keep the system responsive and maintainable.&lt;/p&gt;

&lt;p&gt;This article covers the techniques that produced the biggest impact.&lt;/p&gt;

&lt;h2&gt;
  
  
  Understanding the Bottlenecks
&lt;/h2&gt;

&lt;p&gt;The application served multiple tenants from a shared database. Most operations were tenant-scoped and involved data spread across several related tables.&lt;/p&gt;

&lt;p&gt;The primary performance challenges were:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Slow reporting queries&lt;/li&gt;
&lt;li&gt;Repeated aggregation logic&lt;/li&gt;
&lt;li&gt;Increasing dashboard generation time&lt;/li&gt;
&lt;li&gt;Large result sets being fetched unnecessarily&lt;/li&gt;
&lt;li&gt;Excessive application-side processing&lt;/li&gt;
&lt;li&gt;Growing query complexity&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Rather than attempting a complete redesign, the focus was on improving the existing architecture through targeted database optimizations.&lt;/p&gt;




&lt;h2&gt;
  
  
  Composite Indexes Instead of More Indexes
&lt;/h2&gt;

&lt;p&gt;One of the first discoveries was that several frequently executed queries were filtering on multiple columns simultaneously.&lt;/p&gt;

&lt;p&gt;A common mistake is creating separate indexes for every column and expecting PostgreSQL to combine them efficiently via index merging. In practice, composite indexes often provide significantly better results when they match actual query patterns because they allow Postgres to perform a single index scan to satisfy multiple filters.&lt;/p&gt;

&lt;p&gt;For example, instead of indexing columns independently:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- ❌ Suboptimal: Individual index on each column&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_events_tenant_id&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;events&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tenant_id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_events_user_id&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;events&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_events_event_date&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;events&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event_date&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;we created a single composite index tailored to our access patterns:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- ✅ Optimal: Single composite index matching query filters&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_events_tenant_user_date&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;events&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tenant_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;event_date&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Similarly, transaction-related queries benefited from composite indexes structured like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_transactions_composite&lt;/span&gt; 
&lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;transactions&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tenant_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;transaction_type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;status_id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;and scheduling-related operations used patterns such as:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_schedules_composite&lt;/span&gt; 
&lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;schedules&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tenant_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;category_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;week_number&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;day_of_week&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The most important lesson was that &lt;strong&gt;index order matters&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The leading columns in the index should closely match how the query filters data. Since most queries were tenant-scoped, the tenant identifier typically appeared first. If a query filters on &lt;code&gt;tenant_id&lt;/code&gt; and &lt;code&gt;user_id&lt;/code&gt; but not &lt;code&gt;event_date&lt;/code&gt;, the composite index is still utilized. However, a query filtering only on &lt;code&gt;event_date&lt;/code&gt; would not benefit from this index.&lt;/p&gt;

&lt;p&gt;After reviewing query patterns and introducing composite indexes where appropriate, reporting workloads showed noticeable improvements and several slow queries were eliminated entirely.&lt;/p&gt;




&lt;h2&gt;
  
  
  Moving Business Logic into Supabase RPC Functions
&lt;/h2&gt;

&lt;p&gt;Initially, many operations required multiple database round trips.&lt;/p&gt;

&lt;p&gt;The application would:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Fetch data&lt;/li&gt;
&lt;li&gt;Perform calculations&lt;/li&gt;
&lt;li&gt;Execute additional queries&lt;/li&gt;
&lt;li&gt;Aggregate results&lt;/li&gt;
&lt;li&gt;Return the final response&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This approach worked well during the early stages of development but became increasingly expensive as datasets grew and the latency of consecutive queries compounded.&lt;/p&gt;

&lt;p&gt;To reduce overhead, complex operations were moved into Supabase RPC functions (PostgreSQL PL/pgSQL database functions).&lt;/p&gt;

&lt;p&gt;Instead of multiple client-server interactions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Application
 ├─ Query A (Fetch tenant orders)
 ├─ Query B (Fetch menu metrics)
 ├─ Query C (Insert audit log)
 └─ Processing (Aggregate in Node.js)

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;the flow became:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Application
 └─ RPC Function (get_tenant_dashboard_metrics)
      ├─ Validation
      ├─ Aggregation
      ├─ Calculations
      └─ Result

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here is a simplified example of how we defined the database function in Postgres:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;OR&lt;/span&gt; &lt;span class="k"&gt;REPLACE&lt;/span&gt; &lt;span class="k"&gt;FUNCTION&lt;/span&gt; &lt;span class="n"&gt;get_tenant_summary&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p_tenant_id&lt;/span&gt; &lt;span class="n"&gt;UUID&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;RETURNS&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;total_revenue&lt;/span&gt; &lt;span class="nb"&gt;NUMERIC&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;completed_orders&lt;/span&gt; &lt;span class="nb"&gt;BIGINT&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;LANGUAGE&lt;/span&gt; &lt;span class="n"&gt;plpgsql&lt;/span&gt; &lt;span class="k"&gt;SECURITY&lt;/span&gt; &lt;span class="k"&gt;DEFINER&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="err"&gt;$$&lt;/span&gt;
&lt;span class="k"&gt;BEGIN&lt;/span&gt;
  &lt;span class="k"&gt;RETURN&lt;/span&gt; &lt;span class="n"&gt;QUERY&lt;/span&gt;
  &lt;span class="k"&gt;SELECT&lt;/span&gt; 
    &lt;span class="n"&gt;COALESCE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;total_revenue&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;FILTER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'completed'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;completed_orders&lt;/span&gt;
  &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;transactions&lt;/span&gt;
  &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;tenant_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;p_tenant_id&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;END&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="err"&gt;$$&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We then call this function in a single, fast client-side request:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;error&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;supabase&lt;/span&gt;
  &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;rpc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;get_tenant_summary&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;p_tenant_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;tenantId&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This reduced network round trips, centralized business logic, and improved execution speed. The biggest performance gains were observed in reporting-related workloads where response times dropped from several seconds to millisecond-level execution.&lt;/p&gt;




&lt;h2&gt;
  
  
  Simplifying Data Access with Database Views
&lt;/h2&gt;

&lt;p&gt;As the application evolved, certain joins and aggregations began appearing repeatedly throughout the codebase. The same datasets were being reconstructed in multiple places.&lt;/p&gt;

&lt;p&gt;To solve this, database views were introduced.&lt;/p&gt;

&lt;p&gt;For example, to build a consolidated daily activity summary, we defined a Postgres View:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;OR&lt;/span&gt; &lt;span class="k"&gt;REPLACE&lt;/span&gt; &lt;span class="k"&gt;VIEW&lt;/span&gt; &lt;span class="n"&gt;daily_activity_summary&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; 
  &lt;span class="n"&gt;tenant_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;DATE_TRUNC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'day'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;activity_date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;total_events&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;DISTINCT&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;active_users&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;events&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;tenant_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;DATE_TRUNC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'day'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Instead of repeatedly joining numerous tables across different services, the application can now query the view directly, using client-side filters:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;error&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;supabase&lt;/span&gt;
  &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;from&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;daily_activity_summary&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;select&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;activity_date, total_events, active_users&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;eq&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;tenant_id&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;tenantId&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;order&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;activity_date&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;ascending&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Benefits included:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Reduced query duplication&lt;/li&gt;
&lt;li&gt;Cleaner application code&lt;/li&gt;
&lt;li&gt;Easier reporting development&lt;/li&gt;
&lt;li&gt;Improved maintainability&lt;/li&gt;
&lt;li&gt;Faster onboarding for developers&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Views effectively became reusable building blocks for reporting and dashboard generation.&lt;/p&gt;




&lt;h2&gt;
  
  
  Eliminating Unnecessary Data Retrieval
&lt;/h2&gt;

&lt;p&gt;During optimization, several endpoints were found to be retrieving significantly more data than required.&lt;/p&gt;

&lt;h3&gt;
  
  
  Avoiding &lt;code&gt;SELECT *&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;Instead of:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;transactions&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;queries were modified to fetch only the required columns:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;created_at&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;transactions&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;While simple, this optimization reduced network transfer and improved query efficiency across multiple endpoints.&lt;/p&gt;




&lt;h3&gt;
  
  
  Introducing Pagination
&lt;/h3&gt;

&lt;p&gt;Some endpoints attempted to return entire datasets at once. As records increased, response sizes grew unnecessarily.&lt;/p&gt;

&lt;p&gt;Pagination was introduced across reporting and administrative interfaces to ensure only the required records were fetched.&lt;/p&gt;

&lt;p&gt;Instead of returning thousands of rows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;transactions&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;queries were limited appropriately:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;transactions&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt; &lt;span class="k"&gt;OFFSET&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This significantly reduced response sizes and improved perceived application performance.&lt;/p&gt;




&lt;h3&gt;
  
  
  Removing Unnecessary Sorting
&lt;/h3&gt;

&lt;p&gt;Several queries performed sorting operations (&lt;code&gt;ORDER BY&lt;/code&gt;) that were not actually required by the business logic or layout. Removing unnecessary sorting clauses reduced CPU execution time and lowered database workload, especially on larger datasets.&lt;/p&gt;




&lt;h2&gt;
  
  
  Optimizing Aggregations
&lt;/h2&gt;

&lt;p&gt;Reporting systems often become slow because aggregation logic is repeatedly recalculated.&lt;/p&gt;

&lt;p&gt;By simplifying query logic, moving heavy aggregations to periodic background rollups (or materialized views), and querying via optimized RPC functions, we reduced database complexity and significantly improved dashboard loading speeds.&lt;/p&gt;




&lt;h2&gt;
  
  
  Adopting UUIDv7 for New Tables
&lt;/h2&gt;

&lt;p&gt;As the system evolved, UUIDv7 was adopted for newly created tables.&lt;/p&gt;

&lt;p&gt;Traditional UUIDv4 values are completely random. While they provide excellent uniqueness guarantees, they introduce index fragmentation in B-Tree structures. When a database inserts random values, it forces the B-Tree index to split pages in random locations to fit the keys, lowering write speeds and caching efficiency.&lt;/p&gt;

&lt;p&gt;UUIDv7 solves this by introducing millisecond-level time ordering while maintaining globally unique identifiers.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Conceptual structure of UUIDv7:&lt;/span&gt;
&lt;span class="c1"&gt;-- [48-bit Timestamp] [4-bit Version] [12-bit Rand/Seq] [2-bit Variant] [62-bit Random]&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Benefits include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Better insertion locality:&lt;/strong&gt; Since keys are sequential, Postgres inserts new rows at the end of the B-Tree index structure, minimizing page splits.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;More predictable index growth:&lt;/strong&gt; Better utilization of memory-mapped buffer cache.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Improved long-term scalability:&lt;/strong&gt; Higher write throughput on index pages.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Retained uniqueness guarantees:&lt;/strong&gt; No risk of collisions across tenants.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;To implement this on new tables, we define the column default using a generator:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;new_tenant_log&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="n"&gt;UUID&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="n"&gt;uuid_generate_v7&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
  &lt;span class="n"&gt;tenant_id&lt;/span&gt; &lt;span class="n"&gt;UUID&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;log_message&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="n"&gt;TIMESTAMPTZ&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The migration strategy was intentionally conservative: existing tables remained unchanged while newly created tables adopted UUIDv7 moving forward. Since no formal benchmarking was performed on the UUID transition, no specific performance improvements are claimed. However, UUIDv7 aligns well with modern PostgreSQL best practices for large-scale systems.&lt;/p&gt;




&lt;h2&gt;
  
  
  Key Lessons Learned
&lt;/h2&gt;

&lt;p&gt;Several lessons became clear throughout the optimization process.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Indexes Are Not a Magic Solution
&lt;/h3&gt;

&lt;p&gt;Adding indexes blindly often increases write overhead while providing little benefit. Indexes should be designed around real query patterns.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Composite Indexes Usually Matter More Than Individual Indexes
&lt;/h3&gt;

&lt;p&gt;Understanding how queries filter data is more important than increasing the total number of indexes.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Database-Side Processing Can Be Extremely Effective
&lt;/h3&gt;

&lt;p&gt;For aggregation-heavy workloads, RPC functions reduced round trips and simplified application logic.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Views Improve Maintainability
&lt;/h3&gt;

&lt;p&gt;Complex reporting logic becomes significantly easier to manage when reusable views are introduced.&lt;/p&gt;

&lt;h3&gt;
  
  
  5. Pagination Should Be Introduced Early
&lt;/h3&gt;

&lt;p&gt;Large datasets eventually become expensive. Pagination prevents future performance problems.&lt;/p&gt;

&lt;h3&gt;
  
  
  6. Small Optimizations Compound
&lt;/h3&gt;

&lt;p&gt;Removing unnecessary sorting, reducing selected columns, optimizing aggregations, and improving indexes may seem minor individually. Together, they can transform the performance characteristics of a system.&lt;/p&gt;




&lt;h2&gt;
  
  
  Final Thoughts
&lt;/h2&gt;

&lt;p&gt;Database optimization is rarely about finding a single breakthrough improvement. In this case, the largest gains came from systematically addressing multiple bottlenecks:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Composite indexing&lt;/li&gt;
&lt;li&gt;RPC-based processing&lt;/li&gt;
&lt;li&gt;Database views&lt;/li&gt;
&lt;li&gt;Query optimization&lt;/li&gt;
&lt;li&gt;Aggregation improvements&lt;/li&gt;
&lt;li&gt;Pagination strategies&lt;/li&gt;
&lt;li&gt;UUIDv7 adoption&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The result was a more maintainable system, faster reporting workflows, and significantly improved responsiveness under production workloads. Most importantly, these improvements were achieved without redesigning the entire architecture. Instead, they came from understanding how the database was being used and optimizing the paths that mattered most.&lt;/p&gt;

</description>
      <category>database</category>
      <category>performance</category>
      <category>postgres</category>
      <category>sql</category>
    </item>
    <item>
      <title>Building an Atomic Delivery Dispatcher with PostgreSQL Row-Level Locking</title>
      <dc:creator>Poojan</dc:creator>
      <pubDate>Tue, 02 Jun 2026 00:00:00 +0000</pubDate>
      <link>https://dev.to/poojang/building-an-atomic-delivery-dispatcher-with-postgresql-row-level-locking-3ncf</link>
      <guid>https://dev.to/poojang/building-an-atomic-delivery-dispatcher-with-postgresql-row-level-locking-3ncf</guid>
      <description>&lt;p&gt;Building a logistics and last-mile delivery dispatcher involves more than calculating routes or sending SMS updates. In the real world, you are managing scarce physical resources (riders) operating under strict constraints (weight/capacity limits) in real-time.&lt;/p&gt;

&lt;p&gt;When multiple orders enter the system at peak hours, a logistics backend must guarantee that:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Riders are never assigned orders exceeding their physical carrying capacity.&lt;/li&gt;
&lt;li&gt;A rider is never double-booked onto multiple concurrent delivery batches.&lt;/li&gt;
&lt;li&gt;Orders move through a strict, deterministic state progression.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Below is an engineering walkthrough on how we built a concurrency-safe delivery dispatch engine in TypeScript using PostgreSQL row-level locking.&lt;/p&gt;




&lt;h2&gt;
  
  
  1. Concurrency Control: Locking with &lt;code&gt;SELECT ... FOR UPDATE&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;When assigning a batch of orders to a rider, the system must check the rider's active load to ensure the additions do not exceed their maximum capacity limit.&lt;/p&gt;

&lt;p&gt;In a standard Node.js environment, querying the database to calculate current load, checking the limits in application memory, and then issuing an &lt;code&gt;UPDATE&lt;/code&gt; leaves a gap for race conditions. If two different order managers attempt to assign batches to the same rider simultaneously, both operations might read the rider's load as "available" before either write completes, resulting in over-allocation.&lt;/p&gt;

&lt;p&gt;To solve this, we use PostgreSQL &lt;strong&gt;row-level locking&lt;/strong&gt; (&lt;code&gt;SELECT ... FOR UPDATE&lt;/code&gt;) inside an explicit transaction block.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;PoolClient&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;pg&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kr"&gt;interface&lt;/span&gt; &lt;span class="nx"&gt;RiderAssignmentResult&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;success&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;boolean&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;message&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;assignRiderToBatch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;PoolClient&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;batchId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;riderId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;
&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;RiderAssignmentResult&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Start transaction boundary&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;BEGIN&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="c1"&gt;// 1. Lock the rider row for modification&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;riderRes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="s2"&gt;`SELECT rider_id, availability_status, max_capacity, type 
       FROM delivery_riders 
       WHERE rider_id = $1 
       FOR UPDATE`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;riderId&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;riderRes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;rowCount&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;RIDER_NOT_FOUND&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;rider&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;riderRes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;rows&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;

    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;rider&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;availability_status&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;AVAILABLE&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;RIDER_UNAVAILABLE&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="c1"&gt;// 2. Fetch order count in the proposed batch&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;batchRes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="s2"&gt;`SELECT count(*) as order_count, status FROM delivery_batches WHERE batch_id = $1`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;batchId&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;batch&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;batchRes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;rows&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;incomingOrderCount&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;parseInt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;batch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;order_count&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;batch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;CREATED&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;INVALID_BATCH_STATUS&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="c1"&gt;// 3. Calculate rider's current active order load&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;activeLoadRes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="s2"&gt;`SELECT COALESCE(SUM(order_count), 0) as current_load 
       FROM delivery_batches 
       WHERE assigned_rider_id = $1 AND status IN ('ASSIGNED', 'PICKED', 'OUT_FOR_DELIVERY')`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;riderId&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;currentLoad&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;parseInt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;activeLoadRes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;rows&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nx"&gt;current_load&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="c1"&gt;// 4. Validate capacity constraints&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;currentLoad&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;incomingOrderCount&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;rider&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;max_capacity&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;CAPACITY_EXCEEDED&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="c1"&gt;// 5. Update batch and assign rider&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="s2"&gt;`UPDATE delivery_batches 
       SET assigned_rider_id = $1, status = 'ASSIGNED', assigned_at = NOW() 
       WHERE batch_id = $2`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;riderId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;batchId&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="c1"&gt;// 6. Log assignment audit log&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="s2"&gt;`INSERT INTO rider_assignment_history (batch_id, new_rider_id, reason) 
       VALUES ($1, $2, $3)`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;batchId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;riderId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;System Auto-Dispatch&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;COMMIT&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;success&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;message&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Rider assigned successfully&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;

  &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="na"&gt;error&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;any&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ROLLBACK&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;success&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;message&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Transaction failed&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  The Locking Mechanics
&lt;/h3&gt;

&lt;p&gt;When the transaction runs &lt;code&gt;SELECT ... FOR UPDATE&lt;/code&gt; on the target rider, Postgres places an exclusive write lock on that specific row. Any other concurrent transaction attempting to read or update that same rider row is forced to queue until the first transaction calls &lt;code&gt;COMMIT&lt;/code&gt; or &lt;code&gt;ROLLBACK&lt;/code&gt;. This guarantees that capacity limits are computed against deterministic, sequential state.&lt;/p&gt;




&lt;h2&gt;
  
  
  2. Enforcing State Transitions via State Machines
&lt;/h2&gt;

&lt;p&gt;A delivery batch must follow a strict path to prevent "orphaned" orders or invalid logic (like marked delivered before they are picked up).&lt;/p&gt;

&lt;p&gt;The lifecycle follows this state map:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    [CREATED]
        │
        ▼ (Rider Assigned)
   [ASSIGNED]
        │
        ▼ (Rider Picked Up Order)
    [PICKED]
        │
        ▼ (Transit Started)
[OUT_FOR_DELIVERY]
   ┌────┴────────────────────────┐
   ▼ (Successful Delivery) ▼ (Delivery Failed)
[COMPLETED] [FAILED]

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To enforce this, we build a transition map in TypeScript that acts as our guardrail:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="nx"&gt;BatchStatus&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;CREATED&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ASSIGNED&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;PICKED&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;OUT_FOR_DELIVERY&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;COMPLETED&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;FAILED&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;CANCELLED&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;VALID_TRANSITIONS&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Record&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;BatchStatus&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;BatchStatus&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;CREATED&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ASSIGNED&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;CANCELLED&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="na"&gt;ASSIGNED&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;PICKED&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;CREATED&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;CANCELLED&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="c1"&gt;// Can reassign back to CREATED&lt;/span&gt;
  &lt;span class="na"&gt;PICKED&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;OUT_FOR_DELIVERY&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;FAILED&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="na"&gt;OUT_FOR_DELIVERY&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;COMPLETED&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;FAILED&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="na"&gt;COMPLETED&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[],&lt;/span&gt; &lt;span class="c1"&gt;// Terminal State&lt;/span&gt;
  &lt;span class="na"&gt;FAILED&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[],&lt;/span&gt; &lt;span class="c1"&gt;// Terminal State&lt;/span&gt;
  &lt;span class="na"&gt;CANCELLED&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="c1"&gt;// Terminal State&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;validateStateTransition&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;current&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;BatchStatus&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;next&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;BatchStatus&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nx"&gt;boolean&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;allowed&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;VALID_TRANSITIONS&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;current&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;allowed&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="nx"&gt;allowed&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;next&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;By verifying transitions before executing any database changes, the engine blocks invalid states at the API boundaries. For example, if a client tries to patch the status of a &lt;code&gt;CREATED&lt;/code&gt; batch to &lt;code&gt;OUT_FOR_DELIVERY&lt;/code&gt;, the application throws a 400 Bad Request error immediately.&lt;/p&gt;




&lt;h2&gt;
  
  
  3. Storing Rich Tracking Data with JSONB Proof of Delivery
&lt;/h2&gt;

&lt;p&gt;When a driver completes a delivery, the app captures rich proof of delivery (PoD): photo URLs (uploaded to S3), coordinate telemetry, customer signatures, notes, and timestamps.&lt;/p&gt;

&lt;p&gt;Instead of writing a rigid relational schema with child tables for tracking metadata, we opted for a Postgres &lt;code&gt;JSONB&lt;/code&gt; column on our batch order mapping table.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;delivery_batch_orders&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;batch_order_id&lt;/span&gt; &lt;span class="n"&gt;UUID&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="n"&gt;gen_random_uuid&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="n"&gt;batch_id&lt;/span&gt; &lt;span class="n"&gt;UUID&lt;/span&gt; &lt;span class="k"&gt;REFERENCES&lt;/span&gt; &lt;span class="n"&gt;delivery_batches&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;batch_id&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;order_id&lt;/span&gt; &lt;span class="n"&gt;UUID&lt;/span&gt; &lt;span class="k"&gt;REFERENCES&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;order_id&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;sequence&lt;/span&gt; &lt;span class="nb"&gt;INTEGER&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="s1"&gt;'PENDING'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;

    &lt;span class="c1"&gt;-- JSONB document holding unstructured proof of delivery&lt;/span&gt;
    &lt;span class="n"&gt;proof_of_delivery&lt;/span&gt; &lt;span class="n"&gt;JSONB&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;

    &lt;span class="n"&gt;completed_at&lt;/span&gt; &lt;span class="nb"&gt;TIMESTAMP&lt;/span&gt; &lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="nb"&gt;TIME&lt;/span&gt; &lt;span class="k"&gt;ZONE&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Why JSONB instead of structured relational columns?
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Dynamic Metadata:&lt;/strong&gt; Different delivery types require different proofs. A corporate drop-off might require a text note ("Left with receptionist"), while a residential drop-off requires an S3 photo link and GPS bounds. JSONB handles this schema variance without migrations.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reduced Join Overhead:&lt;/strong&gt; When fetching delivery details for the administration portal, the application reads a single row without executing expensive relational joins to retrieve metadata.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Indexing Support:&lt;/strong&gt; Postgres allows indexing JSONB fields using GIN (Generalized Inverted Index) keys. If we need to search for deliveries verified within specific coordinates, we can index the JSONB path directly:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_pod_coordinates&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;delivery_batch_orders&lt;/span&gt; &lt;span class="k"&gt;USING&lt;/span&gt; &lt;span class="n"&gt;GIN&lt;/span&gt; &lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;proof_of_delivery&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="s1"&gt;'coordinates'&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  4. Real-Time Telemetry Updates
&lt;/h2&gt;

&lt;p&gt;Delivery is highly interactive. Customers, outlet merchants, and platform dispatchers need to see location changes and status updates instantly.&lt;/p&gt;

&lt;p&gt;To achieve this, we link our database transactions directly with a real-time event broker (Socket.IO):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Server&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;socket.io&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kr"&gt;interface&lt;/span&gt; &lt;span class="nx"&gt;DeliveryEventPayload&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;batchId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;BatchStatus&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;outletId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;riderId&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;updatedAt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;DeliveryNotificationService&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="nx"&gt;io&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Server&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="nf"&gt;constructor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;io&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Server&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;io&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;io&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;broadcastBatchStatus&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;DeliveryEventPayload&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="k"&gt;void&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// 1. Notify the merchant outlet dashboard&lt;/span&gt;
    &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;io&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;to&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`outlet:&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;outletId&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;emit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;delivery_batch_updated&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="c1"&gt;// 2. Notify the specific rider's device&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;riderId&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;io&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;to&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`rider:&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;riderId&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;emit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;my_batch_updated&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="c1"&gt;// 3. Notify the customer room tracking this order&lt;/span&gt;
    &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;io&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;to&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`batch:&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;batchId&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;emit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;tracking_updated&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;timestamp&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;updatedAt&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This pub/sub routing model ensures that the instant database row updates commit successfully, the web socket layer broadcasts the changes to the respective clients in sub-100ms.&lt;/p&gt;




&lt;h2&gt;
  
  
  Summary of Dispatcher Core Patterns
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Serialize Resource Allocation:&lt;/strong&gt; Use row locks (&lt;code&gt;FOR UPDATE&lt;/code&gt;) inside isolated database transactions to prevent capacity overloading.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Defensive State Logic:&lt;/strong&gt; Define a clear, immutable transitions map to enforce linear lifecycle paths on orders.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Leverage Document Fields for Audits:&lt;/strong&gt; Use &lt;code&gt;JSONB&lt;/code&gt; for storing flexible metadata profiles, keeping reads fast and schema migrations minimal.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Link DB Events to Websockets:&lt;/strong&gt; Keep clients synchronized by coupling transactional success hooks directly to your real-time notification brokers.&lt;/li&gt;
&lt;/ol&gt;

</description>
      <category>backend</category>
      <category>database</category>
      <category>postgres</category>
      <category>sql</category>
    </item>
    <item>
      <title>Designing a Multi-Tenant Subsidy Engine with Concurrency-Safe Budget Tracking</title>
      <dc:creator>Poojan</dc:creator>
      <pubDate>Tue, 02 Jun 2026 00:00:00 +0000</pubDate>
      <link>https://dev.to/poojang/designing-a-multi-tenant-subsidy-engine-with-concurrency-safe-budget-tracking-p37</link>
      <guid>https://dev.to/poojang/designing-a-multi-tenant-subsidy-engine-with-concurrency-safe-budget-tracking-p37</guid>
      <description>&lt;p&gt;Expanding a single-tenant business application to support B2B corporate models brings massive complexity to your data access layer. You are no longer just handling independent user checkouts; you are now enforcing complex corporate compliance rules, isolating merchant visibility, managing shared program budgets, and tracking subsidy ledgers.&lt;/p&gt;

&lt;p&gt;As a backend engineer, I recently architected a B2B corporate allowance engine. Below is a deep dive into the core engineering decisions, database design, conflict resolution algorithms, and concurrency patterns required to build a high-integrity multi-tenant corporate wallet.&lt;/p&gt;




&lt;h2&gt;
  
  
  1. Multi-Tenant Relational Isolation
&lt;/h2&gt;

&lt;p&gt;In a multi-tenant B2B platform, security starts at the database schema. Corporate organizations contract with the platform to allow their employees to buy meals from a selected pool of merchants.&lt;/p&gt;

&lt;p&gt;To enforce this, we designed a two-tiered outlet mapping structure:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Platform Level:&lt;/strong&gt; The platform admin assigns a master pool of merchants to the organization.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Office Level:&lt;/strong&gt; The corporate admin maps specific merchants from that pool to individual physical offices.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This is enforced at the database level using composite foreign keys:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- 1. Master Pool assigned to the organization&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;corporate_org_outlets&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;organization_id&lt;/span&gt; &lt;span class="n"&gt;UUID&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;outlet_id&lt;/span&gt; &lt;span class="n"&gt;UUID&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;assigned_at&lt;/span&gt; &lt;span class="nb"&gt;TIMESTAMP&lt;/span&gt; &lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="nb"&gt;TIME&lt;/span&gt; &lt;span class="k"&gt;ZONE&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="k"&gt;CURRENT_TIMESTAMP&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;organization_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;outlet_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- 2. Office level assignment mapping to specific offices&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;corporate_office_outlets&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;office_outlet_id&lt;/span&gt; &lt;span class="n"&gt;UUID&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="n"&gt;gen_random_uuid&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="n"&gt;office_id&lt;/span&gt; &lt;span class="n"&gt;UUID&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;organization_id&lt;/span&gt; &lt;span class="n"&gt;UUID&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;outlet_id&lt;/span&gt; &lt;span class="n"&gt;UUID&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;is_visible_in_app&lt;/span&gt; &lt;span class="nb"&gt;BOOLEAN&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="k"&gt;TRUE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;

    &lt;span class="c1"&gt;-- Enforce that the office outlet MUST exist in the organization's master pool&lt;/span&gt;
    &lt;span class="k"&gt;FOREIGN&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;organization_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;outlet_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 
        &lt;span class="k"&gt;REFERENCES&lt;/span&gt; &lt;span class="n"&gt;corporate_org_outlets&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;organization_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;outlet_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="k"&gt;DELETE&lt;/span&gt; &lt;span class="k"&gt;CASCADE&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  The Architectural Benefit
&lt;/h3&gt;

&lt;p&gt;By referencing a composite key &lt;code&gt;(organization_id, outlet_id)&lt;/code&gt; rather than just a single &lt;code&gt;outlet_id&lt;/code&gt;, PostgreSQL prevents a corporate admin from accidentally (or maliciously) displaying an unauthorized merchant to their employees. Data integrity is guaranteed at the database engine level, minimizing reliance on application-layer checks.&lt;/p&gt;




&lt;h2&gt;
  
  
  2. Resolving Subsidy Conflicts: The "Maximum Rule"
&lt;/h2&gt;

&lt;p&gt;When an employee checks out, multiple subsidy programs might apply to their order. For example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A &lt;strong&gt;Corporate Allowance&lt;/strong&gt; (e.g., 50% subsidy up to $10, eligible for the entire office).&lt;/li&gt;
&lt;li&gt;A &lt;strong&gt;Merchant-Specific Subsidy&lt;/strong&gt; (e.g., a flat $5 discount sponsored for a specific team).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Allowing these programs to stack cumulatively (e.g., applying both) leads to extreme financial leakage where orders can become free or negative-cost.&lt;/p&gt;

&lt;p&gt;To solve this, we implemented the &lt;strong&gt;Maximum Rule&lt;/strong&gt; : evaluate all eligible programs, calculate the absolute discount for each independently, and apply &lt;em&gt;only&lt;/em&gt; the program that grants the user the highest discount.&lt;/p&gt;

&lt;p&gt;Here is a simplified version of the logic executing in the order validation lifecycle:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kr"&gt;interface&lt;/span&gt; &lt;span class="nx"&gt;SubsidyProgram&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;programId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;PERCENTAGE&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;FIXED&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;PERCENTAGE_CAPPED&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;value&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;maxCap&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;calculateDiscount&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;orderAmount&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;program&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;SubsidyProgram&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;switch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;program&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="kd"&gt;type&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;PERCENTAGE&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
      &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;orderAmount&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;program&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;value&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;PERCENTAGE_CAPPED&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;percentageDiscount&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;orderAmount&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;program&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;value&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;percentageDiscount&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;program&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;maxCap&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;FIXED&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
      &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;program&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;orderAmount&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="nl"&gt;default&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
      &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;getBestEligibleSubsidy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="nx"&gt;orderAmount&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
  &lt;span class="nx"&gt;eligiblePrograms&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;SubsidyProgram&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nl"&gt;programId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nl"&gt;discountAmount&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;eligiblePrograms&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;eligiblePrograms&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;reduce&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;best&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;current&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;currentDiscount&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;calculateDiscount&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;orderAmount&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;current&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;best&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;currentDiscount&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;best&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;discountAmount&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;programId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;current&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;programId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;discountAmount&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;currentDiscount&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;best&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;programId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nl"&gt;discountAmount&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Why Reduce?
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;Array.prototype.reduce&lt;/code&gt; pattern executes in linear &lt;code&gt;O(N)&lt;/code&gt; time complexity. Because the list of eligible programs for a single employee is typically small (under 10), this keeps calculation overhead virtually non-existent while guaranteeing deterministic, leak-proof checkouts.&lt;/p&gt;




&lt;h2&gt;
  
  
  3. Concurrency-Safe Budget Exhaustion
&lt;/h2&gt;

&lt;p&gt;In corporate subsidy programs, organizations set strict limits (e.g., a total monthly budget of $5,000). Once the budget is exhausted, the program must immediately disable itself, and employees must pay full price.&lt;/p&gt;

&lt;p&gt;In a high-concurrency environment where dozens of users check out at the exact same second, a standard &lt;strong&gt;Select-then-Update&lt;/strong&gt; application-layer loop is highly vulnerable to race conditions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;1. User A Reads: budget_used = 4995. Limit = 5000. (Eligible!)
2. User B Reads: budget_used = 4995. Limit = 5000. (Eligible!)
3. User A Updates: budget_used = 4995 + 10 = 5005. (Overdraft!)
4. User B Updates: budget_used = 5005 + 10 = 5015. (Double Overdraft!)

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  The Solution: Database-Level Atomic Updates
&lt;/h3&gt;

&lt;p&gt;Instead of validating budget state inside Node.js, we offloaded the validation to an &lt;strong&gt;atomic transaction block&lt;/strong&gt; with conditional update locks in PostgreSQL.&lt;/p&gt;

&lt;p&gt;We write a single, atomic SQL statement that increments the budget and checks the boundary condition in one step:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Atomic Check-and-Apply Query&lt;/span&gt;
&lt;span class="k"&gt;UPDATE&lt;/span&gt; &lt;span class="n"&gt;corporate_subsidies&lt;/span&gt; 
&lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;budget_used&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;budget_used&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;subsidy_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt; 
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;is_active&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;TRUE&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;budget_used&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="n"&gt;total_budget&lt;/span&gt;
&lt;span class="n"&gt;RETURNING&lt;/span&gt; &lt;span class="n"&gt;subsidy_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;budget_used&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  How the Application Handles It
&lt;/h3&gt;

&lt;p&gt;If the update affects exactly &lt;code&gt;0&lt;/code&gt; rows, the database tells our application that the budget limit has been reached or the program is inactive. Node.js can catch this state and fail the transaction gracefully, prompting the user that the subsidy is no longer available:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;applySubsidyTransaction&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;subsidyId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;discountAmount&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;query&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;`
    UPDATE corporate_subsidies 
    SET budget_used = budget_used + $1
    WHERE subsidy_id = $2 
      AND is_active = TRUE
      AND (budget_used + $1) &amp;lt;= total_budget
    RETURNING subsidy_id, budget_used;
  `&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;discountAmount&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;subsidyId&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;rowCount&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;SUBSIDY_BUDGET_EXHAUSTED&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;rows&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;By leveraging PostgreSQL's row locks on update execution, database transactions are serialized cleanly. We completely eliminate race conditions and budget overruns without introducing the latency or complexity of distributed lock managers like Redis Redlock.&lt;/p&gt;




&lt;h2&gt;
  
  
  4. Audit Trail Integrity: The Subsidy Ledger
&lt;/h2&gt;

&lt;p&gt;Subsidies are financial transactions. In corporate systems, admins can edit or delete subsidy programs at any time. If you only store references to program IDs, generating historical GST reports, settlement invoices, or user ledgers will yield incorrect values if program parameters have changed.&lt;/p&gt;

&lt;p&gt;To solve this, we built a &lt;strong&gt;Subsidy Ledger&lt;/strong&gt; containing full denormalized snapshots:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;corporate_subsidy_ledger&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;ledger_id&lt;/span&gt; &lt;span class="n"&gt;UUID&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="n"&gt;gen_random_uuid&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="n"&gt;order_id&lt;/span&gt; &lt;span class="n"&gt;UUID&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="k"&gt;REFERENCES&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;order_id&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;employee_id&lt;/span&gt; &lt;span class="n"&gt;UUID&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;subsidy_program_id&lt;/span&gt; &lt;span class="n"&gt;UUID&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;applied_amount&lt;/span&gt; &lt;span class="nb"&gt;NUMERIC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;

    &lt;span class="c1"&gt;-- Snapshot data for historical reporting&lt;/span&gt;
    &lt;span class="n"&gt;program_snapshot&lt;/span&gt; &lt;span class="n"&gt;JSONB&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="nb"&gt;TIMESTAMP&lt;/span&gt; &lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="nb"&gt;TIME&lt;/span&gt; &lt;span class="k"&gt;ZONE&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="k"&gt;CURRENT_TIMESTAMP&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Whenever an order is processed:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;We calculate the discount using the active program state.&lt;/li&gt;
&lt;li&gt;We record the calculation details and save a stringified snapshot of the parameters (the percentage, cap, limits, and type) directly inside the JSONB column.&lt;/li&gt;
&lt;li&gt;This creates a permanent, immutable ledger entry. Even if an admin updates the program's parameters or deletes the program entirely next month, historical reports will remain 100% accurate and audit-safe.&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  Summary of Architectural Best Practices
&lt;/h2&gt;

&lt;p&gt;When building financial or B2B entitlement layers:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Push Integrity to the Database:&lt;/strong&gt; Enforce relational pools via composite foreign keys and composite constraints.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;De-stack Early:&lt;/strong&gt; Implement a deterministic resolution algorithm like the Maximum Rule to avoid cumulative discounts.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Enforce Atomic Bounds:&lt;/strong&gt; Never validate business thresholds in application code. Let SQL updates handle calculations and assertions simultaneously.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Log Immutable State:&lt;/strong&gt; Store point-in-time configuration snapshots alongside transactional tables to maintain audit compliance.&lt;/li&gt;
&lt;/ol&gt;

</description>
      <category>architecture</category>
      <category>backend</category>
      <category>saas</category>
      <category>systemdesign</category>
    </item>
    <item>
      <title>How I Cut AWS Costs by 60% by Fixing Infrastructure Inefficiencies</title>
      <dc:creator>Poojan</dc:creator>
      <pubDate>Sun, 01 Mar 2026 00:00:00 +0000</pubDate>
      <link>https://dev.to/poojang/how-i-cut-aws-costs-by-60-by-fixing-infrastructure-inefficiencies-35cg</link>
      <guid>https://dev.to/poojang/how-i-cut-aws-costs-by-60-by-fixing-infrastructure-inefficiencies-35cg</guid>
      <description>&lt;p&gt;When I became the sole backend and infra owner at MealPe, the AWS bill was higher than it needed to be. Not because of over-provisioning - but because of accumulated inefficiencies nobody had cleaned up.&lt;/p&gt;

&lt;p&gt;I didn't redesign the entire system. Instead, I analyzed the bill, audited the runtime patterns, and fixed what was actually broken. Here's exactly how we accomplished a &lt;strong&gt;60% reduction in monthly cloud spend&lt;/strong&gt; while actually improving LCP and request latencies.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Audit: Where Was the Money Going?
&lt;/h2&gt;

&lt;p&gt;Before making any changes, I analyzed our AWS Billing console line-by-line. The monthly distribution revealed that compute overhead and log retention were dramatically out of proportion with our active user base:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Service Category&lt;/th&gt;
&lt;th&gt;Monthly Cost (Before)&lt;/th&gt;
&lt;th&gt;Monthly Cost (After)&lt;/th&gt;
&lt;th&gt;% Reduction&lt;/th&gt;
&lt;th&gt;Primary Culprit&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;EC2 Compute&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;$620&lt;/td&gt;
&lt;td&gt;$280&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;54%&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Blocked event loop, static serving CPU overhead&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;S3 Storage &amp;amp; Transfer&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;$380&lt;/td&gt;
&lt;td&gt;$90&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;76%&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Indefinite log storage, massive image uploads&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;CloudWatch / I/O&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;$250&lt;/td&gt;
&lt;td&gt;$110&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;56%&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Verbose application request/response dumps&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Total&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$1,250&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$480&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;61%&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Overall Savings: $770/month&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  1. Fixing Excessive Logging Inefficiencies
&lt;/h2&gt;

&lt;p&gt;Our application log files were growing at an unsustainable rate. The Node.js application was generating verbose logs for every inbound request—including full request and response bodies, heavy SQL trace logs, and absolute stack dumps on trivial input errors.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// ❌ Deprecated verbose pattern that flooded CloudWatch and S3&lt;/span&gt;
&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;use&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;next&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// Large raw JSON buffers&lt;/span&gt;
    &lt;span class="na"&gt;timestamp&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="nf"&gt;next&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;These logs were being written to disk and continuously shipped to Amazon S3.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Fix
&lt;/h3&gt;

&lt;p&gt;I audited our log configuration and implemented &lt;strong&gt;Pino&lt;/strong&gt; for structured JSON logging. I restricted verbose request/response logging only to non-production environments and defined clear log levels for our production cluster:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;DEBUG/INFO:&lt;/strong&gt; Used only for crucial system boot details and structured API transaction summaries (method, path, status, duration).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;WARN/ERROR:&lt;/strong&gt; Reserved for true exception states, database connection alerts, and 500-level codes.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This change alone decreased log volume from &lt;strong&gt;~15GB/day to under 800MB/day&lt;/strong&gt; , dramatically dropping both write I/O costs and file shipping overhead.&lt;/p&gt;




&lt;h2&gt;
  
  
  2. Setting Up Automated S3 Lifecycle Policies
&lt;/h2&gt;

&lt;p&gt;S3 storage is inexpensive until files accumulate over months without rules. We were storing every daily server log, PDF bill copy, and redundant image backup indefinitely. There was no cleanup process in place.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Fix
&lt;/h3&gt;

&lt;p&gt;I configured strict S3 Lifecycle Rules on our logging and static resource buckets:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Daily Server Logs:&lt;/strong&gt; Transition to &lt;strong&gt;S3 Standard-IA (Infrequent Access)&lt;/strong&gt; after 14 days, move to &lt;strong&gt;S3 Glacier Flexible Retrieval&lt;/strong&gt; after 30 days, and permanently expire after 90 days.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Temporary File Uploads:&lt;/strong&gt; Automatically expire from the &lt;code&gt;/tmp/&lt;/code&gt; bucket prefix after 7 days.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Implementing these automated lifecycle states trimmed down our total active S3 storage footprint by &lt;strong&gt;over 70%&lt;/strong&gt; in the first billing cycle.&lt;/p&gt;




&lt;h2&gt;
  
  
  3. Offloading Static Asset Serving from Node.js to NGINX
&lt;/h2&gt;

&lt;p&gt;This was the most impactful architectural refinement. The Node.js backend process was serving static frontend assets directly. Whenever a client loaded our web app, the request for a Javascript chunk, CSS bundle, SVG logo, or favicon went straight through our Express application.&lt;/p&gt;

&lt;p&gt;Because Node.js is single-threaded, serving static file assets blocks the event loop from doing actual API computations, leading to high CPU usage and queuing lag.&lt;/p&gt;

&lt;h3&gt;
  
  
  The NGINX Solution
&lt;/h3&gt;

&lt;p&gt;I configured &lt;strong&gt;NGINX&lt;/strong&gt; as a reverse proxy in front of Node.js. NGINX is built from the ground up to serve static files with extremely low memory and CPU overhead. By handling static requests at the NGINX layer, we bypass the Node.js event loop completely:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;                         ┌──────────────────────┐
   Web Client ───────▶ │ NGINX Reverse Proxy │
                         └──────────┬───────────┘
                    /static/* │ /api/*
              (CSS, JS, images) │ (REST endpoints)
                    ┌───────────────┴───────────────┐
                    ▼ ▼
        ┌─────────────────────┐ ┌────────────────────────┐
        │ Local Static Files │ │ Node.js / PM2 Cluster │
        └─────────────────────┘ └────────────┬───────────┘
                                                      ▼
                                          ┌────────────────────────┐
                                          │ PostgreSQL Database │
                                          └────────────────────────┘

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Configured NGINX Routing Block
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight nginx"&gt;&lt;code&gt;&lt;span class="c1"&gt;# NGINX config: serve static files directly, proxy only API requests&lt;/span&gt;
&lt;span class="k"&gt;server&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kn"&gt;listen&lt;/span&gt; &lt;span class="mi"&gt;443&lt;/span&gt; &lt;span class="s"&gt;ssl&lt;/span&gt; &lt;span class="s"&gt;http2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kn"&gt;server_name&lt;/span&gt; &lt;span class="s"&gt;app.mealpe.in&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="c1"&gt;# Static assets served directly by NGINX - never blocks Node.js loop&lt;/span&gt;
    &lt;span class="kn"&gt;location&lt;/span&gt; &lt;span class="n"&gt;/static/&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kn"&gt;root&lt;/span&gt; &lt;span class="n"&gt;/var/www/mealpe-frontend&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="kn"&gt;expires&lt;/span&gt; &lt;span class="s"&gt;30d&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="kn"&gt;add_header&lt;/span&gt; &lt;span class="s"&gt;Cache-Control&lt;/span&gt; &lt;span class="s"&gt;"public,&lt;/span&gt; &lt;span class="s"&gt;no-transform,&lt;/span&gt; &lt;span class="s"&gt;immutable"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="kn"&gt;access_log&lt;/span&gt; &lt;span class="no"&gt;off&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;# Turn off access logging for static assets to reduce disk I/O&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="c1"&gt;# API requests proxied to the PM2 Node.js server cluster&lt;/span&gt;
    &lt;span class="kn"&gt;location&lt;/span&gt; &lt;span class="n"&gt;/api/&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kn"&gt;proxy_pass&lt;/span&gt; &lt;span class="s"&gt;http://127.0.0.1:3000&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="kn"&gt;proxy_http_version&lt;/span&gt; &lt;span class="mf"&gt;1.1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="kn"&gt;proxy_set_header&lt;/span&gt; &lt;span class="s"&gt;Connection&lt;/span&gt; &lt;span class="s"&gt;""&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="kn"&gt;proxy_set_header&lt;/span&gt; &lt;span class="s"&gt;Host&lt;/span&gt; &lt;span class="nv"&gt;$host&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="kn"&gt;proxy_set_header&lt;/span&gt; &lt;span class="s"&gt;X-Real-IP&lt;/span&gt; &lt;span class="nv"&gt;$remote_addr&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="kn"&gt;proxy_set_header&lt;/span&gt; &lt;span class="s"&gt;X-Forwarded-For&lt;/span&gt; &lt;span class="nv"&gt;$proxy_add_x_forwarded_for&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="kn"&gt;proxy_set_header&lt;/span&gt; &lt;span class="s"&gt;X-Forwarded-Proto&lt;/span&gt; &lt;span class="nv"&gt;$scheme&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Offloading this work to NGINX decreased our average EC2 instance memory consumption by &lt;strong&gt;35%&lt;/strong&gt; and CPU usage by &lt;strong&gt;40%&lt;/strong&gt;. This allowed us to safely downsize our EC2 compute tier, reducing our monthly server cost significantly.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Verdict
&lt;/h2&gt;

&lt;p&gt;Optimizing infrastructure isn't always about moving to serverless or re-architecting your entire system. Usually, the highest return on investment comes from cleaning up operational inefficiencies:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Logging:&lt;/strong&gt; Keep production logs actionable, structured, and short.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Storage:&lt;/strong&gt; Enforce S3 lifecycle rules from day one.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Separation of Concerns:&lt;/strong&gt; Let NGINX serve your files, and let Node.js compute your API logic.&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>aws</category>
      <category>cloud</category>
      <category>infrastructure</category>
      <category>performance</category>
    </item>
    <item>
      <title>Scaling a Food-Tech SaaS from 1K to 20K Users</title>
      <dc:creator>Poojan</dc:creator>
      <pubDate>Thu, 01 Jan 2026 00:00:00 +0000</pubDate>
      <link>https://dev.to/poojang/scaling-a-food-tech-saas-from-1k-to-20k-users-3c7k</link>
      <guid>https://dev.to/poojang/scaling-a-food-tech-saas-from-1k-to-20k-users-3c7k</guid>
      <description>&lt;p&gt;When I took over the backend at MealPe, the platform had around 1,000 registered users across a handful of client sites. A year later, it's at 20,000+ users processing 50,000 meals per month. This is what that scaling journey actually looked like - not the clean version, but the real one.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Early Cracks
&lt;/h2&gt;

&lt;p&gt;At 1,000 users, most things "worked." But as we onboarded more institutional clients, cracks started showing.&lt;/p&gt;

&lt;p&gt;API response times crept up. Database queries that ran fine with 10,000 rows started timing out at 200,000. Concurrent meal orders during our intense lunch rush (11:30 AM – 1:00 PM) caused request queuing and database locks that slowed the entire site to a crawl.&lt;/p&gt;




&lt;h2&gt;
  
  
  1. Database-First Optimization
&lt;/h2&gt;

&lt;p&gt;Almost every scaling problem traced back to the database. When I audited the system, I focused on three immediate interventions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Indexing Strategy Overhaul:&lt;/strong&gt; I audited every query using &lt;code&gt;EXPLAIN ANALYZE&lt;/code&gt; and added composite indexes for the most common access patterns (such as combining &lt;code&gt;venue_id&lt;/code&gt; with &lt;code&gt;order_status&lt;/code&gt; and &lt;code&gt;delivery_time&lt;/code&gt;). This alone dropped average query time from &lt;strong&gt;800ms to under 50ms&lt;/strong&gt; for the order listing endpoints.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Schema Normalization:&lt;/strong&gt; Some tables had been designed with flexibility in mind (utilizing JSON columns for dynamically defined fields). I migrated these to proper relational columns with appropriate types and foreign key constraints, which dramatically cut down CPU evaluation costs in query filters.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Connection Pooling:&lt;/strong&gt; We were creating new database connections per request, creating heavy TCP handshake overhead. Implementing proper connection pooling using &lt;code&gt;pg-pool&lt;/code&gt; reduced connection overhead and kept connection saturation safely below thresholds.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  2. API Architecture Refinement
&lt;/h2&gt;

&lt;p&gt;The original API had grown organically - lots of endpoints doing too many things. I refactored our core router with a focus on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Pagination Everywhere:&lt;/strong&gt; Enforced strict query pagination using limit/offset and cursors so that list views never try to render thousands of rows in a single DOM draw.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Response Shaping:&lt;/strong&gt; Different clients needed different data structures. Instead of sending full database models, I implemented lightweight view DTOs to shrink our network payload size.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Caching Hot Data:&lt;/strong&gt; Venue menus change once a week, not once a second. Adding a cache layer in front of dynamic menus eliminated thousands of database reads per hour.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  3. The Lunch Rush Problem
&lt;/h2&gt;

&lt;p&gt;Institutional cafeterias have extremely predictable and highly congested usage patterns. Unlike standard food delivery, &lt;strong&gt;80% of all meal orders come in during a tightly bounded 90-minute lunch window&lt;/strong&gt;. This meant our servers needed to handle peak loads that were 15x the daily average.&lt;/p&gt;

&lt;p&gt;To survive these spikes without scaling our cloud costs infinitely:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Decoupling Non-Critical Work:&lt;/strong&gt; I offloaded tasks like push notifications, admin email alerts, and analytic log compilations into a background queue processed after peak hours.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pre-Computing Availability:&lt;/strong&gt; Instead of scanning ordering records to determine inventory limits on every checkout, we pre-computed meal item quotas every hour, converting an O(N) scan into an O(1) cache read.&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  Monitoring That Actually Matters
&lt;/h2&gt;

&lt;p&gt;I set up monitoring dashboards that tracked metrics predictive of failures rather than just CPU alerts:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;API p95 Latency:&lt;/strong&gt; Warns us when p95 responses exceed 200ms.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Connection Pool Saturation:&lt;/strong&gt; Alerts us when available database pools drop below 15%.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Job Queue Depth:&lt;/strong&gt; Helps us spot background processing lag.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  What I'd Do Differently Next Time
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Plan the Indexing Early:&lt;/strong&gt; Invest in database indexes from day one rather than retrofitting them under pressure.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Bake in Caching First:&lt;/strong&gt; Integrate cache and pagination middlewares into the router framework rather than adding them as custom interventions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Realistic Load Testing:&lt;/strong&gt; Build synthetic load tests modeling the concentrated lunch rush early in staging.&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>architecture</category>
      <category>backend</category>
      <category>performance</category>
      <category>saas</category>
    </item>
    <item>
      <title>Node.js Performance Optimization: From Slow APIs to Scalable Systems</title>
      <dc:creator>Poojan</dc:creator>
      <pubDate>Mon, 01 Dec 2025 00:00:00 +0000</pubDate>
      <link>https://dev.to/poojang/nodejs-performance-optimization-from-slow-apis-to-scalable-systems-53ao</link>
      <guid>https://dev.to/poojang/nodejs-performance-optimization-from-slow-apis-to-scalable-systems-53ao</guid>
      <description>&lt;p&gt;Node.js is incredibly fast out of the box because of its non-blocking I/O model. However, as your transaction volume grows, minor engineering shortcuts turn into devastating performance bottlenecks.&lt;/p&gt;

&lt;p&gt;At MealPe, as the sole backend engineer, I faced the challenge of optimizing our backend to scale from a few dozen concurrent transactions to thousands. Here is a practical, evidence-based guide to identifying Node.js performance bottlenecks and resolving them using production-proven patterns.&lt;/p&gt;




&lt;h2&gt;
  
  
  Performance Benchmarks: The Impact of Optimization
&lt;/h2&gt;

&lt;p&gt;To demonstrate the concrete impact of these optimizations, I ran load tests using &lt;strong&gt;Autocannon&lt;/strong&gt; on a single t3.medium EC2 instance. The test simulated &lt;strong&gt;100 concurrent users&lt;/strong&gt; making continuous request rounds to a dynamic database endpoint over 10 seconds:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Stage of Optimization&lt;/th&gt;
&lt;th&gt;Avg Latency (p95)&lt;/th&gt;
&lt;th&gt;Throughput (Req/Sec)&lt;/th&gt;
&lt;th&gt;CPU Utilization&lt;/th&gt;
&lt;th&gt;Error Rate&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Baseline (Single thread, no cache)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;420 ms&lt;/td&gt;
&lt;td&gt;240 req/sec&lt;/td&gt;
&lt;td&gt;95% (bottleneck)&lt;/td&gt;
&lt;td&gt;4.2%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Step 1: PM2 Clustered (4 Workers)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;110 ms&lt;/td&gt;
&lt;td&gt;880 req/sec&lt;/td&gt;
&lt;td&gt;42% (distributed)&lt;/td&gt;
&lt;td&gt;0.0%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Step 2: Redis Caching Layer Active&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;18 ms&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;4,200 req/sec&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;15% (idle compute)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;0.0%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  1. Keep the Event Loop Clear (Non-Blocking IO)
&lt;/h2&gt;

&lt;p&gt;Node.js executes JavaScript code in a single-threaded loop. If you block that thread with CPU-intensive tasks—such as image resizing, parsing massive JSON files, or executing synchronous cryptos—every other user request is forced to wait in a queue.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// ❌ Deprecated synchronous blocking pattern&lt;/span&gt;
&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/api/report&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;fs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;readFileSync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;./large-report.csv&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;utf8&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// Blocks event loop!&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;processed&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;heavySyncParsing&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// Blocks event loop!&lt;/span&gt;
  &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;processed&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  The Fix
&lt;/h3&gt;

&lt;p&gt;Always use asynchronous, non-blocking file and network operations. If you absolutely must perform CPU-bound tasks, offload them to a separate thread using &lt;strong&gt;Worker Threads&lt;/strong&gt; or queue them in a background job system like &lt;strong&gt;BullMQ&lt;/strong&gt; :&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// ✅ Production-ready non-blocking worker pattern&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Worker&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;require&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;worker_threads&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/api/report&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;worker&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Worker&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;./report-worker.js&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;workerData&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;filePath&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;./large-report.csv&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="nx"&gt;worker&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;on&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;message&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;processedData&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;processedData&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="nx"&gt;worker&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;on&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;error&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;error&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Failed to compile report&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  2. Leverage Multi-Core Scalability with Clustering
&lt;/h2&gt;

&lt;p&gt;A standard Node.js process runs on a single CPU core. In production environments where your cloud instances have multiple cores, a single thread leaves substantial compute capacity completely idle.&lt;/p&gt;

&lt;h3&gt;
  
  
  The PM2 Cluster Approach
&lt;/h3&gt;

&lt;p&gt;Instead of manually managing this with the native &lt;code&gt;cluster&lt;/code&gt; module, use &lt;strong&gt;PM2 cluster mode&lt;/strong&gt; in production. PM2 handles process lifecycle management and automatic load-balancing between child processes out of the box:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;//&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;ecosystem.config.json&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"apps"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"mealpe-api"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"script"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"./dist/server.js"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"instances"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"max"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;//&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Automatically&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;scales&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;to&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;all&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;available&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;logical&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;cores&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"exec_mode"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"cluster"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"env_production"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"NODE_ENV"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"production"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;By deploying in cluster mode, we instantly multiplied our request capacity by &lt;strong&gt;4x&lt;/strong&gt; on our quad-core database proxy server without changing a single line of API routing logic.&lt;/p&gt;




&lt;h2&gt;
  
  
  3. Implement Redis Caching for Read-Heavy Routes
&lt;/h2&gt;

&lt;p&gt;The fastest API request is the one that never touches your primary relational database. Database I/O is almost always the p95 latency bottleneck.&lt;/p&gt;

&lt;p&gt;At MealPe, venue menus and outlet statuses change infrequently but are queried on every single page load. Caching these responses in &lt;strong&gt;Redis&lt;/strong&gt; (an ultra-fast, in-memory store) reduces response times down to sub-10 milliseconds.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// ✅ Polished API response caching helper&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;redis&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;require&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;redis&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;createClient&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;REDIS_URL&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;getCachedMenu&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;next&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;venueId&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;params&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;cachedData&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`menu:&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;venueId&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;cachedData&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setHeader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;X-Cache&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;HIT&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;cachedData&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="c1"&gt;// Cache miss - pass to database controller&lt;/span&gt;
    &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setHeader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;X-Cache&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;MISS&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nf"&gt;next&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Fallback gracefully on cache error - don't crash client&lt;/span&gt;
    &lt;span class="nf"&gt;next&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  4. Query Optimization &amp;amp; Indexing
&lt;/h2&gt;

&lt;p&gt;If your database requests are slow, Node.js will be slow. I audited our queries using &lt;code&gt;EXPLAIN ANALYZE&lt;/code&gt; in PostgreSQL and addressed the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Missing Indexes:&lt;/strong&gt; We added indexes on foreign keys and columns used in filtering/sorting (e.g., &lt;code&gt;venue_id&lt;/code&gt;, &lt;code&gt;created_at&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;N+1 Query Elimination:&lt;/strong&gt; Replaced recursive database calls inside loops with clean SQL &lt;code&gt;JOIN&lt;/code&gt; statements or Sequelize eager loading (&lt;code&gt;include&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Strict Pagination:&lt;/strong&gt; Never load whole tables into memory. Enforce strict &lt;code&gt;LIMIT&lt;/code&gt; and &lt;code&gt;OFFSET&lt;/code&gt; (or cursor-based keys) on all listing tables.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  5. Enable Compression
&lt;/h2&gt;

&lt;p&gt;Ensure &lt;strong&gt;Gzip&lt;/strong&gt; or &lt;strong&gt;Brotli&lt;/strong&gt; compression is active on all JSON and text payloads. This reduces payload transfer sizes by up to &lt;strong&gt;70%&lt;/strong&gt; , speeding up rendering times on mobile devices operating on congested cellular networks:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;compression&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;require&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;compression&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;express&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;require&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;express&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;app&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;express&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="c1"&gt;// Compresses all outgoing response payloads automatically&lt;/span&gt;
&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;use&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;compression&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Summary of Action Items
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Profile Continuously:&lt;/strong&gt; Use &lt;code&gt;clinic.js&lt;/code&gt; or Chrome DevTools to locate memory leaks and CPU blocks under test loads.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cluster by Default:&lt;/strong&gt; Scale to multiple cores using PM2 cluster mode.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cache Strategically:&lt;/strong&gt; Cache read-heavy API responses using Redis with realistic TTLs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tune the Database:&lt;/strong&gt; Set index paths on Postgres and prevent large scans.&lt;/li&gt;
&lt;/ol&gt;

</description>
      <category>backend</category>
      <category>javascript</category>
      <category>node</category>
      <category>performance</category>
    </item>
    <item>
      <title>Zero-Downtime Deployments with PM2 and NGINX</title>
      <dc:creator>Poojan</dc:creator>
      <pubDate>Sat, 01 Nov 2025 00:00:00 +0000</pubDate>
      <link>https://dev.to/poojang/zero-downtime-deployments-with-pm2-and-nginx-j8c</link>
      <guid>https://dev.to/poojang/zero-downtime-deployments-with-pm2-and-nginx-j8c</guid>
      <description>&lt;p&gt;When your platform processes meal orders across busy institutional cafeterias, downtime isn't just inconvenient - people don't get fed. High availability is a hard operational requirement, not just a product target.&lt;/p&gt;

&lt;p&gt;The original deployment process was: SSH into the server, &lt;code&gt;git pull&lt;/code&gt;, &lt;code&gt;npm install&lt;/code&gt;, and &lt;code&gt;pm2 restart all&lt;/code&gt;. This caused a &lt;strong&gt;10–30 second outage per deploy&lt;/strong&gt; , during which our API dropped orders and hung terminals.&lt;/p&gt;

&lt;p&gt;Here is exactly how I set up graceful, zero-downtime deployments for our Node.js server cluster using PM2's native clustering and NGINX upstream proxies.&lt;/p&gt;




&lt;h2&gt;
  
  
  1. PM2 Cluster Configuration
&lt;/h2&gt;

&lt;p&gt;The foundation of zero-downtime is simple: never let the number of active app instances drop to zero. PM2's cluster mode makes it straightforward to scale your process across all cores:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// ecosystem.config.js&lt;/span&gt;
&lt;span class="nx"&gt;module&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;exports&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;apps&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;
    &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;mealpe-api&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;script&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;./dist/server.js&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;instances&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;max&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// Scale to all available CPU cores&lt;/span&gt;
    &lt;span class="na"&gt;exec_mode&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;cluster&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;max_memory_restart&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;500M&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;listen_timeout&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;10000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// Wait 10s for boot signal&lt;/span&gt;
    &lt;span class="na"&gt;kill_timeout&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;5000&lt;/span&gt; &lt;span class="c1"&gt;// Wait 5s for clean close&lt;/span&gt;
  &lt;span class="p"&gt;}]&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;instances: 'max'&lt;/code&gt;&lt;/strong&gt; : Spawns a process on each CPU core.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;listen_timeout&lt;/code&gt;&lt;/strong&gt; : Instructs PM2 to wait for a database connection or socket handshake before marking a new process as "Online."&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;kill_timeout&lt;/code&gt;&lt;/strong&gt; : Gives active processes 5 seconds to wrap up in-flight REST queries before forcing a close.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  2. Transitioning to Graceful Reloads
&lt;/h2&gt;

&lt;p&gt;Instead of calling &lt;code&gt;pm2 restart&lt;/code&gt; (which kills all instances simultaneously), we transition to &lt;code&gt;pm2 reload&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Reload initiates a &lt;strong&gt;rolling update&lt;/strong&gt; : it spawns a new instance, waits for it to become online, then safely turns down an old instance. It repeats this pattern process-by-process, maintaining maximum API capacity:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Production deploy script (deploy.sh)&lt;/span&gt;
&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;
&lt;span class="nb"&gt;set&lt;/span&gt; &lt;span class="nt"&gt;-e&lt;/span&gt;

&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Pulling latest branch code..."&lt;/span&gt;
git pull origin main

&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Installing production-only dependencies..."&lt;/span&gt;
npm ci &lt;span class="nt"&gt;--production&lt;/span&gt;

&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Executing rolling reload..."&lt;/span&gt;
pm2 reload ecosystem.config.js &lt;span class="nt"&gt;--update-env&lt;/span&gt;

&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Deploy successfully completed!"&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  3. Implementing Application Graceful Shutdowns
&lt;/h2&gt;

&lt;p&gt;PM2 sends a &lt;code&gt;SIGINT&lt;/code&gt; trigger to your process before shutting it down. If your application doesn't handle this signal, it terminates instantly, dropping all connections mid-transaction.&lt;/p&gt;

&lt;p&gt;You must catch the &lt;code&gt;SIGINT&lt;/code&gt; event, close the HTTP port to block new inbound traffic, finish active queries, and release database pools:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// ✅ Professional graceful shutdown hook in server.js&lt;/span&gt;
&lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;on&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;SIGINT&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;SIGINT signal received. Starting graceful shutdown sequence...&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="c1"&gt;// Stop the HTTP server from accepting new socket sessions&lt;/span&gt;
  &lt;span class="nx"&gt;server&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;close&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;HTTP server successfully closed.&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="c1"&gt;// Release database connection pools cleanly&lt;/span&gt;
      &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;end&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
      &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Database pools released. Exiting cleanly.&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;exit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Error during database teardown:&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;err&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;exit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="c1"&gt;// Force close after a 6-second timeout block if connections hang&lt;/span&gt;
  &lt;span class="nf"&gt;setTimeout&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;warn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Forced shutdown active: connections did not close in time.&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;exit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="mi"&gt;6000&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  4. Configuring NGINX Load-Balancing
&lt;/h2&gt;

&lt;p&gt;NGINX is our front-door gateway. The key configurations for zero-downtime routing include setting up an &lt;code&gt;upstream&lt;/code&gt; pool and instructing NGINX to pass traffic to active processes on failures:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight nginx"&gt;&lt;code&gt;&lt;span class="k"&gt;upstream&lt;/span&gt; &lt;span class="s"&gt;mealpe_backend&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kn"&gt;server&lt;/span&gt; &lt;span class="nf"&gt;127.0.0.1&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;3000&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kn"&gt;keepalive&lt;/span&gt; &lt;span class="mi"&gt;64&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;# Keep connection channels open to reduce latency&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;server&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kn"&gt;listen&lt;/span&gt; &lt;span class="mi"&gt;443&lt;/span&gt; &lt;span class="s"&gt;ssl&lt;/span&gt; &lt;span class="s"&gt;http2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kn"&gt;server_name&lt;/span&gt; &lt;span class="s"&gt;api.mealpe.in&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="kn"&gt;location&lt;/span&gt; &lt;span class="n"&gt;/&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kn"&gt;proxy_pass&lt;/span&gt; &lt;span class="s"&gt;http://mealpe_backend&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="kn"&gt;proxy_http_version&lt;/span&gt; &lt;span class="mf"&gt;1.1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="kn"&gt;proxy_set_header&lt;/span&gt; &lt;span class="s"&gt;Connection&lt;/span&gt; &lt;span class="s"&gt;""&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="kn"&gt;proxy_set_header&lt;/span&gt; &lt;span class="s"&gt;Host&lt;/span&gt; &lt;span class="nv"&gt;$host&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="kn"&gt;proxy_set_header&lt;/span&gt; &lt;span class="s"&gt;X-Real-IP&lt;/span&gt; &lt;span class="nv"&gt;$remote_addr&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="kn"&gt;proxy_set_header&lt;/span&gt; &lt;span class="s"&gt;X-Forwarded-For&lt;/span&gt; &lt;span class="nv"&gt;$proxy_add_x_forwarded_for&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

        &lt;span class="c1"&gt;# 🚀 Crucial: if one instance is reloading, try next active worker&lt;/span&gt;
        &lt;span class="kn"&gt;proxy_next_upstream&lt;/span&gt; &lt;span class="s"&gt;error&lt;/span&gt; &lt;span class="s"&gt;timeout&lt;/span&gt; &lt;span class="s"&gt;http_502&lt;/span&gt; &lt;span class="s"&gt;http_503&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="kn"&gt;proxy_connect_timeout&lt;/span&gt; &lt;span class="s"&gt;3s&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="kn"&gt;proxy_read_timeout&lt;/span&gt; &lt;span class="s"&gt;15s&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;By adding &lt;code&gt;proxy_next_upstream&lt;/code&gt;, if an instance is in the middle of reloading and drops a connection, NGINX instantly retries the query on a sibling process. The client has zero awareness of the deploy event.&lt;/p&gt;




&lt;h2&gt;
  
  
  5. Webhook CI/CD Automation
&lt;/h2&gt;

&lt;p&gt;To secure execution, we set up a lightweight deployment daemon on our EC2 instance that exposes a secured webhook route. When a PR merges into &lt;code&gt;main&lt;/code&gt; on GitHub, our actions pipeline runs tests, builds the typescript bundle, and pings the webhook:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# GitHub Actions CI deployment step&lt;/span&gt;
&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Trigger Server Deploy Webhook&lt;/span&gt;
  &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
    &lt;span class="s"&gt;curl -X POST \&lt;/span&gt;
      &lt;span class="s"&gt;-H "Authorization: Bearer ${{ secrets.DEPLOY_SECRET }}" \&lt;/span&gt;
      &lt;span class="s"&gt;https://api.mealpe.in/webhooks/deploy&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The daemon checks the bearer token (a shared secret) in constant time before running anything. This triggers the automated &lt;code&gt;deploy.sh&lt;/code&gt; script locally on the server.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Outcome
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;0 seconds&lt;/strong&gt; of user-facing downtime recorded.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;45-second&lt;/strong&gt; deployment pipeline from git merge to production availability.&lt;/li&gt;
&lt;li&gt;Confidence to deploy minor changes and hotfixes safely during standard hours.&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>devops</category>
      <category>javascript</category>
      <category>node</category>
      <category>tutorial</category>
    </item>
  </channel>
</rss>
