<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Munteanu Flavius-Ioan</title>
    <description>The latest articles on DEV Community by Munteanu Flavius-Ioan (@flvmnt).</description>
    <link>https://dev.to/flvmnt</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3780079%2F521b221f-50fa-4afd-858c-d88fbd89f7b7.png</url>
      <title>DEV Community: Munteanu Flavius-Ioan</title>
      <link>https://dev.to/flvmnt</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/flvmnt"/>
    <language>en</language>
    <item>
      <title>I built a tool that runs your Postgres migrations before they hit production</title>
      <dc:creator>Munteanu Flavius-Ioan</dc:creator>
      <pubDate>Sat, 14 Mar 2026 17:02:57 +0000</pubDate>
      <link>https://dev.to/flvmnt/i-built-a-tool-that-runs-your-postgres-migrations-before-they-hit-production-2b22</link>
      <guid>https://dev.to/flvmnt/i-built-a-tool-that-runs-your-postgres-migrations-before-they-hit-production-2b22</guid>
      <description>&lt;p&gt;Last month I shipped &lt;a href="https://pgfence.com" rel="noopener noreferrer"&gt;pgfence&lt;/a&gt;, a CLI that tells you what lock modes your Postgres migrations take and how to rewrite them safely. It works by parsing your SQL and looking up each DDL statement in a lock mode table.&lt;/p&gt;

&lt;p&gt;The feedback I kept getting: "How do I know your lookup table is right?"&lt;/p&gt;

&lt;p&gt;Fair question. So I built trace mode.&lt;/p&gt;

&lt;h2&gt;
  
  
  What trace mode does
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pgfence trace migrations/add-verified.sql
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pgfence - Trace Report (PostgreSQL 17, Docker)

  migrations/add-verified.sql  [HIGH]

  #  Statement                          Lock Mode         Blocks  Risk    Verified    Duration
  1  ALTER TABLE users ADD COLUMN       ACCESS EXCLUSIVE  R + W   HIGH    Confirmed   2ms
     email_verified BOOLEAN NOT NULL
  2  CREATE INDEX idx ON users(email)   SHARE             W       MEDIUM  Confirmed   1ms

  Trace-Only Findings:
  ! Table rewrite detected on "users" (relfilenode changed)

  === Coverage ===
  Analyzed: 2 statements | Verified: 2/2 | Mismatches: 0 | Trace-only: 1
  Docker: postgres:17-alpine | Container lifetime: 4.2s
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every statement gets a verification status:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Confirmed&lt;/strong&gt;: pgfence's static prediction matched real Postgres behavior&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Mismatch&lt;/strong&gt;: the prediction was wrong. You see what actually happened.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Trace-only&lt;/strong&gt;: something Postgres did that static analysis can't predict (table rewrites, implicit locks on sequences)&lt;/li&gt;
&lt;/ul&gt;
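
&lt;p&gt;To make the three statuses concrete, here is a minimal sketch of how a prediction could be compared with an observation. The &lt;code&gt;Prediction&lt;/code&gt; and &lt;code&gt;Observation&lt;/code&gt; shapes and the &lt;code&gt;classify&lt;/code&gt; function are hypothetical names for illustration, not pgfence's actual internals.&lt;/p&gt;

```typescript
// Hypothetical types: these names are illustrative, not pgfence's real API.
type VerifyStatus = "Confirmed" | "Mismatch" | "Trace-only";

interface Prediction { statement: string; lockMode: string }
interface Observation { statement: string; lockMode: string }

// Compare what static analysis predicted with what the trace observed.
function classify(predicted: Prediction | undefined, observed: Observation): VerifyStatus {
  // Nothing was predicted for this finding, so only the trace knows about it.
  if (predicted === undefined) return "Trace-only";
  return predicted.lockMode === observed.lockMode ? "Confirmed" : "Mismatch";
}

const observedIndex: Observation = {
  statement: "CREATE INDEX idx ON users(email)",
  lockMode: "SHARE",
};

console.log(classify({ statement: observedIndex.statement, lockMode: "SHARE" }, observedIndex)); // Confirmed
console.log(classify({ statement: observedIndex.statement, lockMode: "ACCESS EXCLUSIVE" }, observedIndex)); // Mismatch
console.log(classify(undefined, observedIndex)); // Trace-only
```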

&lt;h2&gt;
  
  
  How it works
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;pgfence runs static analysis first (all the normal rules and safe rewrite recipes)&lt;/li&gt;
&lt;li&gt;Pulls &lt;code&gt;postgres:17-alpine&lt;/code&gt; and starts a container on a random &lt;code&gt;127.0.0.1&lt;/code&gt; port&lt;/li&gt;
&lt;li&gt;Each statement executes inside &lt;code&gt;BEGIN&lt;/code&gt;/&lt;code&gt;COMMIT&lt;/code&gt; so locks are held when we snapshot&lt;/li&gt;
&lt;li&gt;After each statement, diffs: &lt;code&gt;pg_locks&lt;/code&gt; (lock modes), &lt;code&gt;pg_class.relfilenode&lt;/code&gt; (table rewrites), &lt;code&gt;pg_attribute&lt;/code&gt; (column changes), &lt;code&gt;pg_constraint&lt;/code&gt; (validation state), &lt;code&gt;pg_index&lt;/code&gt; (index validity)&lt;/li&gt;
&lt;li&gt;Static predictions are matched against trace observations&lt;/li&gt;
&lt;li&gt;Container is deleted&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The whole thing takes 3-5 seconds for a typical migration file. The container is ephemeral: random password, &lt;code&gt;127.0.0.1&lt;/code&gt; only, cleaned up even on SIGINT/SIGTERM.&lt;/p&gt;

&lt;h2&gt;
  
  
  The CONCURRENTLY problem
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;CREATE INDEX CONCURRENTLY&lt;/code&gt; is the most common safe migration pattern. It's also the hardest to trace.&lt;/p&gt;

&lt;p&gt;The reason: Postgres rejects &lt;code&gt;CONCURRENTLY&lt;/code&gt; inside a transaction. If you wrap everything in &lt;code&gt;BEGIN&lt;/code&gt;/&lt;code&gt;COMMIT&lt;/code&gt; to hold locks for snapshotting, CONCURRENTLY statements fail.&lt;/p&gt;

&lt;p&gt;Eugene (a Rust-based tool with a similar trace feature) handles this by wrapping everything in a transaction and rolling back, so CONCURRENTLY statements are simply skipped.&lt;/p&gt;

&lt;p&gt;pgfence takes a different approach. Since the Docker container is disposable, there's no need for transactions on CONCURRENTLY statements. Instead, pgfence opens a second "observer" connection that polls &lt;code&gt;pg_locks&lt;/code&gt; every 50ms while the main connection executes the statement. The observer captures every lock mode that was held during execution.&lt;/p&gt;
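
&lt;p&gt;A rough sketch of the observer's bookkeeping, assuming each poll returns the list of &lt;code&gt;pg_locks.mode&lt;/code&gt; values currently held on the target table. The &lt;code&gt;LockSample&lt;/code&gt; type and &lt;code&gt;modesSeen&lt;/code&gt; function are made up for this example. The actual polling and database access are left out.&lt;/p&gt;

```typescript
// Each sample is the list of lock modes one poll of pg_locks returned
// for the target table (hypothetical shape, for illustration only).
type LockSample = string[];

// Union of every lock mode seen across all polls, in first-seen order.
function modesSeen(samples: LockSample[]): string[] {
  const seen: string[] = [];
  for (const sample of samples) {
    for (const mode of sample) {
      if (seen.indexOf(mode) === -1) seen.push(mode);
    }
  }
  return seen;
}

// Three polls taken 50ms apart while CREATE INDEX CONCURRENTLY runs;
// the last poll is empty because the statement has finished.
const polls: LockSample[] = [
  ["ShareUpdateExclusiveLock"],
  ["ShareUpdateExclusiveLock"],
  [],
];

console.log(modesSeen(polls)); // [ 'ShareUpdateExclusiveLock' ]
```

&lt;p&gt;One trade-off of sampling: a lock held for less than the polling interval can fall between two polls and never show up in the result.&lt;/p&gt;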

&lt;p&gt;As far as I know, that makes pgfence the only tool that can verify the lock behavior of &lt;code&gt;CREATE INDEX CONCURRENTLY&lt;/code&gt;, &lt;code&gt;DROP INDEX CONCURRENTLY&lt;/code&gt;, &lt;code&gt;REINDEX CONCURRENTLY&lt;/code&gt;, and &lt;code&gt;DETACH PARTITION CONCURRENTLY&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  What trace mode catches that static analysis can't
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Table rewrites.&lt;/strong&gt; When Postgres changes a column type, it sometimes rewrites the entire table (new &lt;code&gt;relfilenode&lt;/code&gt;). Static analysis can detect known rewrite patterns, but trace mode sees the actual rewrite happen by diffing &lt;code&gt;pg_class.relfilenode&lt;/code&gt; before and after.&lt;/p&gt;
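
&lt;p&gt;The before/after diff can be sketched in a few lines. The &lt;code&gt;RelfilenodeSnapshot&lt;/code&gt; shape (table name mapped to its &lt;code&gt;relfilenode&lt;/code&gt;) and the sample values are assumptions for illustration. How pgfence stores its snapshots is not shown here.&lt;/p&gt;

```typescript
// Snapshot shape is an assumption: table name mapped to pg_class.relfilenode.
type RelfilenodeSnapshot = { [table: string]: number };

// Return the tables whose relfilenode changed between two snapshots.
// Tables missing from the second snapshot (for example, dropped) are ignored.
function rewrittenTables(
  snapBefore: RelfilenodeSnapshot,
  snapAfter: RelfilenodeSnapshot
): string[] {
  return Object.keys(snapBefore).filter((table) =>
    table in snapAfter ? snapBefore[table] !== snapAfter[table] : false
  );
}

const snapBefore: RelfilenodeSnapshot = { users: 16384, orders: 16390 };
const snapAfter: RelfilenodeSnapshot = { users: 16412, orders: 16390 };

console.log(rewrittenTables(snapBefore, snapAfter)); // [ 'users' ]
```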

&lt;p&gt;&lt;strong&gt;Implicit locks.&lt;/strong&gt; Adding a column to a table with a serial column implicitly locks the sequence. Adding a foreign key locks the referenced table. These cascade effects are visible in &lt;code&gt;pg_locks&lt;/code&gt; but not in the DDL syntax.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Version-specific behavior.&lt;/strong&gt; &lt;code&gt;ALTER TYPE ADD VALUE&lt;/code&gt; takes EXCLUSIVE on PG12+ but ACCESS EXCLUSIVE on PG11. &lt;code&gt;ADD COLUMN ... DEFAULT&lt;/code&gt; with a non-volatile default is metadata-only on PG11+ but rewrites the table on PG10 and earlier. Trace mode tests against the exact version you specify.&lt;/p&gt;

&lt;h2&gt;
  
  
  CI integration
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pgfence trace &lt;span class="nt"&gt;--ci&lt;/span&gt; &lt;span class="nt"&gt;--max-risk&lt;/span&gt; medium migrations/&lt;span class="k"&gt;*&lt;/span&gt;.sql
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The command exits with code 1 if any check exceeds the risk threshold, if any prediction mismatches what the trace observed, or if any statement fails to execute. Mismatches fail CI because a mismatch means the static analysis is wrong for that statement.&lt;/p&gt;
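
&lt;p&gt;The exit-code rule can be written down as a small function. This is only a sketch: the &lt;code&gt;TraceResult&lt;/code&gt; shape and its field names are hypothetical and do not reflect pgfence's real output format.&lt;/p&gt;

```typescript
// Hypothetical result shape; field names are illustrative only.
type Risk = "LOW" | "MEDIUM" | "HIGH" | "CRITICAL";
const RISK_ORDER: Risk[] = ["LOW", "MEDIUM", "HIGH", "CRITICAL"];

interface TraceResult {
  risk: Risk;
  verified: "Confirmed" | "Mismatch" | "Trace-only";
  error?: string; // set when the statement failed to execute
}

// Exit 1 when any statement exceeds maxRisk, mismatches, or failed to execute.
function exitCode(results: TraceResult[], maxRisk: Risk): number {
  const limit = RISK_ORDER.indexOf(maxRisk);
  const failed = results.some(
    (r) =>
      RISK_ORDER.indexOf(r.risk) > limit ||
      r.verified === "Mismatch" ||
      r.error !== undefined
  );
  return failed ? 1 : 0;
}

console.log(exitCode([{ risk: "MEDIUM", verified: "Confirmed" }], "MEDIUM")); // 0
console.log(exitCode([{ risk: "HIGH", verified: "Confirmed" }], "MEDIUM")); // 1
console.log(exitCode([{ risk: "LOW", verified: "Mismatch" }], "MEDIUM")); // 1
```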

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-D&lt;/span&gt; @flvmnt/pgfence
npx pgfence trace migrations/&lt;span class="k"&gt;*&lt;/span&gt;.sql
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Requires Docker. Works with raw SQL, TypeORM, Prisma, Knex, Drizzle, and Sequelize migrations.&lt;/p&gt;

&lt;p&gt;If you find a mismatch between pgfence's static analysis and what trace mode observes, please &lt;a href="https://github.com/flvmnt/pgfence/issues" rel="noopener noreferrer"&gt;open an issue&lt;/a&gt;. Every mismatch report helps make the static analysis more accurate.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/flvmnt/pgfence" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt; | &lt;a href="https://pgfence.com" rel="noopener noreferrer"&gt;Website&lt;/a&gt; | &lt;a href="https://www.npmjs.com/package/@flvmnt/pgfence" rel="noopener noreferrer"&gt;npm&lt;/a&gt;&lt;/p&gt;

</description>
      <category>postgres</category>
      <category>database</category>
      <category>docker</category>
      <category>typescript</category>
    </item>
    <item>
      <title>The ALTER TABLE that took down our API for 6 minutes</title>
      <dc:creator>Munteanu Flavius-Ioan</dc:creator>
      <pubDate>Wed, 25 Feb 2026 18:40:33 +0000</pubDate>
      <link>https://dev.to/flvmnt/the-alter-table-that-took-down-our-api-for-6-minutes-50ig</link>
      <guid>https://dev.to/flvmnt/the-alter-table-that-took-down-our-api-for-6-minutes-50ig</guid>
      <description>&lt;p&gt;Last Tuesday at 2:47 PM, we deployed a migration that looked completely innocent:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="k"&gt;ADD&lt;/span&gt; &lt;span class="k"&gt;COLUMN&lt;/span&gt; &lt;span class="n"&gt;email_verified&lt;/span&gt; &lt;span class="nb"&gt;boolean&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_users_email&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Two statements. Nothing exotic. The kind of thing you've written a hundred times.&lt;/p&gt;

&lt;p&gt;At 2:48 PM, every API endpoint that touched the &lt;code&gt;users&lt;/code&gt; table started timing out. Our health checks went red. PagerDuty fired. Customers couldn't log in.&lt;/p&gt;

&lt;p&gt;At 2:54 PM — six minutes later — the migration finished and everything recovered on its own.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The root cause wasn't a bug. It was a lock.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What actually happened
&lt;/h2&gt;

&lt;p&gt;Every statement in Postgres, DDL or not, acquires a lock on the table it touches. The lock mode depends on the statement:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Statement&lt;/th&gt;
&lt;th&gt;Lock Mode&lt;/th&gt;
&lt;th&gt;What it blocks&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;ADD COLUMN ... NOT NULL&lt;/code&gt; (no DEFAULT)&lt;/td&gt;
&lt;td&gt;ACCESS EXCLUSIVE&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Everything&lt;/strong&gt; — reads, writes, DDL&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;CREATE INDEX&lt;/code&gt; (no CONCURRENTLY)&lt;/td&gt;
&lt;td&gt;SHARE&lt;/td&gt;
&lt;td&gt;All writes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;SELECT&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;ACCESS SHARE&lt;/td&gt;
&lt;td&gt;Only &lt;code&gt;ACCESS EXCLUSIVE&lt;/code&gt; requests&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;code&gt;ACCESS EXCLUSIVE&lt;/code&gt; is the nuclear option. It blocks every other operation on the table — including simple &lt;code&gt;SELECT&lt;/code&gt; queries — until the DDL completes.&lt;/p&gt;
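
&lt;p&gt;For reference, the table-level conflict rules from the PostgreSQL documentation fit in a small lookup. The sketch below encodes them in TypeScript. The &lt;code&gt;CONFLICTS&lt;/code&gt; constant and &lt;code&gt;conflicts&lt;/code&gt; helper are names made up for this example and are not part of pgfence.&lt;/p&gt;

```typescript
// Table-level lock modes, weakest to strongest.
type LockMode =
  | "ACCESS SHARE" | "ROW SHARE" | "ROW EXCLUSIVE" | "SHARE UPDATE EXCLUSIVE"
  | "SHARE" | "SHARE ROW EXCLUSIVE" | "EXCLUSIVE" | "ACCESS EXCLUSIVE";

// For each held mode, the requested modes it blocks
// (per the conflict table in the PostgreSQL explicit-locking docs).
const CONFLICTS: { [held in LockMode]: LockMode[] } = {
  "ACCESS SHARE": ["ACCESS EXCLUSIVE"],
  "ROW SHARE": ["EXCLUSIVE", "ACCESS EXCLUSIVE"],
  "ROW EXCLUSIVE": ["SHARE", "SHARE ROW EXCLUSIVE", "EXCLUSIVE", "ACCESS EXCLUSIVE"],
  "SHARE UPDATE EXCLUSIVE": ["SHARE UPDATE EXCLUSIVE", "SHARE", "SHARE ROW EXCLUSIVE",
    "EXCLUSIVE", "ACCESS EXCLUSIVE"],
  "SHARE": ["ROW EXCLUSIVE", "SHARE UPDATE EXCLUSIVE", "SHARE ROW EXCLUSIVE",
    "EXCLUSIVE", "ACCESS EXCLUSIVE"],
  "SHARE ROW EXCLUSIVE": ["ROW EXCLUSIVE", "SHARE UPDATE EXCLUSIVE", "SHARE",
    "SHARE ROW EXCLUSIVE", "EXCLUSIVE", "ACCESS EXCLUSIVE"],
  "EXCLUSIVE": ["ROW SHARE", "ROW EXCLUSIVE", "SHARE UPDATE EXCLUSIVE", "SHARE",
    "SHARE ROW EXCLUSIVE", "EXCLUSIVE", "ACCESS EXCLUSIVE"],
  "ACCESS EXCLUSIVE": ["ACCESS SHARE", "ROW SHARE", "ROW EXCLUSIVE",
    "SHARE UPDATE EXCLUSIVE", "SHARE", "SHARE ROW EXCLUSIVE", "EXCLUSIVE",
    "ACCESS EXCLUSIVE"],
};

// True when a lock already held in mode `held` blocks a request for `requested`.
function conflicts(held: LockMode, requested: LockMode): boolean {
  return CONFLICTS[held].indexOf(requested) !== -1;
}

console.log(conflicts("ACCESS EXCLUSIVE", "ACCESS SHARE")); // true: ALTER TABLE blocks SELECT
console.log(conflicts("SHARE", "ROW EXCLUSIVE")); // true: CREATE INDEX blocks INSERT/UPDATE/DELETE
console.log(conflicts("SHARE UPDATE EXCLUSIVE", "ROW EXCLUSIVE")); // false: CONCURRENTLY allows writes
```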

&lt;p&gt;On a small table, this takes milliseconds. On our &lt;code&gt;users&lt;/code&gt; table with 8 million rows, the &lt;code&gt;ADD COLUMN ... NOT NULL&lt;/code&gt; without a default forces Postgres to rewrite the entire table while holding that lock. Every query stacks up in the lock queue. The queue backs up into connection pool exhaustion. The pool exhaustion cascades to every service.&lt;/p&gt;

&lt;p&gt;Six minutes of downtime from a two-line migration.&lt;/p&gt;

&lt;h2&gt;
  
  
  The fix we never should have needed
&lt;/h2&gt;

&lt;p&gt;The safe way to add a NOT NULL column is a multi-step expand/contract pattern:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Step 1: Add the column as nullable (instant, no rewrite)&lt;/span&gt;
&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="k"&gt;ADD&lt;/span&gt; &lt;span class="k"&gt;COLUMN&lt;/span&gt; &lt;span class="n"&gt;IF&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;EXISTS&lt;/span&gt; &lt;span class="n"&gt;email_verified&lt;/span&gt; &lt;span class="nb"&gt;boolean&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Step 2: Backfill out-of-band in batches (not in a migration)&lt;/span&gt;
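&lt;span class="c1"&gt;-- Postgres UPDATE has no LIMIT clause, so bound each batch with a subquery (assumes a primary key "id"):&lt;/span&gt;
&lt;span class="c1"&gt;-- UPDATE users SET email_verified = false WHERE id IN (SELECT id FROM users WHERE email_verified IS NULL LIMIT 1000);&lt;/span&gt;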

&lt;span class="c1"&gt;-- Step 3: Add the constraint without validating (brief lock)&lt;/span&gt;
&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="k"&gt;ADD&lt;/span&gt; &lt;span class="k"&gt;CONSTRAINT&lt;/span&gt; &lt;span class="n"&gt;chk_nn&lt;/span&gt;
  &lt;span class="k"&gt;CHECK&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;email_verified&lt;/span&gt; &lt;span class="k"&gt;IS&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;VALID&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Step 4: Validate (reads table but doesn't block writes)&lt;/span&gt;
&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="n"&gt;VALIDATE&lt;/span&gt; &lt;span class="k"&gt;CONSTRAINT&lt;/span&gt; &lt;span class="n"&gt;chk_nn&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Step 5: Now it's safe to enforce&lt;/span&gt;
&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;COLUMN&lt;/span&gt; &lt;span class="n"&gt;email_verified&lt;/span&gt; &lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="k"&gt;DROP&lt;/span&gt; &lt;span class="k"&gt;CONSTRAINT&lt;/span&gt; &lt;span class="n"&gt;chk_nn&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And that &lt;code&gt;CREATE INDEX&lt;/code&gt;? Should have been:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;CONCURRENTLY&lt;/span&gt; &lt;span class="n"&gt;IF&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;EXISTS&lt;/span&gt; &lt;span class="n"&gt;idx_users_email&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;CONCURRENTLY&lt;/code&gt; takes longer but only requires a &lt;code&gt;SHARE UPDATE EXCLUSIVE&lt;/code&gt; lock — it allows reads &lt;em&gt;and&lt;/em&gt; writes to continue.&lt;/p&gt;

&lt;p&gt;Every experienced Postgres DBA knows these patterns. The problem is that migrations are usually written by application developers, who often don't. And code review rarely catches lock modes, because they're not visible in the SQL.&lt;/p&gt;

&lt;h2&gt;
  
  
  pgfence: catch it before your users do
&lt;/h2&gt;

&lt;p&gt;We built &lt;a href="https://pgfence.com" rel="noopener noreferrer"&gt;pgfence&lt;/a&gt; to make these problems visible before they reach production. It's a CLI that analyzes your migration files and reports exactly what each statement locks, what it blocks, and how to fix it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;npx @flvmnt/pgfence analyze migrations/add_verified.sql
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;migrations/add_verified.sql  [HIGH]
Lock: ACCESS EXCLUSIVE | Blocks: reads+writes+DDL

#  Statement                                        Lock Mode         Blocks           Risk
1  ALTER TABLE users ADD COLUMN email_verified ...   ACCESS EXCLUSIVE  reads,writes,DDL HIGH
2  CREATE INDEX idx_users_email ON users(email)      SHARE             writes,DDL       MEDIUM

Safe Rewrite Recipe:
  add-column-not-null-no-default: Add nullable column, backfill, then add NOT NULL constraint

    ALTER TABLE users ADD COLUMN IF NOT EXISTS email_verified boolean;
    -- Backfill out-of-band in batches
    ALTER TABLE users ADD CONSTRAINT chk_nn CHECK (email_verified IS NOT NULL) NOT VALID;
    ALTER TABLE users VALIDATE CONSTRAINT chk_nn;
    ALTER TABLE users ALTER COLUMN email_verified SET NOT NULL;
    ALTER TABLE users DROP CONSTRAINT chk_nn;

Policy Violations:
  ERROR  Missing SET lock_timeout — lock queue death spiral risk
  → Add SET lock_timeout = '2s'; at the start of the migration

Analyzed: 2 statements | Unanalyzable: 0 | Coverage: 100%
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It tells you:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;What lock mode&lt;/strong&gt; each statement takes&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;What it blocks&lt;/strong&gt; (reads, writes, DDL)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The risk level&lt;/strong&gt; (LOW / MEDIUM / HIGH / CRITICAL)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The safe rewrite&lt;/strong&gt; — the exact SQL to use instead&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Policy violations&lt;/strong&gt; — missing &lt;code&gt;lock_timeout&lt;/code&gt;, &lt;code&gt;CONCURRENTLY&lt;/code&gt; inside a transaction, etc.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  How it works
&lt;/h2&gt;

&lt;p&gt;pgfence doesn't use regex to guess at SQL patterns. It uses &lt;a href="https://github.com/pganalyze/libpg_query" rel="noopener noreferrer"&gt;libpg_query&lt;/a&gt;, PostgreSQL's actual parser extracted into a standalone C library and exposed via Node.js bindings. It's the same parser that Postgres itself uses to understand your SQL.&lt;/p&gt;

&lt;p&gt;This means it handles edge cases that regex-based tools miss:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- pgfence correctly identifies this as safe (PG11+ metadata-only):&lt;/span&gt;
&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="k"&gt;ADD&lt;/span&gt; &lt;span class="k"&gt;COLUMN&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="nb"&gt;text&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="s1"&gt;'active'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- And this as dangerous (a volatile default forces a rewrite; now() is STABLE and would not):&lt;/span&gt;
&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="k"&gt;ADD&lt;/span&gt; &lt;span class="k"&gt;COLUMN&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="n"&gt;timestamptz&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="n"&gt;clock_timestamp&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  38 checks across the full DDL surface
&lt;/h3&gt;

&lt;p&gt;Not just &lt;code&gt;ADD COLUMN&lt;/code&gt; and &lt;code&gt;CREATE INDEX&lt;/code&gt;. pgfence covers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Critical&lt;/strong&gt;: &lt;code&gt;DROP TABLE&lt;/code&gt;, &lt;code&gt;TRUNCATE&lt;/code&gt;, &lt;code&gt;REINDEX SCHEMA/DATABASE&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;High&lt;/strong&gt;: &lt;code&gt;ADD FOREIGN KEY&lt;/code&gt; without &lt;code&gt;NOT VALID&lt;/code&gt;, &lt;code&gt;ADD UNIQUE&lt;/code&gt; without concurrent index, &lt;code&gt;VACUUM FULL&lt;/code&gt;, &lt;code&gt;ATTACH PARTITION&lt;/code&gt;, &lt;code&gt;REFRESH MATERIALIZED VIEW&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Medium&lt;/strong&gt;: &lt;code&gt;SET NOT NULL&lt;/code&gt; without pre-validated constraint, &lt;code&gt;ALTER TYPE ADD VALUE&lt;/code&gt; on PG &amp;lt; 12, &lt;code&gt;CREATE/DROP TRIGGER&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Low&lt;/strong&gt;: Safe patterns it recognizes and doesn't flag — &lt;code&gt;ADD COLUMN DEFAULT &amp;lt;constant&amp;gt;&lt;/code&gt; on PG11+, &lt;code&gt;ADD UNIQUE USING INDEX&lt;/code&gt;, &lt;code&gt;DETACH PARTITION CONCURRENTLY&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Works with your ORM
&lt;/h3&gt;

&lt;p&gt;pgfence extracts SQL from ORM migration files:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# TypeORM&lt;/span&gt;
pgfence analyze &lt;span class="nt"&gt;--format&lt;/span&gt; typeorm src/migrations/&lt;span class="k"&gt;*&lt;/span&gt;.ts

&lt;span class="c"&gt;# Knex&lt;/span&gt;
pgfence analyze &lt;span class="nt"&gt;--format&lt;/span&gt; knex migrations/&lt;span class="k"&gt;*&lt;/span&gt;.ts

&lt;span class="c"&gt;# Prisma (analyzes the generated SQL files)&lt;/span&gt;
pgfence analyze prisma/migrations/&lt;span class="k"&gt;**&lt;/span&gt;/migration.sql

&lt;span class="c"&gt;# Plain SQL&lt;/span&gt;
pgfence analyze migrations/&lt;span class="k"&gt;*&lt;/span&gt;.sql
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It handles &lt;code&gt;queryRunner.query()&lt;/code&gt; calls in TypeORM, &lt;code&gt;knex.raw()&lt;/code&gt; and builder chains in Knex, &lt;code&gt;queryInterface&lt;/code&gt; calls in Sequelize, and Drizzle migrations.&lt;/p&gt;

&lt;h3&gt;
  
  
  One line in CI
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# .github/workflows/migration-check.yml&lt;/span&gt;
&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Check migrations&lt;/span&gt;
  &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;npx @flvmnt/pgfence analyze --ci --max-risk medium migrations/*.sql&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Exit code 1 on HIGH risk or above. The build fails before the migration reaches production.&lt;/p&gt;

&lt;h3&gt;
  
  
  Optional: table-size-aware scoring
&lt;/h3&gt;

&lt;p&gt;A &lt;code&gt;CREATE INDEX&lt;/code&gt; on a 500-row lookup table is fine. The same statement on a 50M-row events table is a production incident. pgfence can adjust risk levels based on table sizes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Export table stats from your read replica (you control the connection)&lt;/span&gt;
pgfence extract-stats &lt;span class="nt"&gt;--db-url&lt;/span&gt; postgres://readonly@replica/mydb &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; pgfence-stats.json

&lt;span class="c"&gt;# Analyze with size awareness&lt;/span&gt;
pgfence analyze &lt;span class="nt"&gt;--stats-file&lt;/span&gt; pgfence-stats.json migrations/&lt;span class="k"&gt;*&lt;/span&gt;.sql
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;pgfence never asks for database credentials directly. The stats export runs in your CI environment using your existing secrets.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why not Eugene / Squawk / strong_migrations?
&lt;/h2&gt;

&lt;p&gt;Those are excellent tools. Here's what's different:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;pgfence&lt;/th&gt;
&lt;th&gt;Eugene&lt;/th&gt;
&lt;th&gt;Squawk&lt;/th&gt;
&lt;th&gt;strong_migrations&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Language&lt;/td&gt;
&lt;td&gt;TypeScript&lt;/td&gt;
&lt;td&gt;Rust&lt;/td&gt;
&lt;td&gt;Rust&lt;/td&gt;
&lt;td&gt;Ruby&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ORM extraction&lt;/td&gt;
&lt;td&gt;TypeORM, Knex, Prisma, Sequelize, Drizzle&lt;/td&gt;
&lt;td&gt;SQL only&lt;/td&gt;
&lt;td&gt;SQL only&lt;/td&gt;
&lt;td&gt;Rails only&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Safe rewrite output&lt;/td&gt;
&lt;td&gt;Full SQL recipes&lt;/td&gt;
&lt;td&gt;Warnings only&lt;/td&gt;
&lt;td&gt;Warnings only&lt;/td&gt;
&lt;td&gt;Suggestions&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Table-size scoring&lt;/td&gt;
&lt;td&gt;Yes (via stats snapshot)&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Lock mode mapping&lt;/td&gt;
&lt;td&gt;All 8 PG lock modes&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Partial&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;If you're a Rails shop, strong_migrations is the right choice. If you're in the Node/TypeScript ecosystem with TypeORM or Knex or Prisma — pgfence was built for you.&lt;/p&gt;

&lt;h2&gt;
  
  
  Get started
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-D&lt;/span&gt; @flvmnt/pgfence
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Analyze your existing migrations:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx pgfence analyze migrations/&lt;span class="k"&gt;*&lt;/span&gt;.sql
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You'll probably find at least one statement you didn't know was dangerous.&lt;/p&gt;




&lt;p&gt;&lt;a href="https://github.com/flvmnt/pgfence" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt; · &lt;a href="https://pgfence.com/docs/introduction" rel="noopener noreferrer"&gt;Docs&lt;/a&gt; · &lt;a href="https://www.npmjs.com/package/@flvmnt/pgfence" rel="noopener noreferrer"&gt;npm&lt;/a&gt;&lt;/p&gt;

</description>
      <category>postgres</category>
      <category>database</category>
      <category>devops</category>
      <category>typescript</category>
    </item>
    <item>
      <title>How we stopped ORM migrations from taking down our Postgres database</title>
      <dc:creator>Munteanu Flavius-Ioan</dc:creator>
      <pubDate>Sun, 22 Feb 2026 17:00:39 +0000</pubDate>
      <link>https://dev.to/flvmnt/how-we-stopped-orm-migrations-from-taking-down-our-postgres-database-2925</link>
      <guid>https://dev.to/flvmnt/how-we-stopped-orm-migrations-from-taking-down-our-postgres-database-2925</guid>
      <description>&lt;p&gt;If you've ever run a database migration that applied a seemingly minor change, like &lt;code&gt;ALTER TABLE users ADD COLUMN is_verified BOOLEAN DEFAULT false&lt;/code&gt;, only to watch your API response times spike and connection pools exhaust, you've met the Postgres &lt;code&gt;ACCESS EXCLUSIVE&lt;/code&gt; lock.&lt;/p&gt;

&lt;p&gt;Modern ORMs like TypeORM, Sequelize, Prisma, and Drizzle hide the underlying Postgres locking mechanics. When developers can't see the DDL statements their ORMs are running, they can't optimize them.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Lock Queue Death Spiral
&lt;/h3&gt;

&lt;p&gt;Postgres requires strict locks for schema changes. When a migration runs an &lt;code&gt;ALTER TABLE&lt;/code&gt; command, it normally requires an &lt;code&gt;ACCESS EXCLUSIVE&lt;/code&gt; lock.&lt;br&gt;
If there's currently a 30-second reporting query running on that table, the migration has to wait in line.&lt;/p&gt;

&lt;p&gt;While the migration is waiting, &lt;em&gt;every other incoming production query&lt;/em&gt; behind it is also forced to wait. Suddenly, your app is down.&lt;/p&gt;
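
&lt;p&gt;The queueing rule behind this is simple enough to model. The sketch below is a deliberately reduced simulation with only two lock modes and hypothetical names (&lt;code&gt;LockRequest&lt;/code&gt;, &lt;code&gt;mustWait&lt;/code&gt;). It shows why a harmless &lt;code&gt;SELECT&lt;/code&gt; ends up stuck behind a waiting &lt;code&gt;ALTER TABLE&lt;/code&gt;.&lt;/p&gt;

```typescript
// Simplified model: only two lock modes and a FIFO wait queue.
type QueueMode = "ACCESS SHARE" | "ACCESS EXCLUSIVE";
interface LockRequest { who: string; mode: QueueMode }

// In this reduced model, two locks conflict unless both are ACCESS SHARE.
function conflictsWith(a: QueueMode, b: QueueMode): boolean {
  return a === "ACCESS EXCLUSIVE" || b === "ACCESS EXCLUSIVE";
}

// A new request must wait if it conflicts with any current lock holder
// or with any request already waiting ahead of it.
function mustWait(
  req: LockRequest,
  holders: LockRequest[],
  waiting: LockRequest[]
): boolean {
  return holders.concat(waiting).some((other) => conflictsWith(req.mode, other.mode));
}

const lockHolders: LockRequest[] = [{ who: "30s reporting query", mode: "ACCESS SHARE" }];
const waitQueue: LockRequest[] = [];

// The migration conflicts with the running report, so it joins the queue.
const migrationReq: LockRequest = { who: "ALTER TABLE", mode: "ACCESS EXCLUSIVE" };
if (mustWait(migrationReq, lockHolders, waitQueue)) waitQueue.push(migrationReq);

const loginQuery: LockRequest = { who: "SELECT", mode: "ACCESS SHARE" };
console.log(mustWait(loginQuery, lockHolders, [])); // false: compatible with the report alone
console.log(mustWait(loginQuery, lockHolders, waitQueue)); // true: queued behind the ALTER TABLE
```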
&lt;h3&gt;
  
  
  Introducing &lt;code&gt;pgfence&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;To solve this, I built &lt;a href="https://pgfence.com" rel="noopener noreferrer"&gt;pgfence&lt;/a&gt;, a source-available TypeScript CLI designed specifically for the Node.js ecosystem. It's the first tool in this space built natively for Node. The alternatives (strong_migrations, Eugene, Squawk) are Ruby or Rust, which creates real friction if your stack is TypeScript.&lt;/p&gt;

&lt;p&gt;Unlike other linters, &lt;code&gt;pgfence&lt;/code&gt; parses the Abstract Syntax Trees (ASTs) of your ORM's migration files (&lt;code&gt;.ts&lt;/code&gt; files from TypeORM, Knex, and Sequelize, plus the &lt;code&gt;.sql&lt;/code&gt; files Prisma generates). It statically extracts the SQL and evaluates the risk before you ever merge to main.&lt;/p&gt;
&lt;h4&gt;
  
  
  1. Predicting Lock Modes
&lt;/h4&gt;

&lt;p&gt;&lt;code&gt;pgfence&lt;/code&gt; matches every DDL command to the Postgres lock matrix. It prints out exactly what locks are being grabbed and whether they will block &lt;strong&gt;Reads&lt;/strong&gt;, &lt;strong&gt;Writes&lt;/strong&gt;, or both.&lt;/p&gt;
&lt;h4&gt;
  
  
  2. Enforcing Timeout Policies
&lt;/h4&gt;

&lt;p&gt;If you don't explicitly declare &lt;code&gt;SET lock_timeout = '2s'&lt;/code&gt; inside a migration, a migration stuck waiting for a lock will block every query queued behind it for as long as it waits. &lt;code&gt;pgfence&lt;/code&gt; scans your migrations and fails CI if timeouts are omitted.&lt;/p&gt;
&lt;h4&gt;
  
  
  3. Giving you the fix
&lt;/h4&gt;

&lt;p&gt;&lt;code&gt;pgfence&lt;/code&gt; provides "Safe Rewrite Recipes." If you try to add a column with a volatile default, it gives you the exact 3-step zero-downtime expand/contract migration to use instead.&lt;/p&gt;
&lt;h3&gt;
  
  
  Try it out
&lt;/h3&gt;

&lt;p&gt;You can integrate &lt;code&gt;pgfence&lt;/code&gt; into your GitHub Actions today. It's free and open-source.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx @flvmnt/pgfence analyze migrations/&lt;span class="k"&gt;*&lt;/span&gt;.sql
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Website: &lt;a href="https://pgfence.com" rel="noopener noreferrer"&gt;pgfence.com&lt;/a&gt;&lt;br&gt;
GitHub: &lt;a href="https://github.com/flvmnt/pgfence" rel="noopener noreferrer"&gt;flvmnt/pgfence&lt;/a&gt;&lt;/p&gt;

</description>
      <category>postgres</category>
      <category>database</category>
      <category>devops</category>
      <category>typescript</category>
    </item>
    <item>
      <title>Your NestJS Idempotency Layer is Probably Broken</title>
      <dc:creator>Munteanu Flavius-Ioan</dc:creator>
      <pubDate>Wed, 18 Feb 2026 22:40:46 +0000</pubDate>
      <link>https://dev.to/flvmnt/your-nestjs-idempotency-layer-is-probably-broken-l9o</link>
      <guid>https://dev.to/flvmnt/your-nestjs-idempotency-layer-is-probably-broken-l9o</guid>
      <description>&lt;p&gt;Most idempotency implementations in NestJS apps look the same: hash the key, check Redis, return cached response or proceed. 40 lines, done.&lt;/p&gt;

&lt;p&gt;I'm building a booking platform with NestJS - concurrent mutations against shared resources, money involved. While writing the idempotency layer I kept finding edge cases that the simple version doesn't handle. There are at least five, and most of them are exploitable.&lt;/p&gt;




&lt;h2&gt;
  
  
  The version everyone ships
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="p"&gt;@&lt;/span&gt;&lt;span class="nd"&gt;Injectable&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;IdempotencyInterceptor&lt;/span&gt; &lt;span class="k"&gt;implements&lt;/span&gt; &lt;span class="nx"&gt;NestInterceptor&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nf"&gt;constructor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;readonly&lt;/span&gt; &lt;span class="nx"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Redis&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;

  &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="nf"&gt;intercept&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;ExecutionContext&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;next&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;CallHandler&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;switchToHttp&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;getRequest&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;idempotency-key&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;next&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;handle&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;cached&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`idem:&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;cached&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;cached&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;

    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;lastValueFrom&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;next&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;handle&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`idem:&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;EX&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;86400&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It has at least five holes.&lt;/p&gt;




&lt;h2&gt;
  
  
  Keys aren't scoped
&lt;/h2&gt;

&lt;p&gt;The raw idempotency key goes straight into Redis. So:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;User A sends &lt;code&gt;Idempotency-Key: abc123&lt;/code&gt; to &lt;code&gt;POST /orders&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;User B sends &lt;code&gt;Idempotency-Key: abc123&lt;/code&gt; to &lt;code&gt;POST /payments&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;User B gets User A's cached order response&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Unlikely? Sure. But "unlikely" in security means "will happen at scale, probably during an incident when you're already stressed."&lt;/p&gt;

&lt;p&gt;Keys need to be scoped to the actor and the endpoint:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;buildRedisKey&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;actorId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;method&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;idempotencyKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;keyHash&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;createHash&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;sha256&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;update&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;idempotencyKey&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;digest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;hex&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s2"&gt;`idempotency:&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;actorId&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;:&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;method&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;:&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;path&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;:&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;keyHash&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The idempotency key gets hashed before going into the Redis key. Not for secrecy - it prevents injection of Redis key separators (&lt;code&gt;:&lt;/code&gt;) in user-supplied values and normalizes key length.&lt;/p&gt;
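&lt;p&gt;A quick way to see the scoping at work is to build keys for the scenario above. This is a minimal sketch: the helper is repeated so the snippet runs on its own, and the actor IDs and routes are made up for illustration.&lt;/p&gt;

```typescript
import { createHash } from 'node:crypto';

// Same helper as above, repeated so the snippet is self-contained.
function buildRedisKey(actorId: string, method: string, path: string, idempotencyKey: string): string {
  const keyHash = createHash('sha256').update(idempotencyKey).digest('hex');
  return `idempotency:${actorId}:${method}:${path}:${keyHash}`;
}

// Hypothetical actors and routes, for illustration only.
const orderKey = buildRedisKey('user-a', 'POST', '/orders', 'abc123');
const paymentKey = buildRedisKey('user-b', 'POST', '/payments', 'abc123');
const retryKey = buildRedisKey('user-a', 'POST', '/orders', 'abc123');

console.log(orderKey === paymentKey); // false: different actor and endpoint
console.log(orderKey === retryKey);   // true: a genuine retry maps to the same entry

// A client-supplied key containing ':' cannot add extra segments,
// because only its hex digest ends up in the Redis key.
const hostile = buildRedisKey('user-a', 'POST', '/orders', 'x:user-b:POST:/payments');
console.log(hostile.split(':').length); // 5
```

&lt;p&gt;Note that only the client-supplied key is hashed here. The sketch assumes &lt;code&gt;actorId&lt;/code&gt;, &lt;code&gt;method&lt;/code&gt; and &lt;code&gt;path&lt;/code&gt; come from trusted server-side values that contain no colons.&lt;/p&gt;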




&lt;h2&gt;
  
  
  Payload switching
&lt;/h2&gt;

&lt;p&gt;User sends &lt;code&gt;Idempotency-Key: order-1&lt;/code&gt; with &lt;code&gt;{ item: "book", quantity: 1 }&lt;/code&gt;. Gets a 201. Sends the same key with &lt;code&gt;{ item: "laptop", quantity: 100 }&lt;/code&gt;. The naive version replays the cached 201. The user sees "order confirmed" for a laptop that was never actually ordered.&lt;/p&gt;

&lt;p&gt;The scarier version: an attacker probes with a throwaway payload, waits for the cache to expire, then replays the key with an expensive one.&lt;/p&gt;

&lt;p&gt;Hash the body, reject mismatches.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
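&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Sort keys at every nesting level so equivalent bodies serialize identically.&lt;/span&gt;
&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;canonicalize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;unknown&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nx"&gt;unknown&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;Array&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;isArray&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;canonicalize&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;value&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="k"&gt;typeof&lt;/span&gt; &lt;span class="nx"&gt;value&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;object&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;obj&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;value&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt; &lt;span class="nx"&gt;unknown&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;Object&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fromEntries&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;Object&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;keys&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;sort&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;k&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;k&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;canonicalize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;k&lt;/span&gt;&lt;span class="p"&gt;])]));&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;hashBody&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;unknown&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// A JSON.stringify array replacer is an allowlist applied at every depth, so it would drop nested keys.&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;stable&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;canonicalize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;createHash&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;sha256&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;update&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;stable&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;digest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;hex&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;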

&lt;span class="c1"&gt;// On cache hit:&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;cached&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;redisKey&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;cached&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;bodyHash&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="nf"&gt;hashBody&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// Extend TTL - don't let them wait it out&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;expire&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;redisKey&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;86400&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ConflictException&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;code&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;IDEMPOTENCY_CONFLICT&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;message&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;This idempotency key was already used with a different request body.&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The TTL extension on conflict matters. Without it, the attacker just waits for the entry to expire and tries again.&lt;/p&gt;
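&lt;p&gt;Put together, a lookup now has three possible outcomes instead of two. The sketch below uses an in-memory &lt;code&gt;Map&lt;/code&gt; as a stand-in for Redis and placeholder strings instead of real hashes. The &lt;code&gt;CachedEntry&lt;/code&gt; shape and the &lt;code&gt;classify&lt;/code&gt; name are assumptions made for illustration.&lt;/p&gt;

```typescript
// Hypothetical shape of a cached entry; only bodyHash matters for this check.
type CachedEntry = { statusCode: number; bodyHash: string };
type Outcome = 'miss' | 'replay' | 'conflict';

// In-memory stand-in for Redis so the sketch runs without a server.
const firstOrder: CachedEntry = { statusCode: 201, bodyHash: 'hash-of-book-order' };
const store = new Map([['order-1', firstOrder]]);

function classify(redisKey: string, bodyHash: string): Outcome {
  const cached = store.get(redisKey);
  if (cached === undefined) return 'miss';                      // first use: run the handler
  return cached.bodyHash === bodyHash ? 'replay' : 'conflict';  // same body replays, different body is rejected
}

console.log(classify('order-1', 'hash-of-book-order'));   // replay
console.log(classify('order-1', 'hash-of-laptop-order')); // conflict
console.log(classify('order-2', 'hash-of-book-order'));   // miss
```

&lt;p&gt;In the interceptor, &lt;code&gt;conflict&lt;/code&gt; would map to the &lt;code&gt;ConflictException&lt;/code&gt; shown earlier.&lt;/p&gt;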




&lt;h2&gt;
  
  
  The concurrent first-use race
&lt;/h2&gt;

&lt;p&gt;Two identical requests arrive within the same millisecond. Both check Redis. Both miss. Both execute the mutation. Double booking.&lt;/p&gt;

&lt;p&gt;Your database constraints might save you (unique indexes, optimistic locking), but "might" isn't a word I want near payment processing.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;lockKey&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;redisKey&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;:lock`&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;acquired&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;lockKey&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;1&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;EX&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;NX&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;acquired&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;cached&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;redisKey&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;cached&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;replay&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;cached&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="c1"&gt;// First request still running or crashed. Let this one through -&lt;/span&gt;
  &lt;span class="c1"&gt;// the DB-level constraints are the final safety net.&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;SET NX&lt;/code&gt; is atomic. Only one request wins. The loser waits 100ms, checks if the winner cached a result, and either replays it or proceeds (because the winner might have crashed, and the 5-second lock TTL ensures we don't deadlock).&lt;/p&gt;

&lt;p&gt;This is defense in depth: the idempotency layer catches the common case, and the database catches the rest. Neither alone is enough.&lt;/p&gt;




&lt;h2&gt;
  
  
  Not all errors should be cached
&lt;/h2&gt;

&lt;p&gt;The naive version caches every response for 24 hours. Including 503s, 429s, and 408s.&lt;/p&gt;

&lt;p&gt;So a dependency has a brief hiccup, your server returns a 503 to one request, and now every retry of that idempotency key for the next 24 hours replays "Service Unavailable." The user literally cannot complete their action.&lt;/p&gt;

&lt;p&gt;Different status codes need different treatment:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;getTtlForStatus&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;statusCode&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;statusCode&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;statusCode&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;300&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;14400&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// 4h - it worked, cache it&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;statusCode&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;400&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;statusCode&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;statusCode&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="mi"&gt;409&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;          &lt;span class="c1"&gt;// Brief dampener, not a wall&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="mi"&gt;429&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;408&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;423&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;statusCode&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// Retryable, don't cache&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;14400&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;                              &lt;span class="c1"&gt;// 400, 422 etc - retrying won't fix bad input&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;statusCode&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;statusCode&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="mi"&gt;503&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;       &lt;span class="c1"&gt;// "Try again" means let them try again&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;                                 &lt;span class="c1"&gt;// Brief thundering-herd protection&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The 409 with a 2-second TTL is worth explaining: when two requests race and the loser gets a DB conflict, you want a tiny dampener to absorb the immediate retry storm, but nothing longer. The client needs to be able to recover.&lt;/p&gt;




&lt;h2&gt;
  
  
  Success preservation
&lt;/h2&gt;

&lt;p&gt;This one is subtle.&lt;/p&gt;

&lt;p&gt;Request A and Request B arrive simultaneously, same key, same body. A gets the lock, runs the mutation, gets 201, starts writing to Redis. B loses the lock race, finds nothing cached after its 100ms wait, runs anyway, gets a 409 from the database (unique constraint), and writes its result to Redis.&lt;/p&gt;

&lt;p&gt;If B's write lands after A's, the cache now holds a 409 for a mutation that succeeded. Every future replay tells the client their operation failed. They retry with a new idempotency key, and now you have a duplicate.&lt;/p&gt;

&lt;p&gt;The rule is simple: never let an error overwrite a success.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;cacheResponse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;redisKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;statusCode&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;unknown&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;bodyHash&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;ttl&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;getTtlForStatus&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;statusCode&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;ttl&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;isSuccess&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;statusCode&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;statusCode&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;300&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;isSuccess&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;existing&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;redisKey&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;existing&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;parsed&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;existing&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;parsed&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;statusCode&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;parsed&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;statusCode&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;300&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// A 201 is already cached. Don't touch it.&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;redisKey&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;statusCode&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;bodyHash&lt;/span&gt; &lt;span class="p"&gt;}),&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;EX&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;ttl&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
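&lt;p&gt;The overwrite rule can also be written as a pure decision function, which makes it easy to unit-test apart from Redis. This is a sketch: &lt;code&gt;mayOverwrite&lt;/code&gt; and &lt;code&gt;CachedStatus&lt;/code&gt; are hypothetical names, and the function covers only the overwrite decision, not the TTL lookup.&lt;/p&gt;

```typescript
// Hypothetical minimal view of what is already cached for a key (null = nothing cached).
type CachedStatus = { statusCode: number } | null;

const isSuccess = (statusCode: number): boolean => Math.floor(statusCode / 100) === 2;

// Returns true when a response with incomingStatus may replace the existing entry.
function mayOverwrite(existing: CachedStatus, incomingStatus: number): boolean {
  if (existing === null) return true;         // nothing cached yet
  if (isSuccess(incomingStatus)) return true; // a success may always be written
  return !isSuccess(existing.statusCode);     // an error only replaces another error
}

console.log(mayOverwrite(null, 409));                // true
console.log(mayOverwrite({ statusCode: 201 }, 409)); // false: B's 409 must not clobber A's 201
console.log(mayOverwrite({ statusCode: 409 }, 201)); // true: A's 201 replaces B's earlier 409
console.log(mayOverwrite({ statusCode: 500 }, 409)); // true
```

&lt;p&gt;The check and the write are still two separate Redis calls, so a very tight interleaving can slip through. Closing that window completely would mean making both steps atomic, for example inside a Lua script.&lt;/p&gt;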






&lt;h2&gt;
  
  
  Cardinality attacks
&lt;/h2&gt;

&lt;p&gt;Someone writes a script that sends a million requests with unique idempotency keys. The mutations all fail validation, but each one creates a Redis cache entry that sits there for 4 hours. Your Redis instance fills up.&lt;/p&gt;

&lt;p&gt;Per-actor quota, bucketed by time window:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;checkQuota&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;actorId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;boolean&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;bucket&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;floor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;60000&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;quotaKey&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;`idem-quota:&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;actorId&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;:&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;bucket&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;incr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;quotaKey&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;count&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;expire&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;quotaKey&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;120&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// 2x window for clock skew&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;count&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The counter uses INCR followed by EXPIRE rather than SETEX because INCR atomically creates the key if it doesn't exist and increments it, whereas SETEX would overwrite the counter on every call. EXPIRE runs only on the first increment, and the 120-second TTL (double the 60-second window) ensures the key outlives its bucket.&lt;/p&gt;

&lt;p&gt;30 unique keys per minute per user is generous for legitimate use and devastating for an attacker trying to fill your Redis.&lt;/p&gt;
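&lt;p&gt;The bucketing arithmetic is easier to follow with the clock passed in explicitly. The sketch below pulls the key derivation out into a pure function. &lt;code&gt;quotaKeyFor&lt;/code&gt; and &lt;code&gt;WINDOW_MS&lt;/code&gt; are hypothetical names, and the timestamp is an arbitrary example.&lt;/p&gt;

```typescript
const WINDOW_MS = 60_000; // same 60-second window as checkQuota

// Pure version of the key derivation, with the clock passed in so it can be tested.
function quotaKeyFor(actorId: string, nowMs: number): string {
  const bucket = Math.floor(nowMs / WINDOW_MS);
  return `idem-quota:${actorId}:${bucket}`;
}

const t0 = 1_700_000_000_000; // arbitrary timestamp, 20 seconds into its one-minute bucket
const sameWindow = quotaKeyFor('user-a', t0) === quotaKeyFor('user-a', t0 + 30_000);
const nextWindow = quotaKeyFor('user-a', t0) === quotaKeyFor('user-a', t0 + 45_000);

console.log(sameWindow); // true: both requests count against the same bucket
console.log(nextWindow); // false: the later request starts a fresh counter
```

&lt;p&gt;Because the windows are fixed rather than sliding, a burst that straddles a bucket boundary can briefly reach twice the per-minute limit. For capping Redis growth that is usually an acceptable trade for the simpler counter.&lt;/p&gt;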




&lt;h2&gt;
  
  
  A few more things worth getting right
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Skip unauthenticated requests.&lt;/strong&gt; Without an actor ID you can't scope keys. An "anonymous" bucket keyed by IP sounds reasonable until you remember that anyone behind the same NAT shares an IP - one user's cached response replays for another. Just skip idempotency for unauthenticated endpoints.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Hash multipart uploads carefully.&lt;/strong&gt; A 50MB file upload with the same idempotency key is almost certainly a retry. But hashing 50MB on every request is expensive. Hash only the non-file fields.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Rate limiter integration.&lt;/strong&gt; If a request is a cache hit (idempotent replay), don't count it against rate limits. The user isn't making a new request. They're recovering from a dropped connection. Penalizing retries is punishing correct client behavior.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Redis going down.&lt;/strong&gt; If your idempotency layer throws 500s when Redis is unavailable, a Redis blip takes down the entire API. Skip idempotency and let requests through instead. You lose duplicate protection for a few seconds. You don't lose your entire service.&lt;/p&gt;

&lt;p&gt;The full implementation is around 500 lines. Most of it is boring, defensive code - which is probably why so many production systems are still running the 40-line version.&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>nestjs</category>
      <category>security</category>
      <category>typescript</category>
    </item>
  </channel>
</rss>
