<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Chaitanya Srivastav</title>
    <description>The latest articles on DEV Community by Chaitanya Srivastav (@chaitanya_srivastav_9bd5a).</description>
    <link>https://dev.to/chaitanya_srivastav_9bd5a</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3153068%2F73e2580a-0ad7-478f-ae73-2ba0ba92e761.png</url>
      <title>DEV Community: Chaitanya Srivastav</title>
      <link>https://dev.to/chaitanya_srivastav_9bd5a</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/chaitanya_srivastav_9bd5a"/>
    <language>en</language>
    <item>
      <title>Why your idempotency implementation is probably broken under concurrent load</title>
      <dc:creator>Chaitanya Srivastav</dc:creator>
      <pubDate>Thu, 02 Apr 2026 05:03:53 +0000</pubDate>
      <link>https://dev.to/chaitanya_srivastav_9bd5a/why-your-idempotency-implementation-is-probably-broken-under-concurrent-load-5b22</link>
      <guid>https://dev.to/chaitanya_srivastav_9bd5a/why-your-idempotency-implementation-is-probably-broken-under-concurrent-load-5b22</guid>
      <description>&lt;p&gt;Most idempotency implementations are quietly broken under concurrent load.&lt;/p&gt;

&lt;p&gt;They work fine in testing. They even work in staging.&lt;br&gt;
Here's why they break, and what actually needs to happen.&lt;/p&gt;
&lt;h2&gt;
  
  
  The problem
&lt;/h2&gt;

&lt;p&gt;Network requests fail and get retried. Your payment client times out and fires again. A webhook gets delivered twice. Your order creation endpoint runs twice and charges the customer twice.&lt;/p&gt;

&lt;p&gt;The naive fix looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/orders&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;next&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;idempotency-key&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;next&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;existing&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;existing&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;existing&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;createOrder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This looks correct. It isn't.&lt;/p&gt;

&lt;h2&gt;
  
  
  The race condition
&lt;/h2&gt;

&lt;p&gt;Under concurrent load, two retries with the same key arrive simultaneously:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Request A: store.get(key) → null   (key doesn't exist yet)
Request B: store.get(key) → null   (key doesn't exist yet)
Request A: createOrder()           (runs)
Request B: createOrder()           (also runs: double charge)
Request A: store.set(key, result)
Request B: store.set(key, result)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Both requests pass the &lt;code&gt;get()&lt;/code&gt; check before either has written to the store. Both execute the handler. Both charge the customer.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;store.get()&lt;/code&gt; is atomic: no caller sees it half-done. But atomicity of a single operation doesn't prevent race conditions &lt;em&gt;between&lt;/em&gt; operations. The race lives in the gap between &lt;code&gt;get()&lt;/code&gt; returning null and &lt;code&gt;set()&lt;/code&gt; being called.&lt;/p&gt;

&lt;h2&gt;
  
  
  The fix: collapse check-and-set into one atomic operation
&lt;/h2&gt;

&lt;p&gt;Redis gives you this for free:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SET key value NX EX ttlSeconds
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;NX&lt;/code&gt; means "only set if the key doesn't exist." &lt;br&gt;
&lt;code&gt;EX&lt;/code&gt; sets expiry. &lt;br&gt;
Most critically, this is a single atomic command. There is no gap. Only one caller wins, across any number of concurrent processes or server instances that talk to the same Redis.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
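&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// node-redis v4 option syntax; with ioredis, pass 'EX', ttlSeconds, 'NX' as positional arguments instead.&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;acquire&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;ttlSeconds&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;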
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;processing&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;}),&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;NX&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;EX&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;ttlSeconds&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;OK&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;  &lt;span class="c1"&gt;// 'OK' = won, null = lost&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now only one request executes the handler. For every other concurrent duplicate, &lt;code&gt;acquire()&lt;/code&gt; returns false and the client gets a &lt;code&gt;409&lt;/code&gt; in-progress response with a &lt;code&gt;Retry-After&lt;/code&gt; header.&lt;/p&gt;

&lt;h2&gt;
  
  
  What happens if the process crashes between acquire and set?
&lt;/h2&gt;

&lt;p&gt;Without an expiry, the lock would stay stuck as "processing." Every retry would see the processing lock and get a 409, but the handler that acquired the lock is dead, so no retry could ever succeed.&lt;/p&gt;

&lt;p&gt;This is why the lock has a TTL (&lt;code&gt;processingTtl&lt;/code&gt;). If the process crashes, the lock auto-expires once that TTL elapses (30 seconds in this example) and the next retry re-acquires cleanly. Set &lt;code&gt;processingTtl&lt;/code&gt; higher than your p99 handler latency so the lock doesn't expire on a slow-but-alive request, which would let a retry run the handler a second time.&lt;/p&gt;
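
&lt;p&gt;The expiry behaviour can be sketched with a small in-memory lock. This is a simulation, not Redis: the &lt;code&gt;ExpiringLocks&lt;/code&gt; class and its injectable clock are hypothetical, chosen so the crash scenario can be replayed without waiting 30 real seconds.&lt;/p&gt;

```javascript
// Minimal in-memory lock with expiry. `now` is injectable so the demo
// can move time forward by hand.
class ExpiringLocks {
  constructor(now = () => Date.now()) {
    this.now = now;
    this.entries = new Map(); // key -> expiry timestamp in milliseconds
  }

  // Mirrors SET key value NX EX ttlSeconds: succeeds only if no live lock exists.
  acquire(key, ttlSeconds) {
    const expiresAt = this.entries.get(key);
    if (expiresAt !== undefined) {
      if (expiresAt > this.now()) return false; // lock is still held
    }
    this.entries.set(key, this.now() + ttlSeconds * 1000);
    return true;
  }
}

function demoCrashRecovery() {
  let clock = 0;
  const locks = new ExpiringLocks(() => clock);
  const processingTtl = 30; // seconds

  const first = locks.acquire('abc123', processingTtl); // handler starts, then "crashes"
  clock = 10000;
  const duringTtl = locks.acquire('abc123', processingTtl); // retry at 10 s
  clock = 31000;
  const afterTtl = locks.acquire('abc123', processingTtl); // retry at 31 s
  return { first, duringTtl, afterTtl };
}

console.log(demoCrashRecovery());
```

&lt;p&gt;The retry at 10 seconds is refused, and the retry at 31 seconds takes over the abandoned key.&lt;/p&gt;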

&lt;h2&gt;
  
  
  The key reuse problem
&lt;/h2&gt;

&lt;p&gt;There's another failure mode: a client reuses the same idempotency key for a different request.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;POST /orders  Idempotency-Key: abc123  body: &lt;span class="o"&gt;{&lt;/span&gt; item: &lt;span class="s2"&gt;"keyboard"&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;  → 201
POST /orders  Idempotency-Key: abc123  body: &lt;span class="o"&gt;{&lt;/span&gt; item: &lt;span class="s2"&gt;"mouse"&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;     → ?
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Without validation, the second request silently returns the cached keyboard order. The client thinks they ordered a mouse. This should be rejected.&lt;/p&gt;

&lt;p&gt;The fix is &lt;em&gt;fingerprinting&lt;/em&gt;: hash the parts of the request that define it (method, path, and body, for example) and store that hash alongside the response. On every duplicate, compare the incoming fingerprint to the stored one. A mismatch means key reuse:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;POST /orders &lt;span class="o"&gt;{&lt;/span&gt; item: &lt;span class="s2"&gt;"mouse"&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt; with key abc123
→ 422 idempotency_key_mismatch: This key was used with a different request. Use a new key.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
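
&lt;p&gt;One way to compute such a fingerprint, as a minimal sketch: SHA-256 over method, path, and body using Node's built-in &lt;code&gt;crypto&lt;/code&gt; module. The &lt;code&gt;fingerprint&lt;/code&gt; and &lt;code&gt;checkReuse&lt;/code&gt; helpers are hypothetical names, and the sketch assumes &lt;code&gt;JSON.stringify&lt;/code&gt; output is stable for the bodies being compared; a real implementation may want to canonicalize key order first.&lt;/p&gt;

```javascript
const { createHash } = require('node:crypto');

// Hash method + path + body into a fixed-length hex fingerprint.
// Assumption: clients serialize the same body with the same key order.
function fingerprint({ method, path, body }) {
  const payload = JSON.stringify([method.toUpperCase(), path, body ?? null]);
  return createHash('sha256').update(payload).digest('hex');
}

// Compare an incoming request against the fingerprint stored with the key.
function checkReuse(storedFingerprint, request) {
  if (fingerprint(request) === storedFingerprint) return { ok: true };
  return { ok: false, status: 422, error: 'idempotency_key_mismatch' };
}

function demoFingerprint() {
  const keyboard = { method: 'POST', path: '/orders', body: { item: 'keyboard' } };
  const mouse = { method: 'POST', path: '/orders', body: { item: 'mouse' } };
  const stored = fingerprint(keyboard); // saved when the first request completed
  return {
    stored,
    retry: checkReuse(stored, keyboard), // genuine retry of the same request
    reuse: checkReuse(stored, mouse), // same key, different request
  };
}

console.log(demoFingerprint());
```

&lt;p&gt;A genuine retry matches the stored fingerprint, while the mouse order sent under the keyboard order's key is rejected.&lt;/p&gt;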



&lt;h2&gt;
  
  
  Putting it together
&lt;/h2&gt;

&lt;p&gt;Here's the full execution flow of a correct idempotency implementation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;acquire(key) 
  ├── true  → execute handler → set(completed response)
  │                          └── if handler throws → release(key)
  └── false → get(key)
                ├── completed  → validate fingerprint → serve cached response (422 on mismatch)
                └── processing → 409 + Retry-After header
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
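
&lt;p&gt;The flow above can be sketched end to end against an in-memory store. This is an illustration under assumptions, not the reliability-kit source: &lt;code&gt;MemoryStore&lt;/code&gt;, &lt;code&gt;runIdempotent&lt;/code&gt;, the plain-string fingerprints, and the response shapes are all hypothetical, and lock expiry is left out to keep the sketch short.&lt;/p&gt;

```javascript
// In-memory store exposing the operations the flow needs (illustrative only).
class MemoryStore {
  constructor() {
    this.records = new Map();
  }
  acquire(key) {
    if (this.records.has(key)) return false;
    this.records.set(key, { status: 'processing' });
    return true;
  }
  get(key) {
    return this.records.get(key);
  }
  set(key, fingerprint, response) {
    this.records.set(key, { status: 'completed', fingerprint, response });
  }
  release(key) {
    this.records.delete(key);
  }
}

async function runIdempotent(store, key, fingerprint, handler) {
  if (store.acquire(key)) {
    try {
      const response = await handler();
      store.set(key, fingerprint, response);
      return { status: 201, body: response };
    } catch (err) {
      store.release(key); // let the next retry start fresh
      throw err;
    }
  }
  const record = store.get(key);
  // A missing record means the lock was released after we lost the race.
  const inProgress = record === undefined ? true : record.status === 'processing';
  if (inProgress) return { status: 409, retryAfter: 1 };
  if (record.fingerprint !== fingerprint) {
    return { status: 422, error: 'idempotency_key_mismatch' };
  }
  return { status: 201, body: record.response, replayed: true };
}

async function demoFlow() {
  const store = new MemoryStore();
  let calls = 0;
  const handler = async () => {
    calls += 1;
    return { orderId: 1 };
  };
  const [first, concurrent] = await Promise.all([
    runIdempotent(store, 'abc123', 'fp-keyboard', handler),
    runIdempotent(store, 'abc123', 'fp-keyboard', handler),
  ]);
  const replay = await runIdempotent(store, 'abc123', 'fp-keyboard', handler);
  const reuse = await runIdempotent(store, 'abc123', 'fp-mouse', handler);
  return { calls, statuses: [first.status, concurrent.status, replay.status, reuse.status] };
}

demoFlow().then((out) => console.log(out));
```

&lt;p&gt;The handler runs once. The concurrent duplicate gets a 409, the later retry is served from the stored response, and the request that reuses the key with a different fingerprint gets a 422.&lt;/p&gt;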



&lt;p&gt;This is what &lt;a href="https://github.com/chaitanyasrivastav/reliability-kit" rel="noopener noreferrer"&gt;reliability-kit&lt;/a&gt; implements — a production-grade idempotency middleware for &lt;a href="https://www.npmjs.com/package/@reliability-tools/express" rel="noopener noreferrer"&gt;Express&lt;/a&gt; and &lt;a href="https://www.npmjs.com/package/@reliability-tools/fastify" rel="noopener noreferrer"&gt;Fastify&lt;/a&gt;, with pluggable store backends so you can use Redis, Postgres, DynamoDB, or any backend that supports a conditional write.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; @reliability-tools/express
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;reliability&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;RedisStore&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@reliability-tools/express&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
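&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;Redis&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ioredis&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;express&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;express&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;app&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;express&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
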
&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;use&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;reliability&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;idempotency&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;enabled&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;store&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;RedisStore&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Redis&lt;/span&gt;&lt;span class="p"&gt;()),&lt;/span&gt;
    &lt;span class="na"&gt;ttl&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;86400&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;fingerprintStrategy&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;full&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;// validates method + path + body&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;}))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The concurrent duplicate race, the stuck lock after a crash, the key reuse fingerprint check — all handled. The handler just handles the request.&lt;/p&gt;

</description>
      <category>node</category>
      <category>api</category>
      <category>backend</category>
      <category>distributedsystems</category>
    </item>
  </channel>
</rss>
