<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Kareem</title>
    <description>The latest articles on DEV Community by Kareem (@kareem411).</description>
    <link>https://dev.to/kareem411</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3953842%2F3e23bbc0-3c35-4dde-9487-fddaa87821df.jpeg</url>
      <title>DEV Community: Kareem</title>
      <link>https://dev.to/kareem411</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/kareem411"/>
    <language>en</language>
    <item>
      <title>How We Built a 3-Tier Node.js Cache That Hits 2.8M ops/sec (And Never Drops a Key on Eviction)</title>
      <dc:creator>Kareem</dc:creator>
      <pubDate>Wed, 27 May 2026 07:56:29 +0000</pubDate>
      <link>https://dev.to/kareem411/how-we-built-a-3-tier-nodejs-cache-that-hits-28m-opssec-and-never-drops-a-key-on-eviction-1bkb</link>
      <guid>https://dev.to/kareem411/how-we-built-a-3-tier-nodejs-cache-that-hits-28m-opssec-and-never-drops-a-key-on-eviction-1bkb</guid>
      <description>&lt;p&gt;Caching in Node.js usually forces a frustrating compromise. You either stick with a basic in-memory &lt;code&gt;Map&lt;/code&gt; (or &lt;code&gt;lru-cache&lt;/code&gt;) that risks running your process out of memory (OOM) under heavy load, or you accept the network latency floor and serialization overhead of a standalone Redis/Valkey instance.&lt;/p&gt;

&lt;p&gt;When building high-throughput systems, neither option feels entirely right. A localhost Redis round-trip is fast, but it is still bounded by the network stack and by serialization overhead.&lt;/p&gt;

&lt;p&gt;We built &lt;strong&gt;&lt;code&gt;tricache&lt;/code&gt;&lt;/strong&gt; to eliminate this compromise completely. It’s an open-source, three-tier caching engine for Node.js designed to maximize raw single-threaded throughput while implementing protective guardrails like automated disk spilling, thundering-herd prevention, and an integrated WebAssembly Bloom filter.&lt;/p&gt;

&lt;p&gt;The result? &lt;strong&gt;2.81 million read operations per second&lt;/strong&gt; on hot L1 hits from a single thread, over 100× faster than a local Redis round-trip, without the risk of unbounded RAM growth.&lt;/p&gt;

&lt;p&gt;Here is a look under the hood at how it works and the architectural decisions that made it happen.&lt;/p&gt;




&lt;h2&gt;
  
  
  The 3-Tier Architecture
&lt;/h2&gt;

&lt;p&gt;To achieve high hit rates without risking OOM crashes, &lt;code&gt;tricache&lt;/code&gt; splits data management into three distinct layers:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;L1 (Smart Memory Cache):&lt;/strong&gt; A blazing-fast, process-level memory layer built on top of a highly optimized V8 &lt;code&gt;Map&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;L1.5 (Local Disk Spill):&lt;/strong&gt; When the L1 memory cap (&lt;code&gt;l1MaxBytes&lt;/code&gt;) or entry cap is reached, evicted entries aren't simply deleted. They spill over to a local, high-speed NVMe disk directory using optimized &lt;code&gt;msgpackr&lt;/code&gt; binary serialization.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;L2 (Distributed Backplane):&lt;/strong&gt; The shared remote tier (Redis or Valkey) used for multi-instance distributed sync.&lt;/li&gt;
&lt;/ol&gt;
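
&lt;p&gt;The spill behavior can be sketched with plain &lt;code&gt;Map&lt;/code&gt;s standing in for the tiers. This is a minimal illustration, not &lt;code&gt;tricache&lt;/code&gt;'s implementation: the names &lt;code&gt;put&lt;/code&gt;, &lt;code&gt;read&lt;/code&gt;, and &lt;code&gt;L1_MAX_ENTRIES&lt;/code&gt; are hypothetical, eviction is simplified to oldest-first, and promoting a spilled entry back into L1 on read is an assumption.&lt;/p&gt;

```typescript
// Stand-in tiers: plain Maps replace the real memory and disk layers.
const l1 = new Map();      // stands in for the L1 memory cache
const spill = new Map();   // stands in for the L1.5 disk directory
const L1_MAX_ENTRIES = 2;  // hypothetical tiny cap so the spill is easy to see

function put(key: string, value: unknown): void {
  l1.delete(key); // re-inserting moves the key to the newest position
  l1.set(key, value);
  if (l1.size > L1_MAX_ENTRIES) {
    // A Map iterates in insertion order, so the first key is the oldest one.
    const oldest = l1.keys().next().value;
    spill.set(oldest, l1.get(oldest)); // spill instead of dropping
    l1.delete(oldest);
  }
}

function read(key: string): unknown {
  if (l1.has(key)) return l1.get(key);
  if (spill.has(key)) {
    const value = spill.get(key);
    spill.delete(key);
    put(key, value); // assumption: promote the entry back into L1
    return value;
  }
  return undefined; // a real miss would fall through to L2 or the fetch function
}
```

&lt;p&gt;With a cap of two entries, writing three keys moves the oldest one into the spill store, and a later read for that key still succeeds.&lt;/p&gt;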




&lt;h2&gt;
  
  
  Core Technical Features &amp;amp; Innovations
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Advanced Hybrid Eviction via Count-Min Sketch
&lt;/h3&gt;

&lt;p&gt;Standard LRU (Least Recently Used) caching fails catastrophically during sudden scan floods or data bursts, where long-resident, high-value keys get wiped out by single-access keys.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;tricache&lt;/code&gt; implements a hybrid eviction algorithm that scores keys based on a combination of &lt;strong&gt;Priority Score + Access Frequency + Remaining TTL&lt;/strong&gt;. To track access frequency without introducing a massive memory footprint, we integrated a &lt;strong&gt;Count-Min Sketch&lt;/strong&gt; using a tiny 4 KB &lt;code&gt;Uint16Array&lt;/code&gt; (4 rows × 512 columns).&lt;/p&gt;

&lt;p&gt;Every time a key is read or set, the sketch records the access. When L1 needs to evict data, it uses reservoir sampling to pick an eviction candidate in a single O(1) pass. In our benchmark flood tests, this kept a &lt;strong&gt;78% survival rate&lt;/strong&gt; for high-frequency keys.&lt;/p&gt;
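
&lt;p&gt;A Count-Min Sketch of that shape is small enough to show in full. The following is an illustrative sketch, not &lt;code&gt;tricache&lt;/code&gt;'s source: the seeded FNV-1a-style hash, the seed values, and the function names &lt;code&gt;record&lt;/code&gt; and &lt;code&gt;estimate&lt;/code&gt; are assumptions. Counters saturate at 65,535 rather than wrapping around.&lt;/p&gt;

```typescript
// Illustrative Count-Min Sketch; the hash and all names are assumptions.
const ROWS = 4;
const COLS = 512;
const SEEDS = [0x9e3779b1, 0x85ebca6b, 0xc2b2ae35, 0x27d4eb2f]; // one seed per row
const table = new Uint16Array(ROWS * COLS); // 4 * 512 * 2 bytes = 4096 bytes

// Seeded FNV-1a-style string hash, returned as an unsigned 32-bit integer.
function hash(key: string, seed: number): number {
  let h = (0x811c9dc5 ^ seed) >>> 0;
  // Walk the UTF-16 code units back to front; any fixed order works for a hash.
  for (let i = key.length - 1; i >= 0; i--) {
    h = Math.imul(h ^ key.charCodeAt(i), 0x01000193) >>> 0;
  }
  return h;
}

// Index of the counter that a key maps to in a given row.
function cell(key: string, seed: number, row: number): number {
  return row * COLS + (hash(key, seed) % COLS);
}

// Count one access: bump one counter per row, saturating at the Uint16 maximum.
function record(key: string): void {
  SEEDS.forEach((seed, row) => {
    const idx = cell(key, seed, row);
    if (table[idx] !== 0xffff) table[idx] += 1;
  });
}

// Estimated access count: the smallest of the key's counters across all rows.
function estimate(key: string): number {
  return Math.min(...SEEDS.map((seed, row) => table[cell(key, seed, row)]));
}
```

&lt;p&gt;Collisions can only inflate a counter, so the minimum across rows is never below the true access count. That one-sided error is what makes the structure safe to use as a frequency signal for eviction.&lt;/p&gt;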

&lt;h3&gt;
  
  
  2. Inlined WASM Bloom Filter
&lt;/h3&gt;

&lt;p&gt;Before hitting a synchronous &lt;code&gt;Map&lt;/code&gt; lookup or checking the filesystem/Redis on a cache miss, queries pass through an ultra-lightweight &lt;strong&gt;WASM Bloom filter&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The filter uses k = 7 hash probes. At a rated capacity of roughly 18,000 entries, it maintains a false-positive rate of about 1%. Because a Bloom filter never produces false negatives, a negative answer is a guaranteed miss, allowing &lt;code&gt;tricache&lt;/code&gt; to skip heavy lookups entirely. The 562-byte WASM binary is fully inlined as a Base64 string, which means &lt;strong&gt;zero filesystem access overhead&lt;/strong&gt; at boot time.&lt;/p&gt;
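
&lt;p&gt;The choice of seven probes lines up with the textbook sizing formulas for a Bloom filter. The helper below is a generic calculation, not code from &lt;code&gt;tricache&lt;/code&gt;; &lt;code&gt;bloomSizing&lt;/code&gt; is a hypothetical name, and it assumes an optimally sized filter.&lt;/p&gt;

```typescript
// Textbook Bloom filter sizing (assumes an optimally sized filter):
//   bits   = -n * ln(p) / (ln 2)^2
//   hashes = (bits / n) * ln 2
function bloomSizing(capacity: number, falsePositiveRate: number) {
  const bits = Math.ceil(
    (-capacity * Math.log(falsePositiveRate)) / (Math.LN2 * Math.LN2),
  );
  const hashes = Math.round((bits / capacity) * Math.LN2);
  return { bits, bytes: Math.ceil(bits / 8), hashes };
}
```

&lt;p&gt;For 18,000 entries at a 1% false-positive rate, this gives roughly 172,500 bits (about 21 KB) and 7 hash probes.&lt;/p&gt;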

&lt;h3&gt;
  
  
  3. Native Thundering-Herd Prevention
&lt;/h3&gt;

&lt;p&gt;When a highly contested key expires under heavy traffic, hundreds of concurrent requests usually hit the underlying database at the exact same moment.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;tricache&lt;/code&gt; avoids this with a native &lt;strong&gt;Inflight Promise Registry&lt;/strong&gt;. No matter how many concurrent callers request an expired or missing key, the user-supplied &lt;code&gt;fetchFn&lt;/code&gt; fires &lt;strong&gt;exactly once&lt;/strong&gt;. All other concurrent requests hook into that single pending promise, protecting your database or upstream APIs from collapsing under sudden load.&lt;/p&gt;
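
&lt;p&gt;The pattern behind such a registry is compact. Here is a minimal sketch, assuming a plain &lt;code&gt;Map&lt;/code&gt; keyed by cache key; &lt;code&gt;inflight&lt;/code&gt; and &lt;code&gt;dedupedFetch&lt;/code&gt; are hypothetical names, not &lt;code&gt;tricache&lt;/code&gt;'s API.&lt;/p&gt;

```typescript
// Hypothetical in-flight registry: cache key -> pending promise.
const inflight = new Map();

function dedupedFetch(key: string, fetchFn: () => unknown) {
  const pending = inflight.get(key);
  if (pending !== undefined) return pending; // join the request already in flight

  const promise = Promise.resolve()
    .then(fetchFn) // the first caller is the only one that triggers fetchFn
    .finally(() => inflight.delete(key)); // clear the slot on success or failure
  inflight.set(key, promise);
  return promise;
}
```

&lt;p&gt;Clearing the entry in &lt;code&gt;finally&lt;/code&gt; matters: a failed fetch must not leave a rejected promise in the registry, or every later caller would inherit the same error.&lt;/p&gt;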

&lt;h3&gt;
  
  
  4. Enterprise-Grade Resiliency Out of the Box
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Stale-While-Revalidate (SWR) &amp;amp; Stale-If-Error:&lt;/strong&gt; Instantly serves stale data to the client while revalidating the data asynchronously in the background. If the upstream database fails during revalidation, &lt;code&gt;staleIfError&lt;/code&gt; kicks in to extend the stale entry's life, keeping your app fully operational during upstream outages.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;OOM Guard:&lt;/strong&gt; A background timer polls &lt;code&gt;heapUsed&lt;/code&gt; and &lt;code&gt;heapTotal&lt;/code&gt;. If memory utilization breaches a set threshold (e.g., 85%), an emergency eviction routine triggers immediately, clearing out the coldest 20% of L1 memory before the Node.js process crashes.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;At-Rest Encryption:&lt;/strong&gt; Supports automated AES-256-GCM authenticated encryption for L2 values, disk spill files, and cold-start snapshots, complete with zero-downtime key rotation support.&lt;/li&gt;
&lt;/ul&gt;
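
&lt;p&gt;The OOM Guard decision in the list above reduces to a ratio check plus a cold-set selection. A minimal sketch, assuming per-key access frequencies are available as &lt;code&gt;[key, frequency]&lt;/code&gt; pairs: the function names and that data shape are hypothetical, and the 85% and 20% defaults are the example values from the list.&lt;/p&gt;

```typescript
// Hypothetical helpers; the real guard in tricache may differ.

// True when heap utilization has reached the threshold (85% by default).
function shouldEmergencyEvict(
  heapUsed: number,
  heapTotal: number,
  threshold = 0.85,
): boolean {
  return heapUsed / heapTotal >= threshold;
}

// Keys of the coldest fraction of entries (20% by default, at least one entry).
function coldestKeys(entries: [string, number][], fraction = 0.2): string[] {
  const count = Math.max(1, Math.floor(entries.length * fraction));
  return [...entries]
    .sort((a, b) => a[1] - b[1]) // lowest access frequency first
    .slice(0, count)
    .map((entry) => entry[0]);
}
```

&lt;p&gt;In a Node.js process, the two heap figures would come from &lt;code&gt;process.memoryUsage()&lt;/code&gt;, sampled on an interval timer.&lt;/p&gt;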




&lt;h2&gt;
  
  
  Performance Benchmarks
&lt;/h2&gt;

&lt;p&gt;All metrics were captured on a single Node.js thread, with synchronous code paths called without &lt;code&gt;await&lt;/code&gt;:&lt;/p&gt;

&lt;h3&gt;
  
  
  L1 Memory Cache Throughput
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Operation&lt;/th&gt;
&lt;th&gt;Throughput&lt;/th&gt;
&lt;th&gt;Latency&lt;/th&gt;
&lt;th&gt;Mechanics&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;&lt;code&gt;get&lt;/code&gt; (Hot Hit)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;2.81 M/s&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;~356 ns&lt;/td&gt;
&lt;td&gt;Bloom filter → Map lookup → Return&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;&lt;code&gt;get&lt;/code&gt; (Cold Miss)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;7.14 M/s&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;~140 ns&lt;/td&gt;
&lt;td&gt;Bloom filter gates early return&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;&lt;code&gt;delete&lt;/code&gt; (Exact Key)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;5.36 M/s&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;~186 ns&lt;/td&gt;
&lt;td&gt;Synchronous Map deletion&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
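
&lt;p&gt;The latency column is consistent with the reciprocal of single-threaded throughput, which gives a quick sanity check for any row. &lt;code&gt;nsPerOp&lt;/code&gt; is a hypothetical helper name.&lt;/p&gt;

```typescript
// Average nanoseconds per operation for a given single-threaded throughput.
const nsPerOp = (opsPerSec: number): number => 1e9 / opsPerSec;
```

&lt;p&gt;For example, 2.81 M/s works out to about 356 ns per operation, and 7.14 M/s to about 140 ns.&lt;/p&gt;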

&lt;h3&gt;
  
  
  End-to-End Service Performance
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Operation&lt;/th&gt;
&lt;th&gt;Throughput&lt;/th&gt;
&lt;th&gt;Latency&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;&lt;code&gt;get&lt;/code&gt; (L1 Warm Hit)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;2.03 M/s&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;~491 ns&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;&lt;code&gt;get&lt;/code&gt; (SWR Stale Serve)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;1.78 M/s&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;~562 ns&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;&lt;code&gt;get&lt;/code&gt; (Miss + &lt;code&gt;fetchFn&lt;/code&gt;)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;13.7 K/s&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;~73 µs&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Quick Start: Up and Running in 30 Seconds
&lt;/h2&gt;

&lt;p&gt;Getting started requires zero initial configuration. Sensible production defaults apply automatically.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install &lt;/span&gt;tricache
&lt;span class="c"&gt;# or pnpm add tricache&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;CacheService&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;tricache&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;// Initialize the process-level singleton&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;cache&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;CacheService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;my-app&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;redisHost&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;REDIS_HOST&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// Omit in dev to automatically fall back to L1/Disk only&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="c1"&gt;// Fetch-or-Get with an automatic database fallback and a 5-minute TTL&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;user&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;cache&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="s2"&gt;`user:&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;users&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;findById&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="mi"&gt;300&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// TTL in seconds&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;swr&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="c1"&gt;// Seamless Stale-While-Revalidate for 30 seconds&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Advanced Group Invalidation with Tags
&lt;/h3&gt;

&lt;p&gt;You can tag items on creation and invalidate entire collections across L1, local disk, and distributed Redis nodes simultaneously:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Add items to the cache with descriptive tags&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;cache&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`product:1`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;productData&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;300&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kc"&gt;undefined&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;tags&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;catalog&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;cache&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`product:2`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;alternativeData&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;300&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kc"&gt;undefined&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;tags&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;catalog&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="c1"&gt;// Atomically evict all catalog entries everywhere&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;cache&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invalidateTag&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;catalog&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
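
&lt;p&gt;Conceptually, tag invalidation needs a reverse index from each tag to the keys that carry it. The sketch below shows that idea for a single in-memory store only; &lt;code&gt;store&lt;/code&gt;, &lt;code&gt;tagIndex&lt;/code&gt;, and &lt;code&gt;setWithTags&lt;/code&gt; are hypothetical names, and it makes no claim about how &lt;code&gt;tricache&lt;/code&gt; coordinates disk or Redis.&lt;/p&gt;

```typescript
// Hypothetical in-memory tag index, covering one store only.
const store = new Map();    // key -> value
const tagIndex = new Map(); // tag -> Set of keys carrying that tag

function setWithTags(key: string, value: unknown, tags: string[]): void {
  store.set(key, value);
  tags.forEach((tag) => {
    if (!tagIndex.has(tag)) tagIndex.set(tag, new Set());
    tagIndex.get(tag).add(key);
  });
}

// Removes every key carrying the tag and returns how many keys were tagged.
function invalidateTag(tag: string): number {
  const keys = tagIndex.get(tag);
  if (keys === undefined) return 0;
  keys.forEach((key: string) => store.delete(key));
  tagIndex.delete(tag);
  return keys.size;
}
```

&lt;p&gt;Keeping the index per tag means invalidation touches only the tagged keys, not the whole store.&lt;/p&gt;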






&lt;h2&gt;
  
  
  Open Source &amp;amp; Contributing
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;tricache&lt;/code&gt; is fully open source, released under the MIT license, and written entirely in TypeScript. If you are building high-performance Node.js services or want to dive into the codebase (especially &lt;code&gt;src/smart-memory-cache.ts&lt;/code&gt;), check out the repository!&lt;/p&gt;

&lt;p&gt;👉 &lt;strong&gt;Check out the repo on GitHub:&lt;/strong&gt; &lt;a href="https://github.com/Kareem411/TriCache#readme" rel="noopener noreferrer"&gt;github.com/Kareem411/TriCache&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Feedback, bug reports, and performance optimizations via Pull Requests are always welcome! Let me know what caching bottlenecks you're currently facing in your architecture below.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>node</category>
      <category>typescript</category>
      <category>performance</category>
      <category>backend</category>
    </item>
  </channel>
</rss>
