DEV Community

날다람쥐
I got tired of wiring the same caching stack every project, so I built LayerCache

Every Node.js service I've worked on hits the same caching wall. It always starts the same way.

You add an in-memory cache. It's fast. Life is good.

Then you scale to multiple instances. Now each server has its own view of the data. Stale reads start showing up in production. So you add Redis. Now all your instances share the same cache. Problem solved — until you realize every single request is paying a Redis round-trip, even for data that barely changes.

So you bring back the in-memory layer on top of Redis. Now you have L1 (memory) and L2 (Redis). But what happens when a key expires and 200 requests hit at the same time? They all miss L1, all miss L2, and they all go straight to the database simultaneously. Cache stampede. Your DB is not happy.

You add stampede protection. Then Redis goes down one day, and your entire cache blows up instead of gracefully falling back. You add circuit breaking. Then you realize your memory caches across instances are now serving different data and you need a pub/sub invalidation bus to keep them in sync...

It never ends.

I've wired this stack more than once. It's not that any single piece is hard — it's that getting all of it working together correctly, with proper testing and production-grade reliability, takes real engineering time every time.

So I built LayerCache to do it once and stop repeating myself.


What it does

LayerCache stacks multiple cache layers (Memory → Redis → Disk) behind a single get() call.

```typescript
import { CacheStack, MemoryLayer, RedisLayer } from 'layercache'
import Redis from 'ioredis'

const cache = new CacheStack([
  new MemoryLayer({ ttl: 60, maxSize: 1_000 }),
  new RedisLayer({ client: new Redis(), ttl: 3600 }),
])

const user = await cache.get('user:123', () => db.findUser(123))
```

On a cache hit: the value is served from the fastest layer that has it, and the layers above are backfilled automatically. So if L1 is cold but L2 (Redis) has the value, L1 gets filled for the next request.

On a cache miss: the fetcher function runs exactly once, no matter how many requests are waiting. All concurrent callers get the same promise.
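To make the fall-through and backfill concrete, here's a minimal sketch of the pattern. This is my own illustration, not LayerCache's internals — the `Layer` interface, `MapLayer`, and `layeredGet` are hypothetical names:

```typescript
// Hypothetical sketch of read-through with backfill: check each layer
// top-down, and on a hit, copy the value back into the faster layers above.
interface Layer {
  get(key: string): Promise<unknown | undefined>
  set(key: string, value: unknown): Promise<void>
}

class MapLayer implements Layer {
  private store = new Map<string, unknown>()
  async get(key: string) { return this.store.get(key) }
  async set(key: string, value: unknown) { this.store.set(key, value) }
  has(key: string) { return this.store.has(key) }
}

async function layeredGet(
  layers: Layer[],
  key: string,
  fetcher: () => Promise<unknown>,
): Promise<unknown> {
  for (let i = 0; i < layers.length; i++) {
    const value = await layers[i].get(key)
    if (value !== undefined) {
      // Hit at layer i: backfill every faster layer that missed.
      for (let j = 0; j < i; j++) await layers[j].set(key, value)
      return value
    }
  }
  // Full miss: fetch from origin and populate all layers.
  const value = await fetcher()
  for (const layer of layers) await layer.set(key, value)
  return value
}

// Demo: L1 cold, L2 warm — the get serves from L2 and backfills L1.
const l1 = new MapLayer()
const l2 = new MapLayer()
await l2.set('user:123', { id: 123 })
const hit = await layeredGet([l1, l2], 'user:123', async () => ({ id: 123 }))
```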

```
your request flood
       │
┌──────▼──────┐
│ L1 Memory   │  ~0.005 ms  ← serves from here if warm
│             │
│ L2 Redis    │  ~0.2 ms    ← falls through to here if L1 cold
│             │
│ L3 Disk     │  ~2 ms      ← optional persistent layer
│             │
│ Fetcher()   │             ← runs ONCE even under 100 concurrent requests
└─────────────┘
```

Solving the stampede problem

In a benchmark with 75 concurrent requests hitting an expired key:

| Setup | Origin fetches |
| --- | --- |
| No cache | 375 |
| LayerCache | 5 (one per layer) |

The local single-flight is handled by sharing an in-flight promise across concurrent callers. No mutex queue. No serialization.
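The promise-sharing idea is small enough to sketch in full. This is illustrative only — the names here are hypothetical, not LayerCache's actual code:

```typescript
// Promise-based single flight: concurrent callers for the same key share
// one in-flight promise instead of queueing behind a mutex.
const inFlight = new Map<string, Promise<unknown>>()

function singleFlight<T>(key: string, fn: () => Promise<T>): Promise<T> {
  const existing = inFlight.get(key)
  if (existing) return existing as Promise<T>
  // First caller starts the fetch; the entry is cleared once it settles.
  const p = fn().finally(() => inFlight.delete(key))
  inFlight.set(key, p)
  return p
}

// Demo: 100 concurrent callers, the fetcher runs exactly once.
let fetches = 0
const fetcher = async () => { fetches++; return 'value' }
const results = await Promise.all(
  Array.from({ length: 100 }, () => singleFlight('user:123', fetcher)),
)
```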

For distributed environments — multiple Node.js processes or machines — RedisSingleFlightCoordinator extends this across instances using distributed locks.

```typescript
import { RedisSingleFlightCoordinator } from 'layercache'

const cache = new CacheStack(layers, {
  singleFlightCoordinator: new RedisSingleFlightCoordinator({ client: redis }),
})
```

In a test with 60 concurrent requests across multiple instances: 1 origin fetch total.
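The general shape of a cross-instance single-flight lock looks something like the sketch below. I'm using a tiny in-memory stand-in for Redis so the example runs self-contained; against real Redis the lock acquisition would be a `SET key value PX ttl NX`. This shows the pattern, not LayerCache's actual implementation — `FakeRedis` and `coordinatedGet` are hypothetical:

```typescript
// In-memory stand-in for the two Redis operations the pattern needs.
class FakeRedis {
  private store = new Map<string, string>()
  async setNX(key: string, value: string): Promise<boolean> {
    if (this.store.has(key)) return false  // lock already held elsewhere
    this.store.set(key, value)
    return true
  }
  async get(key: string) { return this.store.get(key) }
  async set(key: string, value: string) { this.store.set(key, value) }
  async del(key: string) { this.store.delete(key) }
}

const redis = new FakeRedis()
const sleep = (ms: number) => new Promise((r) => setTimeout(r, ms))

let originFetches = 0
async function coordinatedGet(key: string): Promise<string> {
  const cached = await redis.get(key)
  if (cached !== undefined) return cached
  // Try to become the single fetcher across all instances.
  if (await redis.setNX(`lock:${key}`, '1')) {
    originFetches++
    const value = 'fresh'            // pretend origin fetch
    await redis.set(key, value)
    await redis.del(`lock:${key}`)
    return value
  }
  // Lost the race: poll until the winner populates the shared cache.
  // (A real implementation also needs a timeout / lock expiry here.)
  for (;;) {
    await sleep(5)
    const value = await redis.get(key)
    if (value !== undefined) return value
  }
}

// Demo: 60 "instances" race on a cold key — one origin fetch total.
const values = await Promise.all(
  Array.from({ length: 60 }, () => coordinatedGet('user:123')),
)
```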


Keeping L1 caches in sync across instances

The classic problem with in-process memory caches in a multi-instance setup: if you invalidate a key on Server A, Servers B and C still serve the old value from their L1.

LayerCache solves this with a Redis pub/sub invalidation bus.

```typescript
import { RedisInvalidationBus } from 'layercache'

const cache = new CacheStack(layers, {
  invalidationBus: new RedisInvalidationBus({
    publisher: redis,
    subscriber: new Redis(), // separate connection for sub
  }),
})

// invalidating on one instance flushes L1 on all instances
await cache.delete('user:123')
```
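The bus pattern itself is simple. Here's an illustrative sketch with Node's `EventEmitter` standing in for Redis pub/sub so it runs locally — the `Instance` class is hypothetical, not LayerCache's code:

```typescript
import { EventEmitter } from 'node:events'

// Each "instance" keeps a local L1 map and subscribes to a shared channel;
// deleting on one instance broadcasts the key so every instance drops it.
const bus = new EventEmitter()

class Instance {
  l1 = new Map<string, unknown>()
  constructor() {
    bus.on('invalidate', (key: string) => this.l1.delete(key))
  }
  set(key: string, value: unknown) { this.l1.set(key, value) }
  delete(key: string) {
    this.l1.delete(key)
    bus.emit('invalidate', key)  // in production: publish the key over Redis pub/sub
  }
}

const a = new Instance()
const b = new Instance()
a.set('user:123', { name: 'old' })
b.set('user:123', { name: 'old' })

a.delete('user:123')  // flushes L1 on both instances
```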

When Redis dies

This is where a lot of hand-rolled caching setups break badly. LayerCache has two modes:

Strict mode (default): if any layer fails, the operation fails. Good when you need strong consistency guarantees.

Graceful degradation: failed layers are temporarily skipped. The cache keeps working by going directly to the fetcher.

```typescript
const cache = new CacheStack(layers, {
  gracefulDegradation: { retryAfterMs: 10_000 },
})
```
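My mental model of graceful degradation is a small per-layer circuit breaker, roughly like this sketch (assumed behavior under the `retryAfterMs` option; the real implementation may differ, and `DegradableLayer` is a hypothetical name):

```typescript
// A wrapper that skips a failing layer for retryAfterMs, then probes again.
class DegradableLayer {
  private failedAt = -Infinity
  constructor(
    private inner: { get(key: string): Promise<unknown> },
    private retryAfterMs: number,
    private now: () => number = Date.now,
  ) {}

  async get(key: string): Promise<unknown | undefined> {
    // While the breaker is tripped, skip this layer entirely.
    if (this.now() - this.failedAt < this.retryAfterMs) return undefined
    try {
      return await this.inner.get(key)
    } catch {
      this.failedAt = this.now()  // trip: skip until retryAfterMs elapses
      return undefined            // fall through to the next layer / fetcher
    }
  }
}

// Demo with a fake clock: the failing layer is skipped, then retried after 10s.
let t = 0
let attempts = 0
const broken = {
  async get(_: string): Promise<unknown> { attempts++; throw new Error('redis down') },
}
const layer = new DegradableLayer(broken, 10_000, () => t)

await layer.get('k')   // fails, trips the breaker
await layer.get('k')   // skipped: the broken layer is not called again
t = 10_000
await layer.get('k')   // retry window elapsed: probes the layer again
```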

I tested this with 500ms of injected Redis latency (way above the 200ms command timeout):

| Scenario | Strict | Graceful |
| --- | --- | --- |
| L1 warm hit | ✅ 0.065 ms | ✅ 0.065 ms |
| L2 hit (Redis slow) | ❌ timeout | ✅ 201 ms (fell back to fetcher) |
| Cold miss (Redis slow) | ❌ timeout | ✅ 200 ms (fell back to fetcher) |

L1 hot hits aren't affected at all since they never touch Redis.


Benchmark numbers

Ran on a single-core VM with real Docker-backed Redis.

Warm hit latency

```
layered (L1 hit):   0.005 ms avg  (1006x faster than no-cache)
memory only:        0.010 ms avg  ( 503x faster than no-cache)
no-cache:           5.030 ms avg
```

HTTP throughput (autocannon, 40 connections, 8 seconds)

```
/layered:   16,211 req/s  —  1.9 ms avg latency
/memory:    16,031 req/s  —  1.9 ms avg latency
/nocache:      158 req/s  — 253.2 ms avg latency
```

Memory pressure

With L1 capped at 25 keys and 180 unique keys inserted (256 KiB each), revisits served 0 origin refetches — the layer evicted correctly and Redis backed the misses.

Full benchmark methodology and raw output: docs/benchmarking.md


Other things it does

I don't want to just dump a feature list, but a few things worth calling out:

Tag invalidation — attach tags to keys and invalidate all of them at once:

```typescript
await cache.set('post:42', post, { tags: ['posts', 'user:7'] })
await cache.invalidateByTag('user:7') // clears all keys tagged with user:7
```
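One common way to implement this is a reverse index from tag to keys. Here's a sketch of that idea — illustrative, not necessarily how LayerCache stores tags:

```typescript
// Cache entries plus a tag -> set-of-keys reverse index.
const entries = new Map<string, unknown>()
const tagIndex = new Map<string, Set<string>>()

function setWithTags(key: string, value: unknown, tags: string[] = []) {
  entries.set(key, value)
  for (const tag of tags) {
    if (!tagIndex.has(tag)) tagIndex.set(tag, new Set())
    tagIndex.get(tag)!.add(key)
  }
}

function invalidateByTag(tag: string) {
  // Delete every key carrying the tag, then drop the index entry itself.
  for (const key of tagIndex.get(tag) ?? []) entries.delete(key)
  tagIndex.delete(tag)
}

setWithTags('post:42', { title: 'hi' }, ['posts', 'user:7'])
setWithTags('post:43', { title: 'yo' }, ['posts'])
invalidateByTag('user:7')  // removes post:42, keeps post:43
```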

Stale-while-revalidate — return the cached value immediately, refresh in the background:

```typescript
new MemoryLayer({ ttl: 60, staleWhileRevalidate: 300 })
```
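Semantically, stale-while-revalidate means: a value past its ttl but still inside the stale window is returned immediately while a background refresh replaces it for later callers. A sketch of those semantics (assumed behavior; the `SwrCache` class is hypothetical):

```typescript
interface Entry { value: unknown; storedAt: number }

class SwrCache {
  private store = new Map<string, Entry>()
  constructor(
    private ttlMs: number,
    private staleMs: number,
    private now: () => number = Date.now,
  ) {}

  set(key: string, value: unknown) {
    this.store.set(key, { value, storedAt: this.now() })
  }

  async get(key: string, fetcher: () => Promise<unknown>): Promise<unknown> {
    const entry = this.store.get(key)
    const age = entry ? this.now() - entry.storedAt : Infinity
    if (entry && age <= this.ttlMs) return entry.value  // fresh hit
    if (entry && age <= this.ttlMs + this.staleMs) {
      // Stale but serveable: return the old value now, refresh in background.
      fetcher().then((v) => this.set(key, v)).catch(() => {})
      return entry.value
    }
    const value = await fetcher()  // too old (or missing): wait for origin
    this.set(key, value)
    return value
  }
}

// Demo with a fake clock (ttl 60s, stale window 300s).
let t = 0
const cache = new SwrCache(60_000, 300_000, () => t)
cache.set('k', 'v1')
t = 120_000  // past ttl, inside the stale window
const served = await cache.get('k', async () => 'v2')  // serves 'v1' immediately
await Promise.resolve()  // let the background refresh land
const refreshed = await cache.get('k', async () => 'v3')  // now fresh: 'v2'
```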

Framework middleware — drop-in for Express, Fastify, Hono, tRPC, GraphQL:

```typescript
import { createExpressCacheMiddleware } from 'layercache'

app.get('/api/users',
  createExpressCacheMiddleware(cache, {
    ttl: 30,
    tags: ['users'],
    keyResolver: (req) => `users:${req.url}`,
  }),
  async (req, res) => res.json(await db.getUsers())
)
```

Admin CLI — inspect a live Redis-backed cache without writing code:

```shell
npx layercache stats
npx layercache keys --pattern "user:*"
npx layercache invalidate --tag posts
```

Getting started

```shell
npm install layercache
```

Memory-only (no Redis needed):

```typescript
const cache = new CacheStack([
  new MemoryLayer({ ttl: 60 })
])

const data = await cache.get('key', () => fetchData())
```

Full distributed setup:

```typescript
import Redis from 'ioredis'
import {
  CacheStack,
  MemoryLayer,
  RedisLayer,
  RedisInvalidationBus,
  RedisSingleFlightCoordinator,
} from 'layercache'

const redis = new Redis()

const cache = new CacheStack(
  [
    new MemoryLayer({ ttl: 60, maxSize: 10_000 }),
    new RedisLayer({ client: redis, ttl: 3600, compression: 'gzip' }),
  ],
  {
    invalidationBus: new RedisInvalidationBus({
      publisher: redis,
      subscriber: new Redis(),
    }),
    singleFlightCoordinator: new RedisSingleFlightCoordinator({ client: redis }),
    gracefulDegradation: { retryAfterMs: 10_000 },
  }
)
```

The part I found most interesting to design was the stampede guard — specifically making sure concurrent callers share a promise rather than queueing through a mutex, and then extending that behavior across processes with Redis. Happy to dig into any of that if you're curious.
