날다람쥐

Beyond Basic Caching: How layercache Eliminates Cache Stampedes in Node.js

Every Node.js developer knows the caching drill. You start with an in-memory Map, graduate to Redis when you scale horizontally, and eventually find yourself wiring up a fragile hybrid system that breaks in production at 2 AM.

I recently discovered layercache—a multi-layer caching toolkit that promises to handle the messy parts (stampede prevention, graceful degradation, distributed consistency) while keeping the API simple. But does it deliver?

I ran four comprehensive benchmark suites against real Redis instances to find out. Here are the results.

The Architecture: L1 + L2 + Coordination

layercache treats caching as a stack:

┌─────────────────────────────────────┐
│  L1 Memory  (~0.01ms, per-process)  │
│  L2 Redis   (~0.5ms, shared)        │
│  L3 Disk    (~2ms, persistent)      │
└─────────────────────────────────────┘

When you request a key, it checks L1 first, then L2, then your database. The clever part? All layers backfill automatically—if you hit L2, layercache populates L1 for the next request. If you hit the database, it writes to both layers.
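
The lookup-plus-backfill flow is easy to sketch. This is an illustrative reimplementation of the idea, not layercache's internals; `Layer`, `memoryLayer`, and `layeredGet` are names invented for the sketch:

```typescript
// A minimal layer interface: L1 memory, L2 Redis, etc. would all implement it.
interface Layer {
  get(key: string): Promise<string | undefined>;
  set(key: string, value: string): Promise<void>;
}

// Simple in-memory layer used as a stand-in for both L1 and L2 here.
function memoryLayer(): Layer {
  const store = new Map<string, string>();
  return {
    async get(key) { return store.get(key); },
    async set(key, value) { store.set(key, value); },
  };
}

async function layeredGet(
  layers: Layer[],
  key: string,
  fetchOrigin: () => Promise<string>,
): Promise<string> {
  // Walk down the stack: L1 first, then L2, ...
  for (let i = 0; i < layers.length; i++) {
    const hit = await layers[i].get(key);
    if (hit !== undefined) {
      // Backfill every faster layer above the one that hit.
      for (let j = 0; j < i; j++) await layers[j].set(key, hit);
      return hit;
    }
  }
  // Full miss: fetch from the origin and populate all layers.
  const value = await fetchOrigin();
  for (const layer of layers) await layer.set(key, value);
  return value;
}
```

The key property: a value found anywhere in the stack migrates upward, so the next request is served from the fastest layer.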

But the real magic happens when 100 requests arrive simultaneously for the same expired key.

Benchmark 1: The Stampede Test

The "thundering herd" problem is where most caching libraries fail. When a popular key expires, 100 concurrent requests can trigger 100 database queries before the first one repopulates the cache.

I tested 75 concurrent requests across 5 runs (375 total requests) for a cold key:

Setup            Origin Fetches
No cache         375
Memory-only      5
Memory + Redis   5

Result: layercache's single-flight coordination ensured the fetcher ran exactly once per expiry round, not 75 times. The library creates a coordination lock in Redis (or memory) so that concurrent requests wait for the first fetcher to complete rather than hammering your database.
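
The in-process half of single-flight coordination can be sketched in a dozen lines (the distributed half adds a Redis lock, which is omitted here). `singleFlight` is an illustrative name, not layercache's API:

```typescript
// Map of keys to their currently in-flight fetch. Concurrent callers for
// the same key share one promise instead of each hitting the origin.
const inflight = new Map<string, Promise<unknown>>();

async function singleFlight<T>(key: string, fetcher: () => Promise<T>): Promise<T> {
  const existing = inflight.get(key);
  if (existing) return existing as Promise<T>; // join the in-flight fetch

  const p = fetcher().finally(() => inflight.delete(key)); // clear on settle
  inflight.set(key, p);
  return p;
}
```

Because the map is checked and populated synchronously before the first `await`, every caller that arrives while the fetch is pending joins the same promise.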

Latency under this stampede:

Mode         Avg Latency   P95
No cache     409ms         429ms
Memory-only  6.9ms         13.5ms
Layered      36.7ms        43.6ms

The layered case is slower than memory-only (it pays Redis coordination costs), but it preserves the critical property: your database only feels one request.

Benchmark 2: Real HTTP Throughput

Theory is nice, but what about real HTTP servers? I set up three Express routes—no cache, memory-only, and layered—and hit them with autocannon (40 connections, 8 seconds):

Route      Avg Latency   P97.5   Req/sec   Throughput
/nocache   249ms         271ms   161       57 KB/s
/memory    1.82ms        4ms     16,705    5.9 MB/s
/layered   1.74ms        4ms     17,184    6.1 MB/s

That's a 100x throughput increase with minimal latency difference between memory-only and Redis-backed layers. Once warmed, L1 memory serves the hot path while Redis provides the shared backing store for multi-instance deployments.
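
The shape of the benchmark server looks roughly like this (a sketch using node:http instead of Express so it runs without dependencies; the route names match the table, and the 250ms origin delay is an illustrative assumption):

```typescript
import http from "node:http";

const mem = new Map<string, string>();

// Stand-in for a slow database query.
const slowOrigin = (): Promise<string> =>
  new Promise((resolve) =>
    setTimeout(() => resolve(JSON.stringify({ ok: true })), 250),
  );

const server = http.createServer(async (req, res) => {
  if (req.url === "/nocache") {
    res.end(await slowOrigin()); // pays the origin on every request
  } else if (req.url === "/memory") {
    let body = mem.get("payload");
    if (body === undefined) {
      body = await slowOrigin(); // only the first request pays this
      mem.set("payload", body);
    }
    res.end(body);
  } else {
    res.statusCode = 404;
    res.end();
  }
});
```

Pointing autocannon at `/nocache` versus `/memory` on a server like this reproduces the general shape of the numbers above: the cached route serves from memory after one warming request.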

Benchmark 3: When Redis Goes Wrong

Production caches fail. I tested two failure modes:

Slow Redis (500ms latency injection)

Using a TCP proxy to add synthetic latency:

Scenario            Single Request   500 Concurrent   Amplification
L2 hit (strict)     501ms            515ms            1.03x
L2 hit (graceful)   501ms            512ms            1.02x

Key finding: Under slow Redis, wall-clock time stayed close to the single-request baseline even at 500 concurrent requests. Because concurrent L2 reads overlap rather than serialize, the batch completed in roughly 0.2% (~0.002) of the time a naive "latency × N" model would predict.

However, cold misses were brutal: With 500ms Redis latency, a cache miss took ~2.5s because it paid the slow Redis cost plus the fetch/write cost.
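
A latency-injection proxy of the kind used here is small enough to sketch in full. This version (host, port, and delay values are assumptions) delays client-to-upstream writes and passes responses back untouched; point clients at the proxy's port instead of Redis directly:

```typescript
import net from "node:net";

// Forwards bytes in both directions, delaying each client->upstream write
// by delayMs to simulate a slow backend.
function latencyProxy(
  listenPort: number,
  targetHost: string,
  targetPort: number,
  delayMs: number,
): net.Server {
  const server = net.createServer((client) => {
    const upstream = net.connect(targetPort, targetHost);
    client.on("data", (chunk) => setTimeout(() => upstream.write(chunk), delayMs));
    upstream.pipe(client); // responses come back undelayed

    const teardown = () => { client.destroy(); upstream.destroy(); };
    client.on("close", teardown);
    upstream.on("close", teardown);
    client.on("error", teardown);
    upstream.on("error", teardown);
  });
  return server.listen(listenPort);
}
```

Running Redis behind `latencyProxy(6380, "127.0.0.1", 6379, 500)` and pointing ioredis at port 6380 gives you the 500ms injection without touching the Redis server itself.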

Dead Redis (complete outage)

I paused the Redis container with Docker:

Scenario             Success   Latency
Strict hot hit       yes       0.17ms
Graceful hot hit     yes       0.07ms
Strict cold miss     no        Timeout (2000ms)
Graceful cold miss   no        Timeout (2000ms)

Critical insight: gracefulDegradation did not turn a cold miss into a fast memory-only fallback when Redis was completely frozen. Hot L1 keys survived the outage beautifully (served from memory), but new or expired keys stalled until timeout.

Operational takeaway: Warm your critical keys before Redis has issues. Hot L1 traffic is your lifeline during Redis outages.
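
One way to act on that takeaway is a boot-time warm-up pass. This sketch assumes a cache object with the `get(key, fetcher)` shape shown later in this post; `criticalKeys` and `loadFromDb` are placeholders for your own key list and origin loader:

```typescript
// Pre-load critical keys into the cache stack at startup so L1 can keep
// serving them if Redis later degrades.
async function warmCache(
  cache: { get(key: string, fetcher: () => Promise<string>): Promise<string> },
  criticalKeys: string[],
  loadFromDb: (key: string) => Promise<string>,
): Promise<void> {
  await Promise.all(criticalKeys.map((key) => cache.get(key, () => loadFromDb(key))));
}
```

Run it on deploy (or on a timer slightly shorter than your L1 TTL) so the hot set never goes cold.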

Benchmark 4: Memory Pressure and Eviction

What happens when L1 memory fills up? I set maxSize: 25 and inserted 180 unique 256KB payloads:

Metric                      Value
Evictions                   180
L1 Retained                 25 (exactly maxSize)
Origin Fetches on Revisit   0
GC Pauses (max)             6.1ms

When revisiting the oldest keys (which were evicted from L1), they were seamlessly reloaded from Redis L2—not the origin. No cache stampede, no origin amplification.

The GC impact was measurable (36 events, 78ms total) but not catastrophic—max pause stayed at 6ms, far from stop-the-world territory.
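
The maxSize semantics tested above follow the usual LRU pattern, which is compact to sketch in TypeScript because Map preserves insertion order (this is a generic LRU illustration; layercache's actual eviction policy may differ in details):

```typescript
class LruCache<V> {
  private map = new Map<string, V>();
  constructor(private maxSize: number) {}

  get(key: string): V | undefined {
    const value = this.map.get(key);
    if (value !== undefined) {
      // Refresh recency: re-insert at the tail of the Map's order.
      this.map.delete(key);
      this.map.set(key, value);
    }
    return value;
  }

  set(key: string, value: V): void {
    this.map.delete(key);
    this.map.set(key, value);
    if (this.map.size > this.maxSize) {
      // Evict the least recently used entry (head of the Map's order).
      const oldest = this.map.keys().next().value as string;
      this.map.delete(oldest);
    }
  }

  get size(): number { return this.map.size; }
}
```

In a layered stack, an L1 eviction is harmless precisely because of the backfill behavior: the evicted key still lives in L2 and migrates back up on the next read.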

Edge Cases: TTL Expiry and Distributed Systems

TTL Stampede Protection

I tested 40 concurrent requests hitting a key that just expired (TTL: 1s, waited 1.1s):

Mode         Fetch Count
Memory-only  5 (one per expiry round)
Layered      5 (one per expiry round)

Even with TTL expiry triggering simultaneously across multiple rounds, deduplication held firm.

Multi-Instance Consistency

Running two Node.js instances with shared Redis:

  1. Invalidation Bus: When Instance A updated a key, Instance B's L1 cache was invalidated via Redis Pub/Sub within milliseconds.
  2. Distributed Single-Flight: 60 concurrent requests across both instances for the same missing key resulted in exactly 1 origin fetch.

This is the holy grail for microservices: you get per-process L1 speed with cluster-wide consistency.
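
The invalidation-bus mechanic is worth seeing in miniature. In this sketch an EventEmitter stands in for Redis Pub/Sub so it runs without a broker; `makeInstance` and the channel name are illustrative, not layercache's API:

```typescript
import { EventEmitter } from "node:events";

// Stand-in for the Redis Pub/Sub channel shared by all instances.
const bus = new EventEmitter();

function makeInstance(id: string) {
  const l1 = new Map<string, string>();
  // Peers drop their stale L1 copy when another instance announces a write.
  bus.on("invalidate", (key: string, from: string) => {
    if (from !== id) l1.delete(key);
  });
  return {
    l1,
    update(key: string, value: string) {
      l1.set(key, value); // keep the fresh value locally
      bus.emit("invalidate", key, id); // broadcast to every other instance
    },
  };
}
```

The writer keeps its fresh L1 entry while every other instance drops its stale copy; the next read on those instances falls through to the shared L2 and picks up the new value.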

Payload Size Sensitivity

Does caching large objects hurt?

Setup        1KB Avg    1MB Avg    P95 (1MB)
Memory-only  0.012ms    0.018ms    0.023ms
Redis-only   0.200ms    4.170ms    10.11ms

Large payloads hurt only when Redis is on the hot path. Memory hits barely changed between 1KB and 1MB, but Redis hits jumped 20x due to serialization and network transfer. Keep your L1 maxSize generous for large objects.

Practical Takeaways

After running these benchmarks, here are my operational recommendations:

  1. Use layered caching for multi-instance deployments. The hot-hit latency is identical to memory-only (~0.005ms), but you get distributed consistency and stampede prevention.

  2. Warm your cache before traffic spikes. Cold misses under slow Redis are painful (~2.5s), and dead Redis won't gracefully degrade for new keys.

  3. Set generous L1 limits for large payloads. 1MB objects in Redis are 200x slower than in memory. Let L1 absorb that cost.

  4. Don't rely on graceful degradation for cold keys. It protects hot L1 traffic during outages, but new keys will still timeout.

  5. Trust the stampede prevention. The library correctly handled 75→1 fetch reduction even with TTL expiry and cross-instance coordination.

Getting Started

Basic setup:

import { CacheStack, MemoryLayer, RedisLayer } from 'layercache'
import Redis from 'ioredis'

const cache = new CacheStack([
  new MemoryLayer({ ttl: 60, maxSize: 10_000 }),
  new RedisLayer({ client: new Redis(), ttl: 3600 })
])

// Automatic stampede prevention
const user = await cache.get('user:123', () => db.findUser(123))

For distributed deployments, wire up the invalidation bus:

import { RedisInvalidationBus, RedisSingleFlightCoordinator } from 'layercache'

const cache = new CacheStack([
  new MemoryLayer({ ttl: 60 }),
  new RedisLayer({ client: redis })
], {
  invalidationBus: new RedisInvalidationBus({ 
    publisher: redis, 
    subscriber: new Redis() 
  }),
  singleFlightCoordinator: new RedisSingleFlightCoordinator({ client: redis }),
  gracefulDegradation: { retryAfterMs: 10_000 }
})

The Verdict

layercache delivers on its promises. The benchmark data shows it handles the three hard problems of production caching—stampede prevention, graceful degradation, and distributed consistency—without sacrificing the performance of simple in-memory caching.

The 100x HTTP throughput improvement and zero-fetch stampede protection make it a strong candidate for any Node.js service moving beyond a single instance.

Have you solved cache stampedes differently? I'd love to hear your war stories in the comments.

