날다람쥐


Part 2: I got tired of wiring the same caching stack every project, so I built LayerCache

LayerCache in Production: 5 Patterns That Actually Save You

Part 2 of the LayerCache series. If you missed Part 1, start here — it covers the core problem and the basic setup.

The first post got more traction than I expected (200+ views in two days — thank you).

"OK, the basic setup makes sense. But what does it look like in a real service?"

This post answers that. I'll walk through five patterns I keep reaching for, and how LayerCache handles each one without you having to wire it yourself.


1. Stop Writing Cache Keys by Hand — Use wrap()

The most tedious part of caching is key management. You write a function, then you write a cached version of that function, then you keep the two in sync forever. Bugs love that gap.

wrap() closes it. It decorates a function directly, deriving the cache key from the arguments automatically:

import { CacheStack, MemoryLayer, RedisLayer } from 'layercache'

const cache = new CacheStack([
  new MemoryLayer({ ttl: 60 }),
  new RedisLayer({ client: redis, ttl: 3600 }),
])

// Before: you manage the key
async function getUser(id: number) {
  return cache.get(`user:${id}`, () => db.findUser(id))
}

// After: wrap() handles the key for you
const getUser = cache.wrap(db.findUser.bind(db), {
  keyPrefix: 'user',
  ttl: 60,
  tags: ['users'],
})

// Call it exactly like the original function
const user = await getUser(123)

The key is derived from keyPrefix + JSON.stringify(args). For most use cases that's exactly what you want. You can override it with a custom keyResolver if you need something more specific — for example, to exclude certain arguments or normalize them first.
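
As a mental model, the default derivation is roughly the following standalone sketch (an assumption about the internals; the real implementation may add a separator or hash long argument lists):

```typescript
// Sketch of the default key derivation described above:
// the prefix concatenated with the JSON-encoded argument list.
function deriveKey(keyPrefix: string, args: unknown[]): string {
  return keyPrefix + JSON.stringify(args)
}

deriveKey('user', [123])       // "user[123]"
deriveKey('user', [123, true]) // "user[123,true]"
```

One consequence worth knowing: with JSON.stringify, property order inside object arguments changes the key, which is exactly the kind of case a custom keyResolver is for.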


2. Observability Without a Sidecar

I've shipped services where the caching layer was a black box in production. Hit rates? No idea. Was Redis actually being used? Grep the logs and hope. Was L1 evicting the wrong things? Complete mystery.

LayerCache has metrics built in, and they're dead simple to pull:

const stats = cache.getStats()

console.log(stats)
// {
//   hits: 18432,
//   misses: 241,
//   hitRate: 0.987,
//   fetches: 241,
//   staleHits: 18,
//   stampedeDedupes: 7,
//   layers: [
//     { name: 'MemoryLayer', hits: 16100, misses: 2332, avgLatencyMs: 0.006 },
//     { name: 'RedisLayer',  hits: 2191,  misses: 141,  avgLatencyMs: 0.021 },
//   ]
// }

Per-layer latency is tracked using Welford's online algorithm — no memory overhead from storing every sample.
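
If you haven't seen it before, Welford's update keeps a running mean (and variance) in constant memory, one sample at a time. A generic sketch of the algorithm, not LayerCache's actual code:

```typescript
// Welford's online algorithm: O(1) memory, numerically stable.
class OnlineStats {
  private n = 0
  private mean = 0
  private m2 = 0 // running sum of squared deviations from the mean

  add(x: number): void {
    this.n++
    const delta = x - this.mean
    this.mean += delta / this.n
    this.m2 += delta * (x - this.mean)
  }

  get avg(): number {
    return this.mean
  }

  get variance(): number {
    // Sample variance; undefined for fewer than two samples.
    return this.n > 1 ? this.m2 / (this.n - 1) : 0
  }
}
```

Feed it every observed latency and `avg` is always current, with no array of samples to store or re-scan.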

If you're running Prometheus, there's a one-liner exporter:

import { createPrometheusExporter } from 'layercache'

const exporter = createPrometheusExporter(cache)

app.get('/metrics', (req, res) => {
  res.set('Content-Type', 'text/plain')
  res.send(exporter.export())
})

And if you want OpenTelemetry traces showing exactly which layer served each request:

import { createOpenTelemetryPlugin } from 'layercache'
import { trace } from '@opentelemetry/api'

createOpenTelemetryPlugin(cache, trace.getTracer('my-service'))

This uses event hooks under the hood — no method monkey-patching. After wiring it up, you'll see spans like layercache.get → L1 miss → L2 hit → backfill L1 in your trace explorer. Debugging a cache performance regression goes from "add logging, redeploy, wait" to just opening your trace UI.


3. Hot Keys Don't Need Fixed TTLs — Use Adaptive TTL

Here's a subtle production problem: your most popular page gets 50,000 hits a day. Your least popular page gets 3. With a fixed TTL, they both expire on the same schedule and both go back to the database.

The unpopular page is fine — it just misses. The popular page creates a brief window where every concurrent user hits the expiry simultaneously. Stampede prevention helps, but the smarter fix is to just not let hot keys expire as quickly in the first place.

Adaptive TTL automatically extends the TTL for hot keys, up to a ceiling you define:

new MemoryLayer({
  ttl: 30,
  adaptiveTtl: {
    enabled: true,
    maxTtl: 300,      // never cache beyond 5 minutes
    hitsPerStep: 10,  // ramp up every 10 hits
    stepMs: 30_000,   // each step adds 30 seconds
  }
})

A key that gets hit 100 times gradually ramps its TTL toward maxTtl. A key that goes cold resets back to the base TTL. You never have to profile and hardcode special TTLs for hot keys.
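
My mental model of the ramp, as a standalone sketch (the exact internal formula and unit handling are assumptions based on the config above):

```typescript
// Assumed semantics: every `hitsPerStep` hits adds one `stepMs`-sized
// step to the base TTL, capped at `maxTtl`. TTLs are in seconds.
function effectiveTtlSec(
  baseTtlSec: number,
  hits: number,
  opts: { maxTtl: number; hitsPerStep: number; stepMs: number },
): number {
  const steps = Math.floor(hits / opts.hitsPerStep)
  return Math.min(baseTtlSec + steps * (opts.stepMs / 1000), opts.maxTtl)
}

const opts = { maxTtl: 300, hitsPerStep: 10, stepMs: 30_000 }
effectiveTtlSec(30, 5, opts)   // 30  (not enough hits to step yet)
effectiveTtlSec(30, 50, opts)  // 180 (five 30-second steps)
effectiveTtlSec(30, 500, opts) // 300 (capped at maxTtl)
```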

Pair it with staleWhileRevalidate and hot keys become almost invisible from the user's perspective — they always get a value immediately, and the refresh happens in the background:

new MemoryLayer({
  ttl: 60,
  staleWhileRevalidate: 600,
  adaptiveTtl: { enabled: true, maxTtl: 300, hitsPerStep: 20, stepMs: 30_000 },
})
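
To make the pairing concrete, here's what stale-while-revalidate semantics look like in isolation. This is a generic single-layer sketch of the pattern, not the library's code:

```typescript
type Entry<T> = { value: T; storedAt: number }

// Fresh: return it. Stale but within the revalidate window: return the
// stale value immediately and refresh in the background. Expired: fetch.
async function swrGet<T>(
  store: Map<string, Entry<T>>,
  key: string,
  fetcher: () => Promise<T>,
  ttlMs: number,
  swrMs: number,
  now = Date.now(),
): Promise<T> {
  const entry = store.get(key)
  if (entry) {
    const age = now - entry.storedAt
    if (age < ttlMs) return entry.value // fresh hit
    if (age < ttlMs + swrMs) {
      // Stale hit: the caller never waits on the fetcher.
      void fetcher().then(v => store.set(key, { value: v, storedAt: Date.now() }))
      return entry.value
    }
  }
  const value = await fetcher() // miss or fully expired
  store.set(key, { value, storedAt: now })
  return value
}
```

The point of combining the two features: adaptive TTL stretches the fresh window for hot keys, and the stale window absorbs whatever expiries remain.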

4. Framework Middleware: Drop-In Caching for Existing Routes

You shouldn't have to rewrite route handlers to add caching. LayerCache ships middleware for the frameworks you're probably already using.

Express:

import { createExpressCacheMiddleware } from 'layercache'

app.get('/api/users',
  createExpressCacheMiddleware(cache, {
    ttl: 30,
    tags: ['users'],
    keyResolver: (req) => `users:${req.url}`,
  }),
  async (req, res) => {
    res.json(await db.getUsers())
  }
)

Cached responses come back with an x-cache: HIT header — useful for debugging in staging without changing any application logic.

Fastify:

import { createFastifyLayercachePlugin } from 'layercache'

await fastify.register(createFastifyLayercachePlugin(cache, {
  statsRoute: '/cache-stats', // optional: expose metrics endpoint
}))

fastify.get('/api/products', async (request, reply) => {
  return fastify.cache.get('products:all', () => db.getProducts())
})

tRPC:

import { createTrpcCacheMiddleware } from 'layercache'

const cachedProcedure = publicProcedure.use(
  createTrpcCacheMiddleware(cache, 'trpc', { ttl: 60 })
)

export const appRouter = router({
  getUser: cachedProcedure
    .input(z.object({ id: z.number() }))
    .query(({ input }) => db.findUser(input.id)),
})

GraphQL resolver:

import { cacheGraphqlResolver } from 'layercache'

const resolvers = {
  Query: {
    user: cacheGraphqlResolver(
      cache,
      'gql:user',
      (_, { id }) => db.findUser(id),
      { ttl: 60, tags: ['users'] }
    ),
  },
}

The pattern is the same across all of them: wrap the data-fetching part, leave the rest of your route/resolver untouched.
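
Stripped of any framework, that shared pattern is just cache-aside around a handler. A framework-agnostic sketch of what such middleware does internally (assumed behavior, names hypothetical):

```typescript
type Handler = (req: { url: string }) => Promise<string>

// Wrap a handler in cache-aside logic keyed by URL, reporting
// hit/miss the way an x-cache response header would.
function withCache(
  store: Map<string, string>,
  handler: Handler,
): (req: { url: string }) => Promise<{ body: string; xCache: 'HIT' | 'MISS' }> {
  return async (req) => {
    const key = `route:${req.url}`
    const cached = store.get(key)
    if (cached !== undefined) return { body: cached, xCache: 'HIT' }
    const body = await handler(req) // miss: run the real handler once
    store.set(key, body)
    return { body, xCache: 'MISS' }
  }
}
```

Every middleware above is some variation of this, with the framework's own request/response plumbing around it.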


5. Cache Warming: Don't Let Your First Requests Be the Slowest

Cold starts are painful. You deploy, traffic hits, and for the first 30 seconds every request goes all the way to the database while the cache warms up organically. In a high-traffic service that's a visible latency spike right after every deploy.

Cache warming pre-populates your layers before your service starts accepting traffic:

// Define what to warm and in what order
await cache.warm([
  {
    key: 'config:global',
    fetcher: () => db.getGlobalConfig(),
    ttl: 300,
    priority: 1, // load first
  },
  {
    key: 'categories:all',
    fetcher: () => db.getAllCategories(),
    ttl: 600,
    priority: 2,
  },
  {
    // Warm a batch of known hot keys
    keys: topUserIds.map(id => `user:${id}`),
    fetcher: (key) => db.findUser(Number(key.split(':')[1])),
    ttl: 60,
    priority: 3,
  },
])

// Now the cache is warm — start accepting traffic
app.listen(3000)

Lower priority number = loads first. LayerCache runs each priority group before moving to the next, so your most critical data is always ready first. If a fetcher fails during warm-up, it's skipped rather than crashing your startup.
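
The behavior described above can be sketched generically like this (assumed semantics, not the actual implementation): groups run sequentially by priority, entries within a group run concurrently, and a failing fetcher is skipped.

```typescript
type WarmEntry = {
  key: string
  fetcher: (key: string) => Promise<unknown>
  priority: number
}

async function warmInPriorityOrder(
  entries: WarmEntry[],
  set: (key: string, value: unknown) => Promise<void>,
): Promise<void> {
  // Distinct priorities, lowest number first.
  const groups = [...new Set(entries.map(e => e.priority))].sort((a, b) => a - b)
  for (const priority of groups) {
    // All entries in a group warm concurrently; the next group
    // doesn't start until this one settles.
    await Promise.all(
      entries
        .filter(e => e.priority === priority)
        .map(async e => {
          try {
            await set(e.key, await e.fetcher(e.key))
          } catch {
            // A failed fetcher is skipped; startup continues.
          }
        }),
    )
  }
}
```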


Bonus: The Admin CLI Is More Useful Than It Looks

One thing I didn't cover in Part 1: the admin CLI. You don't need to write any code to inspect or manage a Redis-backed cache in a running environment.

# See overall hit/miss stats
npx layercache stats

# List keys matching a pattern
npx layercache keys --pattern "user:*"

# Invalidate all keys tagged with 'posts'
npx layercache invalidate --tag posts

# Delete a specific key
npx layercache delete user:123

It's saved me more than once when debugging a production issue. Instead of writing a one-off script to peek at what's in the cache, you just run the CLI.


Wrapping Up

Part 1 covered why LayerCache exists. This post covered how to actually use it past the basics:

  • wrap() for zero-boilerplate function caching
  • Built-in metrics, Prometheus, and OpenTelemetry for observability
  • Adaptive TTL to stop treating hot keys and cold keys the same
  • Drop-in middleware for Express, Fastify, tRPC, and GraphQL
  • Cache warming so your first request after a deploy isn't your slowest

If any of this is useful, the best way to support the project is a ⭐ on GitHub — it genuinely helps other developers find it.

👉 github.com/flyingsquirrel0419/layercache

Questions, edge cases, or features you'd want to see? Drop them in the comments — the next post will probably be driven by whatever comes up there.
