How to cut Node.js memory usage by 40% in self-hosted apps

#node #performance #selfhosted #debugging

I've been running a small fleet of self-hosted Node.js services on a 4GB VPS for the past three years. Last month, I hit a wall. The box was swapping constantly, my dashboard service alone was sitting at 380MB RSS, and PM2 kept restarting it every few hours when it crossed my memory cap.

After a weekend of profiling, I got that same service down to ~210MB steady state. That's a drop of a bit over 40%, and nothing about the user-facing behavior changed. Here's the debugging process I went through, and the patterns I'd reach for first next time.

The symptom: slow RSS growth that never plateaus

The tell-tale sign of a leak (versus normal heap growth) is that memory keeps climbing after the app reaches steady-state traffic. Healthy Node services breathe — they grow during request bursts, GC kicks in, they shrink back. Leaking services just grow.

First thing I did was log RSS every minute and graph it:

// memory-probe.js
setInterval(() => {
  const m = process.memoryUsage();
  // rss = total resident set, heapUsed = live JS objects
  console.log(JSON.stringify({
    t: Date.now(),
    rss: m.rss,
    heap: m.heapUsed,
    external: m.external,
    arrayBuffers: m.arrayBuffers,
  }));
}, 60_000);

If rss grows but heapUsed stays flat, your leak is outside the V8 heap, usually in Buffer allocations (these show up in external and arrayBuffers) or native addons. If heapUsed itself grows, it's a regular old JS reference leak. Mine was the second kind, but I had a small contribution from the first too.
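
That decision rule can be sketched as a small helper that compares two process.memoryUsage() samples. The function name classifyGrowth, the category labels, and the 10% threshold are illustrative assumptions, not something Node provides:

```javascript
// classify-growth.js
// Hypothetical helper: compares two process.memoryUsage() samples and
// guesses where the growth is coming from. The 10% threshold is arbitrary.
function classifyGrowth(before, after, threshold = 0.10) {
  // Relative growth of one field between the two samples.
  const grew = (key) =>
    (after[key] - before[key]) / Math.max(before[key], 1) > threshold;

  if (!grew('rss')) return 'stable';
  if (grew('heapUsed')) return 'js-heap'; // live JS objects are piling up
  if (grew('external') || grew('arrayBuffers')) return 'external'; // Buffers etc.
  return 'other-native'; // rss grew, but nothing V8 tracks did
}

module.exports = { classifyGrowth };
```

Feed it two samples taken well apart under steady traffic (say, an hour), not two adjacent ones, otherwise normal GC sawtooth will dominate the result.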

Root cause #1: closures holding request objects

The biggest single culprit was a logging middleware that did this:

app.use((req, res, next) => {
  const start = Date.now();
  res.on('finish', () => {
    // This closure captures `req` until the response is GC'd
    logger.info({ url: req.url, ms: Date.now() - start, body: req.body });
  });
  next();
});

Looks fine. It's not. The finish listener closes over the entire req object, including the parsed body. For most requests that's a few KB. For our file-upload endpoint, req.body held a base64-encoded payload that could hit 8MB. The listener wasn't a leak in the strict sense: res.on('finish', ...) does fire, and the closure does become collectable afterwards. But under load we had hundreds of these in flight simultaneously, each one keeping its body alive until its response finished, and V8's incremental GC fell behind.

The fix is to capture only what you need:

app.use((req, res, next) => {
  const start = Date.now();
  // Snapshot the primitives now, drop references to req/body
  const url = req.url;
  const method = req.method;
  res.on('finish', () => {
    logger.info({ url, method, ms: Date.now() - start, status: res.statusCode });
  });
  next();
});

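This one change shaved ~60MB off our p95 RSS. Lesson: a callback keeps everything it closes over alive for as long as the callback itself is reachable, so any listener that closes over heavy objects is a memory liability, and more so on a long-lived event emitter.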
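
If you still want some signal about the body in your logs, one option is to record its size rather than the body itself. A minimal sketch, assuming the client sends a Content-Length header; requestSummary is a hypothetical helper name, not part of Express:

```javascript
// request-summary.js
// Hypothetical helper: builds a log record made only of primitives, so a
// closure that holds it does not keep req or req.body alive.
function requestSummary(req) {
  // Assumption: Content-Length is present and trustworthy enough for logging.
  // Falls back to 0 when it is missing or malformed (e.g. chunked uploads).
  const declared = Number(req.headers['content-length']);
  return {
    url: req.url,
    method: req.method,
    bodyBytes: Number.isFinite(declared) ? declared : 0,
  };
}

module.exports = { requestSummary };
```

In the middleware you would call requestSummary(req) before registering the listener and log the returned object inside it, so the closure only holds a few strings and a number.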

Root cause #2: oversized dependency footprint

The second hit came from auditing what was actually loaded into memory. I used a heap snapshot from Chrome DevTools (connect via node --inspect, then take a snapshot in the Memory tab) and was shocked at how much was sitting in the (compiled code) and (string) groups.
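
If attaching DevTools to a remote box is awkward, a snapshot can also be written from inside the process with v8.writeHeapSnapshot() and loaded into the Memory tab afterwards. A minimal sketch; the dumpHeapSnapshot wrapper and the file naming are assumptions for illustration:

```javascript
// dump-heap.js
const v8 = require('node:v8');
const os = require('node:os');
const path = require('node:path');

// Writes a .heapsnapshot file into `dir` and returns its full path.
// Caution: this is synchronous, blocks the event loop while it runs, and
// needs a lot of extra memory, so be careful on a box that is already tight.
function dumpHeapSnapshot(dir = os.tmpdir()) {
  const file = path.join(dir, `heap-${process.pid}-${Date.now()}.heapsnapshot`);
  return v8.writeHeapSnapshot(file);
}

module.exports = { dumpHeapSnapshot };
```

How you trigger it (an admin-only route, a signal handler, a one-off script) depends on your setup.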

A lot of self-hosted Node apps inherit dependency bloat from the framework template they started with. Install size on disk is only a rough proxy for memory use, but it's a quick way to see what you've actually got:

# Show every transitive dep and its install size
npx howfat -r tree

# Or quickly find the biggest offenders in node_modules
du -sh node_modules/* | sort -h | tail -20

In my case, I was pulling in moment (transitively, via a date-picker lib I wasn't even using server-side), a full lodash import, and axios in three different versions. Swapping moment for date-fns, switching to lodash-es with proper tree-shaking on the bundle, and de-duping axios via a resolution override knocked another ~40MB off steady-state.

The under-appreciated piece here: every required module stays loaded in memory for the lifetime of the process. Node's module cache is essentially permanent. If you're requiring a 12MB library at boot to use one function, you pay 12MB forever.
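
To put a rough number on a single dependency, you can sample process.memoryUsage() around the first require of it. This is a sketch, not a precise profiler: measureRequire is a hypothetical helper, the deltas are noisy, and a module that is already in the cache will report close to zero.

```javascript
// measure-require.js
// Hypothetical helper: approximate memory growth from loading one module.
function measureRequire(name) {
  // global.gc only exists when node is started with --expose-gc; using it
  // makes the numbers steadier, but the sketch works without it.
  const collect = () => { if (typeof global.gc === 'function') global.gc(); };

  collect();
  const before = process.memoryUsage();
  require(name); // first load compiles and caches the module and its deps
  collect();
  const after = process.memoryUsage();

  return {
    name,
    heapUsedDelta: after.heapUsed - before.heapUsed,
    rssDelta: after.rss - before.rss,
  };
}

module.exports = { measureRequire };
```

Run it in a fresh process per module (for example with node --expose-gc) so earlier loads don't hide the cost of shared transitive dependencies.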

Root cause #3: V8 heap configured for a beefier box

Node's V8 heap defaults are surprisingly generous: the limit is sized from what the OS appears to offer, and the heap grows lazily toward it. On a small VPS, V8 will happily expand its old-generation heap to hundreds of MB before triggering a serious GC sweep, because it doesn't know your budget.

You can pin the ceiling:

# Cap the old-generation heap at 256MB
node --max-old-space-size=256 server.js

This forces V8 to GC more aggressively when it nears the cap. There is a CPU cost (more frequent collections, slightly worse p99 latency), but on a memory-constrained host that's a trade you usually want to make. One caveat: the cap is a hard limit, so if your live data genuinely needs more than that, the process dies with a heap out-of-memory error instead of growing. I run mine at 256MB and CPU went up about 4% on average, which I'll take in exchange for not getting OOM-killed twice a day.
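
It's worth confirming the flag actually reached the process, since a process manager or npm script can drop it. A minimal sketch using v8.getHeapStatistics(); heapHeadroom is a hypothetical helper name:

```javascript
// heap-headroom.js
const v8 = require('node:v8');

const MiB = 1024 * 1024;

// Reports the heap limit V8 is actually running with and how much is in use.
// heap_size_limit covers the whole heap, so it may not match the
// --max-old-space-size value exactly.
function heapHeadroom() {
  const stats = v8.getHeapStatistics();
  return {
    limitMiB: Math.round(stats.heap_size_limit / MiB),
    usedMiB: Math.round(stats.used_heap_size / MiB),
    usedRatio: stats.used_heap_size / stats.heap_size_limit,
  };
}

module.exports = { heapHeadroom };
```

Logging this once at boot is usually enough; if limitMiB is far above what you set, the flag isn't being applied.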

A related setting worth knowing about: --max-semi-space-size controls the size of the young-generation semi-spaces. Bumping it up to 64MB on services with lots of short-lived allocations can actually reduce total memory pressure, because more objects die in young-gen before getting promoted to old-gen, where they're more expensive to collect. The young generation itself gets bigger in exchange, so measure before and after rather than assuming it's a win.
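
One way to check whether a young-generation change is helping is to watch the per-space numbers from v8.getHeapSpaceStatistics() before and after. A sketch under that assumption; spaceSizesMiB is a hypothetical helper name:

```javascript
// heap-spaces.js
const v8 = require('node:v8');

const MiB = 1024 * 1024;

// Returns committed and used size (in MiB) for the young generation
// (new_space) and the old generation (old_space).
function spaceSizesMiB() {
  const out = {};
  for (const space of v8.getHeapSpaceStatistics()) {
    if (space.space_name === 'new_space' || space.space_name === 'old_space') {
      out[space.space_name] = {
        sizeMiB: Number((space.space_size / MiB).toFixed(1)),
        usedMiB: Number((space.space_used_size / MiB).toFixed(1)),
      };
    }
  }
  return out;
}

module.exports = { spaceSizesMiB };
```

If old_space keeps climbing at the same rate after raising the semi-space size, promotion wasn't your problem and the extra young-gen memory is just overhead.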

Prevention: make memory a CI concern

A few habits that have kept regressions from sneaking back in:

Smoke-test memory in CI. Run your app under load for 60 seconds, capture peak RSS, fail the build if it climbs more than 15% from baseline. Doesn't need to be fancy.
Forbid unbounded caches. Any in-memory Map or object used as a cache must have a size cap or a TTL. A Map with no eviction is a leak with a delay timer.
Audit node_modules size on every PR. A bot that comments "installed size went from 142MB to 168MB" is annoying enough that people actually look into why.
Use --inspect snapshots before declaring a leak fixed. Two snapshots, one before and one after your reproduction load — the diff view tells you exactly which constructors retained objects across the window.
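
As an illustration of the "no unbounded caches" rule above, here is a minimal Map-based cache with both a size cap and a TTL. BoundedCache and its default values are assumptions for the sketch; a maintained LRU library is a reasonable alternative.

```javascript
// bounded-cache.js
// Hypothetical minimal cache: least-recently-used eviction plus a TTL.
class BoundedCache {
  constructor({ maxEntries = 1000, ttlMs = 60_000 } = {}) {
    this.maxEntries = maxEntries;
    this.ttlMs = ttlMs;
    this.map = new Map(); // Map keeps insertion order; we use it as recency order
  }

  // `now` is injectable so the TTL logic can be tested without waiting.
  get(key, now = Date.now()) {
    const entry = this.map.get(key);
    if (!entry) return undefined;
    if (now >= entry.expiresAt) {
      this.map.delete(key); // expired entries are dropped lazily, on access
      return undefined;
    }
    // Re-insert so this key becomes the most recently used.
    this.map.delete(key);
    this.map.set(key, entry);
    return entry.value;
  }

  set(key, value, now = Date.now()) {
    this.map.delete(key);
    this.map.set(key, { value, expiresAt: now + this.ttlMs });
    if (this.map.size > this.maxEntries) {
      // The first key in iteration order is the least recently used.
      this.map.delete(this.map.keys().next().value);
    }
  }

  get size() {
    return this.map.size;
  }
}

module.exports = { BoundedCache };
```

Expired entries are only removed when they are read, but the size cap still bounds memory: the Map can never hold more than maxEntries items.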

None of this is rocket science. It's mostly the boring discipline of treating memory like any other budget. The 40% I clawed back wasn't from one clever trick. It came from three independent fixes, each worth 10-15% on its own, stacking. That's almost always how it goes.