Ashish Barmaiya

Posted on • Originally published at ashishbarmaiya.hashnode.dev

Why I Ripped stream.pipe() Out of My Node.js API Gateway

When I started building Torus, a multi-core Layer 7 Edge API Gateway from scratch in Node.js, I handled incoming network requests the way I had always seen it done in standard web applications:

TypeScript

let body = '';
req.on('data', (chunk: Buffer) => {
  body += chunk.toString(); // decodes every chunk into a V8 heap string
});
req.on('end', () => {
  forwardToBackend(body);   // forwards only after buffering the entire payload
});

It worked perfectly for lightweight tests. But as I started pushing concurrent loads and larger payloads through the proxy, my server began to choke. CPU usage spiked to 100%, the event loop lagged, and memory consumption grew uncontrollably until the process crashed.

I had fallen into a classic architectural trap: I was dragging raw TCP payload bytes directly into the V8 JavaScript engine's memory heap.

Because the V8 heap has a hard size limit, pulling massive payloads into user-space memory forces the Node.js Garbage Collector (GC) to work overtime. Major GC cycles pause the single-threaded event loop while they reclaim that memory, effectively stalling every other active network connection in the proxy.

I learned a fundamental rule of proxy engineering the hard way:

Proxies shouldn't read data; they should just move it.

To build a production-grade gateway, I realized I had to bypass the V8 heap entirely. I needed to keep the data in raw C++ memory blocks and move it to the Operating System level. But as I refactored my routing logic to achieve this, I stumbled into a silent, catastrophic flaw in the standard Node.js stream API that brought my entire test suite to a halt.

The First Evolution: Bypassing V8 with .pipe()

The architectural fix required a fundamental shift in how I viewed the data. I had to stop treating payloads as static variables and start treating them as flowing water.

I didn't need to load an entire 50MB file into memory before forwarding it. I only needed to hold a few kilobytes in a temporary buffer, flush it to the destination, and reuse that memory space.

In Node.js, this is exactly what the node:stream module and Buffer objects are designed for.

A Buffer in Node.js allocates memory outside the V8 JavaScript engine. It utilizes raw C++ memory blocks mapped directly to the OS. By keeping the network chunks as raw Buffers, the payload never enters the JavaScript heap. Because it never enters the heap, the V8 Garbage Collector completely ignores it, leaving the event loop free to handle other connections.
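You can verify this claim directly: Buffer memory is reported by process.memoryUsage() under external (C++-side memory), while heapUsed barely moves. A minimal sketch (the 64 MB size is arbitrary, chosen for illustration):

```typescript
// Allocate a large Buffer and compare V8 heap usage before and after.
// The bytes land in "external" (C++) memory, not the JavaScript heap.
const before = process.memoryUsage();

const chunk = Buffer.allocUnsafe(64 * 1024 * 1024); // 64 MB outside the V8 heap
chunk.fill(0); // touch the pages so the allocation is real

const after = process.memoryUsage();

console.log('heapUsed delta (MB):', ((after.heapUsed - before.heapUsed) / 1e6).toFixed(1));
console.log('external delta (MB):', ((after.external - before.external) / 1e6).toFixed(1));
```

The external delta jumps by the full allocation size, while the heap delta stays near zero; that gap is exactly the memory the GC never has to scan.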

To wire this up, I utilized the native .pipe() method to connect the readable client stream to the writable backend stream:

TypeScript

// Connects the incoming ReadableStream directly to the outgoing WritableStream
clientReq.pipe(proxyReq);

This single line of code acts as an OS-level plumbing system. It takes the incoming TCP stream, reads the raw C++ buffers, automatically manages the backpressure (ensuring a fast client doesn't overwhelm a slow backend connection), and pushes the bytes directly out to the routing pool.
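Under the hood, .pipe() automates a pause/resume loop you would otherwise have to write by hand. A simplified sketch of that loop (illustrative only, not the real stream internals; forward is a hypothetical helper):

```typescript
import { Readable, Writable } from 'node:stream';

// Simplified sketch of the backpressure loop .pipe() runs for you:
// if write() returns false, pause the source and wait for 'drain'.
function forward(src: Readable, dst: Writable): void {
  src.on('data', (chunk: Buffer) => {
    const canWriteMore = dst.write(chunk);
    if (!canWriteMore) {
      src.pause();                           // destination buffer is full
      dst.once('drain', () => src.resume()); // resume once it flushes
    }
  });
  src.on('end', () => dst.end()); // propagate graceful end (pipe does this too)
}
```

The return value of write() is the whole trick: false means the destination's internal buffer has hit its highWaterMark, so the source must stop producing until the 'drain' event fires.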

My CPU usage plummeted. The memory footprint stayed flat, regardless of how large the incoming payloads were. It felt like I had solved the scaling problem entirely.

But .pipe() was hiding a massive, silent vulnerability.

The Plot Twist: The Silent Socket Leak

I thought I had engineered the perfect solution. The proxy was fast, the CPU was idle, and the memory footprint stayed completely flat, regardless of payload size.

Then I ran my integration test suite.

All the assertions passed. I got the green checkmarks. But the terminal just froze. Jest refused to exit, eventually spitting out that infuriating warning: "Jest did not exit one second after the test run has completed."

My initial reaction was to treat it like a standard web app bug. I meticulously checked my teardown logic, making sure proxyServer.close() was being called and my Redis clients were fully disconnected. I ran the tests again. It still hung.

I had to drop down to the OS level to understand what was actually happening. The Node.js event loop is designed to never exit as long as there is an active I/O handle (like a net.Socket) in its queue. Something was keeping a socket alive.
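That rule is easy to verify in isolation. In this tiny demonstration (not Torus code), a single listening server is enough to pin the process alive until its handle is released:

```typescript
import { createServer } from 'node:net';

// A single live I/O handle pins the event loop. Remove the
// server.close() call and this process never exits.
const server = createServer();

server.listen(0, () => {
  console.log('handle active: the event loop cannot drain');
  server.close(() => console.log('handle released: the process can exit'));
});
```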

The culprit was .pipe().

When my Jest test fired a dummy request through the proxy and then disconnected, the client-side socket closed gracefully. But I learned a hard lesson about Node.js streams: .pipe() forwards data (and a graceful end), but it does not propagate errors or premature-close events.

When the client dropped, .pipe() did not send an error or close event to the destination stream. It left the backend connection completely open. The proxy was sitting there holding a dead connection to the backend, waiting for network bytes that would never arrive.

I had built a machine that generated half-open sockets. In a production environment, this would silently exhaust the Operating System's File Descriptors (FDs). Every dropped client connection would permanently leak an FD until the OS hit its limit and the Node process crashed with an EMFILE error.
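With .pipe(), the burden of teardown is entirely yours. A sketch of the extra wiring it takes to avoid half-open sockets (pipeWithTeardown is a hypothetical helper, not the Torus API):

```typescript
import { PassThrough, type Duplex } from 'node:stream';

// The manual lifecycle wiring .pipe() leaves out. Without these
// handlers, a dropped client leaves the backend socket half-open.
function pipeWithTeardown(client: Duplex, backend: Duplex): void {
  client.pipe(backend);
  client.on('error', (err) => backend.destroy(err)); // propagate client failures
  client.on('close', () => backend.destroy());       // client dropped: kill the backend too
  backend.on('error', (err) => client.destroy(err)); // propagate backend failures
}

// Simulate a client that vanishes mid-transfer:
const client = new PassThrough();
const backend = new PassThrough();
pipeWithTeardown(client, backend);
client.destroy(); // client drops...
setImmediate(() => console.log('backend destroyed:', backend.destroyed)); // ...backend follows
```

Multiply those handlers across every route and both directions of a proxy, and the error-handling spaghetti grows fast; that is the mess pipeline() exists to replace.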

The Fix: stream.pipeline()

The Node.js core maintainers knew .pipe() was dangerously naive for production infrastructure. That is exactly why they introduced stream.pipeline().

Instead of blindly shoving data from one socket to another, .pipeline() acts as a unified state machine that monitors the entire stream chain. It pushes the responsibility of socket teardown back to the Node.js core networking stack where it belongs.

If any stream in the pipeline fails, throws an error, or abruptly closes (like a client dropping off with an ECONNRESET), .pipeline() automatically intercepts it. It destroys all other connected streams in that specific chain and bubbles up a single error for you to catch.
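That behavior is easy to see in a minimal, self-contained demo (not the Torus code): destroy the source mid-flight and pipeline() tears down the destination and rejects with the original error.

```typescript
import { PassThrough } from 'node:stream';
import { pipeline } from 'node:stream/promises';

const src = new PassThrough();
const dst = new PassThrough();

pipeline(src, dst).catch((err: Error) => {
  // pipeline() has already destroyed every stream in the chain by now.
  console.log('pipeline rejected:', err.message);
  console.log('destination destroyed:', dst.destroyed);
});

src.write('hello');
src.destroy(new Error('client dropped')); // simulate the client vanishing
```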

I removed every instance of .pipe() and the dozens of lines of manual .on('error') spaghetti I had written. Because raw TCP proxying requires bidirectional data flow, I replaced it with two pipelines running in parallel, one per direction:

TypeScript

import { pipeline } from 'node:stream/promises';

try {
  await Promise.all([
    pipeline(clientSocket, backendSocket), // client -> backend
    pipeline(backendSocket, clientSocket)  // backend -> client
  ]);
} catch (err: any) {
  // If either side drops, its pipeline rejects, and we tear down both sockets.
  clientSocket.destroy();
  backendSocket.destroy();
}

The moment I swapped to this architecture and ran my integration suite, the terminal didn't hang. Jest executed all 18 network tests and exited flawlessly in 2.7 seconds. The event loop was instantly cleared. The silent socket leak was completely eradicated.

[Terminal screenshot: 18 Jest network tests passing in 2.7 seconds.]

Conclusion: Trust Nothing, Understand Everything

Building a multi-core Edge Gateway from scratch taught me that I cannot blindly trust high-level abstractions.

stream.pipe() looks elegant in a standard web tutorial, but in the trenches of raw TCP networking, it is a massive liability. If you are building infrastructure that handles thousands of concurrent connections, you must understand the Operating System-level lifecycle of your File Descriptors and your sockets. If you don't, your system will slowly bleed to death under load, and logs won't even tell you why.

If you want to see the exact implementation of this bidirectional TCP routing, you can check out the raw source code for Torus Proxy on my GitHub.

Top comments (1)

Andre Cytryn

the EMFILE trap is one of those things that only shows up under real load, never in unit tests. the half-open socket problem with .pipe() is especially nasty because everything looks fine until your FD table is exhausted. the bidirectional pipeline approach you landed on is solid. one thing worth noting: in some proxy scenarios you also want to handle backpressure asymmetry, where the client reads slower than the backend writes, otherwise you can silently buffer unbounded data even with pipeline. did you run into that case with Torus?