📈Scaling Node.js: Core Principles for High-Throughput Data Pipelines

At Biofourmis, we handle a wide range of patient data, from low-frequency episodic submissions to high-throughput, continuous streams of data arriving at sub-second frequencies.

For a particularly challenging use case, I was tasked with creating a pipeline to handle thousands of concurrent patients, each sending a 150 KB data blob every 10 seconds. This pipeline also needed to withstand bursts of up to 8 calls per second per patient. All of this had to be achieved with an eye on maintaining low costs, both in terms of infrastructure and developer hours.

Given our team's expertise is primarily in Node.js, I set out to create and test the limits of what a platform made with Node.js could reliably handle. I also decided against direct Kafka broker exposure for now, as I wanted a service that could be independently modified and easily extended to handle other ingestion use cases in the future.


TL;DR 🚀

  • The Requirement🤯: Build a high-throughput Node.js service to handle massive patient data streams (150 KB blobs) from thousands of concurrent devices, at up to 700 calls/sec, while keeping costs low.

  • The Solution✨: My approach combined strategic tool choices with a deep understanding of Node.js and the V8 engine:

    • Fastify: Chosen for its superior speed and low overhead compared to Express, thanks to a schema-based approach that helps V8 optimize code from the start.
    • Streams: Kept the event loop from being blocked by using json-stream-stringify to process large JSON payloads in chunks, maintaining high concurrency.
    • node-rdkafka: Chose this battle-tested wrapper around the native C/C++ librdkafka library for its exceptional throughput compared to other clients.
    • V8 Memory Optimizations: Avoided V8's hidden-class de-optimizations by ensuring objects adhered to a fixed schema and were handled responsibly (passed by reference, not copied). Also used the flatstr library to make strings more memory-efficient before converting them to a Buffer.
  • Result and Analysis✅: The service proved to be highly efficient and cost-effective:

    • Peak Load: Handled up to 700 calls/sec during stress tests.
    • Normal Resource Usage: Stayed low at 90-125 MB of memory and only 50-150 millicores of CPU.

This approach demonstrated that Node.js, when architected with performance in mind, can be a powerful and viable option for building high-throughput data pipelines.


The Requirement

The service's core job was to perform a simple, high-frequency task:

API call -> Validate data -> Serialize Data -> Produce to Kafka
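
In code terms, that flow boils down to something like the skeleton below. The route path and the helper names are illustrative placeholders for this post, not the actual implementation; each piece is covered in the sections that follow.

const fastify = require("fastify")();

// Placeholder stubs; the real implementations are described in the sections below.
const validateBody = (body) => { /* schema validation (Fastify, section 1) */ };
const serializeToBuffer = async (body) => Buffer.from(JSON.stringify(body)); // streamed in section 2
const produceToKafka = async (buffer) => { /* node-rdkafka producer, section 3 */ };

fastify.post("/ingest", async (request, reply) => {
    validateBody(request.body);                            // Validate data
    const buffer = await serializeToBuffer(request.body);  // Serialize data
    await produceToKafka(buffer);                          // Produce to Kafka
    return reply.code(202).send();
});

fastify.listen({ port: 3000 });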

The Solution: A Deep Dive into Key Components

Here's how I built a Node.js service that not only meets these demanding requirements but also remains incredibly efficient.

1. Fastify: The High-Speed Gateway⚡

Fastify is a web framework known for its speed and low overhead. Fastify's core design philosophy leverages V8 optimizations that are often overlooked, allowing it to achieve up to 4x the throughput of Express.

Fastify's schema-based approach is one of the key factors behind its performance, as I'll explain further when we discuss memory optimization.

Express vs Fastify benchmark
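
As a rough sketch of what that schema-based approach looks like in practice (the route and field names here are invented for illustration, not the real payload contract):

const fastify = require("fastify")();

// Declaring the payload shape up front lets Fastify compile a fast validator
// and serializer, and gives V8 a stable object shape to optimize against.
const ingestSchema = {
    body: {
        type: "object",
        required: ["patientId", "timestamp", "samples"],
        properties: {
            patientId: { type: "string" },
            timestamp: { type: "integer" },
            samples: { type: "array", items: { type: "number" } },
        },
    },
};

fastify.post("/ingest", { schema: ingestSchema }, async (request, reply) => {
    // request.body has already been validated against the schema at this point
    return reply.code(202).send({ accepted: true });
});

fastify.listen({ port: 3000 });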


2. Streams: Event Loop's Best Friend🌊

A common pitfall with Node.js is blocking the event loop. Operations like JSON.stringify() on a large object can be computationally expensive and may lead to dropped requests in a high-concurrency environment.

To avoid this, I used streams with the json-stream-stringify library. Instead of processing the entire JSON object at once, this library serializes the data in chunks, ensuring the event loop remains free to handle incoming requests.

const { JsonStreamStringify } = require("json-stream-stringify");
const flatstr = require("flatstr");

// Serialize the large payload in chunks so the event loop stays free
// to accept new requests while this message is being built.
new Promise((resolve, reject) => {
    let stringJson = "";
    const jsonStream = new JsonStreamStringify(veryLargePayload);

    jsonStream.once("error", (err) => {
        // delete message ID from message tracker
        reject(err); // wrap in an application-specific error if needed
    });
    jsonStream.on("data", (chunk) => {
        stringJson += chunk;
    });
    jsonStream.on("end", () => {
        try {
            flatstr(stringJson);
            // convert to Buffer and produce to Kafka
            resolve();
        } catch (err) {
            reject(err);
        }
    });
});

3. node-rdkafka: The Battle-Tested Connector💪

For the Kafka integration, I needed a client that could handle the immense throughput. While KafkaJS is a popular choice, my tests showed that node-rdkafka was superior for this particular use case.

node-rdkafka is a high-performance Node.js client that wraps librdkafka, Apache Kafka's native C/C++ library. This native binding provides exceptional speed and efficiency.

A drawback is that it can occasionally be challenging to install and develop with locally, but for a throughput-intensive application, the performance gains are well worth it.
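
For reference, a minimal producer setup with node-rdkafka looks roughly like the sketch below. The broker address, topic name and tuning values are placeholders, not the production configuration.

const Kafka = require("node-rdkafka");

// Placeholder values; in the real service these come from the request pipeline.
const serializedPayload = "{}";        // the flattened JSON string from section 2
const patientId = "patient-123";       // used as the message key for partitioning

const producer = new Kafka.Producer({
    "metadata.broker.list": "localhost:9092",
    "dr_cb": true,                     // enable delivery reports
    "queue.buffering.max.ms": 50,      // small linger window to batch messages
});

producer.connect();
producer.setPollInterval(100);         // poll regularly so delivery reports fire

producer.on("ready", () => {
    // Values must be Buffers; the key keeps one patient's data on one partition.
    producer.produce("patient-data", null, Buffer.from(serializedPayload), patientId);
});

producer.on("delivery-report", (err, report) => {
    if (err) console.error("Delivery failed:", err);
});

producer.on("event.error", (err) => {
    console.error("Producer error:", err);
});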


4. Memory Optimization: The Flatstr and Pass-by-Reference Approach🧠

To keep the service lean and cost-effective, I focused on memory optimization next.

  • V8's Hidden Classes: A key insight was understanding V8's hidden classes. Every time you alter an object by adding a new property, V8 creates a new hidden class, which can severely de-optimize your code. Ensuring that objects adhered to a pre-defined schema, were not unnecessarily copied, and were passed by reference kept object shapes predictable and optimized, which helped immensely with performance and resource usage (see the sketch after this list).

Fastify's schema-based approach handles this well and is a key factor in its performance. By defining the data structures upfront with JSON Schema, it allows the V8 engine to create optimized hidden classes for objects, avoiding costly runtime de-optimizations.

  • Flatstr: I also used the flatstr library. This module addresses a specific V8 behaviour: when a new string is built by concatenating others, V8 represents it internally as a tree of string fragments, which is memory-inefficient to hand off. flatstr forces V8 to flatten these tree structures into a single contiguous string before it's passed to an external streaming source or converted to a Buffer, saving precious memory and compute resources.
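
To make the hidden-class point concrete, here is a small, simplified sketch (the field names are made up for illustration): construct objects with a fixed shape, avoid adding properties afterwards, and mutate by reference instead of copying.

// Same shape every time: V8 can reuse a single optimized hidden class.
function makeReading(patientId, timestamp, values) {
    return { patientId, timestamp, values };
}

const a = makeReading("p1", Date.now(), [1, 2, 3]);
const b = makeReading("p2", Date.now(), [4, 5, 6]);

// Avoid this: adding a property after construction creates a new hidden class
// and de-optimizes code paths specialized for the original shape.
// a.deviceModel = "XYZ";

// Pass by reference and mutate in place instead of cloning per function call;
// the shape stays the same, so the optimized code path keeps being used.
function touch(reading) {
    reading.timestamp = Date.now();
    return reading;
}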

Results and Analysis 📊

The results of this approach were highly encouraging. The service's resource usage remains incredibly low, making it an exceptionally cheap service to run.

  • Normal Load: 10 calls/sec
  • Peak Load Observed: 230 calls/sec
  • Stress Test: Up to 700 calls/sec
  • Memory Usage: 90 MB (normal) to 125 MB (peak)
  • CPU Usage: 50 millicores (normal) to 150 millicores (peak)

Conclusion ✨

In this project, I set out to prove that Node.js, when architected with a deep understanding of its event loop and the V8 engine, is a powerful and viable platform for high-throughput, data-intensive workloads. By strategically choosing tools like Fastify and node-rdkafka and applying key optimizations with streams and flatstr, I was able to build an incredibly efficient and cost-effective service.

Ultimately, the metrics speak for themselves: the service handles hundreds of requests per second with minimal resource consumption. Node.js is a robust and highly capable choice for many complex, real-world data pipelines.
