DEV Community

Fahim Hasnain Fahad

Scaling Node.js: Handling 1 Million Requests Like a Pro

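JavaScript runs on a single thread: Node.js executes your application code on one main thread driven by an event loop. I/O is handled asynchronously, but by default a single Node.js process still runs all of its request-handling code on one CPU core and leaves the server's other cores idle.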
Figure: Node.js single-threaded approach
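
To make the single-thread point concrete, here is a minimal sketch (not from the original project) showing that CPU-bound work delays everything else in the same process. The helpers blockFor and measureTimerDelay are hypothetical names made up for this illustration.

```javascript
// Busy-wait for roughly `ms` milliseconds to simulate CPU-bound work
// (hypothetical helper, for illustration only).
function blockFor(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // spin: nothing else can run on this thread in the meantime
  }
}

// Schedule a 0 ms timer, then block the thread. Resolves with how late
// the timer callback actually ran, in milliseconds.
function measureTimerDelay(blockMs) {
  return new Promise((resolve) => {
    const scheduledAt = Date.now();
    setTimeout(() => resolve(Date.now() - scheduledAt), 0);
    blockFor(blockMs); // the callback above has to wait for this to return
  });
}

measureTimerDelay(50).then((delay) => {
  console.log(`0 ms timer fired after ~${delay} ms`);
});
```

The timer asks for 0 ms but fires only after the blocking loop finishes. In a server, a request handler that hogs the thread has the same effect on every other pending request.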

Requirements

  1. Node application that can handle millions of requests in a very short time.
  2. Each request's response time shouldn't go over a threshold of 500ms.
  3. System must be fault tolerant.

I found many ways to handle this, but the simplest solution was Node.js's own cluster module.

It uses all the CPU cores of the server by launching multiple instances of the Node.js application that listen on the same port. One primary process forks the child processes (workers, typically one per core) and distributes incoming requests among them. Each worker handles its requests as a standalone instance.
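
As a rough mental model of that distribution: the Node.js documentation describes round-robin as the default scheduling approach on every platform except Windows. The toy function below only simulates that idea. It is not the cluster module's code, and distributeRoundRobin is a made-up name.

```javascript
// Toy model of round-robin dispatch: request i goes to worker i % workerCount.
// This is NOT the cluster module's implementation, only an illustration.
function distributeRoundRobin(requestCount, workerCount) {
  if (!Number.isInteger(workerCount) || workerCount < 1) {
    throw new RangeError('workerCount must be a positive integer');
  }
  const handled = new Array(workerCount).fill(0);
  for (let i = 0; i < requestCount; i++) {
    handled[i % workerCount] += 1;
  }
  return handled; // handled[w] = number of requests given to worker w
}

console.log(distributeRoundRobin(10, 4)); // [ 3, 3, 2, 2 ]
```

With an even spread like this, each worker sees only a fraction of the total load, which is why adding workers can cut queueing on a multi-core machine.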

Figure: Node.js cluster module

Code Implementation

Create an Express application and install the dependencies:

cd project-name
npm init -y 
npm i express mongoose nodemon dotenv

General Approach

Then, in the index.js file, create a typical Express application.

require('dotenv').config();
const express = require('express');
const connectDB = require('./config/database');

const app = express();

app.use(express.json());

connectDB();

app.use("/api/items", require("./routes/items"));

app.get("/", (req, res) => {
  res.json({ message: "Welcome to the API" });
});

const PORT = process.env.PORT || 8000;
app.listen(PORT, () => {
  console.log(`Worker ${process.pid} started - Server running on port ${PORT}`);
});


Now run the application with the nodemon index.js command (use npx nodemon index.js if nodemon is not installed globally).

Then install the loadtest package globally for load testing:

npm install -g loadtest

Now run loadtest against the local server with this command:

loadtest -n 1000000 --rps 10000 http://localhost:8000 #use your port number

This command sends 1M requests at a target rate of up to 10K rps (requests per second).

This is the result:

Target URL:          http://localhost:8000
Max requests:        1000000
Target rps:          10000
Concurrent clients:  3805
Running on cores:    6
Agent:               none

Completed requests:  1000000
Total errors:        43446
Total time:          539.523 s
Mean latency:        97.7 ms
Effective rps:       1853

Percentage of requests served within a certain time
  50%      10 ms
  90%      265 ms
  95%      543 ms
  99%      1450 ms
 100%      4509 ms (longest request)

   -1:   43446 errors

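Almost 44K requests (43,446 of 1,000,000, about 4.3%) failed to get a response. The effective throughput was only 1,853 rps against a target of 10,000, and the 95th percentile latency (543 ms) is already over the 500 ms threshold.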
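
The arithmetic behind the loadtest output above can be checked with a few lines. This is only a sketch: the numbers are copied from that output, while the summarize function and its field names are made up for illustration.

```javascript
// Figures copied from the loadtest output of the single-process run.
const run = {
  totalRequests: 1000000,
  targetRps: 10000,
  totalErrors: 43446,
  totalTimeSeconds: 539.523,
  percentilesMs: { 50: 10, 90: 265, 95: 543, 99: 1450 },
};

const LATENCY_THRESHOLD_MS = 500; // requirement 2 from the list above

// Hypothetical helper: derive the headline numbers from one run.
function summarize(r, thresholdMs) {
  const errorRatePercent = (r.totalErrors / r.totalRequests) * 100;
  const effectiveRps = r.totalRequests / r.totalTimeSeconds;
  // Time the run would take if the target rate were actually sustained.
  const idealDurationSeconds = r.totalRequests / r.targetRps;
  // Percentiles whose latency exceeds the threshold.
  const overThreshold = Object.keys(r.percentilesMs)
    .filter((p) => r.percentilesMs[p] > thresholdMs)
    .map(Number);
  return { errorRatePercent, effectiveRps, idealDurationSeconds, overThreshold };
}

const summary = summarize(run, LATENCY_THRESHOLD_MS);
console.log(summary.errorRatePercent.toFixed(2)); // 4.34
console.log(Math.round(summary.effectiveRps));    // 1853
console.log(summary.idealDurationSeconds);        // 100
console.log(summary.overThreshold);               // [ 95, 99 ]
```

At the target rate the run would have finished in 100 s. It took about 540 s instead, so the single process was saturated well below 10K rps.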

Cluster Module

Let's use the Node.js cluster module now.

In the index file, wrap the existing code in a cluster setup:

require('dotenv').config();
const cluster = require('cluster');
const numCPUs = require('os').cpus().length;
const express = require('express');
const connectDB = require('./config/database');

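// cluster.isPrimary needs Node.js 16+; on older versions use the
// now-deprecated cluster.isMaster instead.
if (cluster.isPrimary) {
    console.log(`Primary ${process.pid} is running`);
    console.log(`Number of CPUs: ${numCPUs}`);

    // Fork one worker per CPU core
    for (let i = 0; i < numCPUs; i++) {
        cluster.fork();
    }

    // Replace any worker that exits so the pool stays at full size
    cluster.on('exit', (worker, code, signal) => {
        console.log(`Worker ${worker.process.pid} died (code: ${code}, signal: ${signal})`);
        cluster.fork();
    });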
} else {
    const app = express();

    // Middleware
    app.use(express.json());

    // Connect to MongoDB
    connectDB();

    // Routes
    app.use('/api/items', require('./routes/items'));

    // Basic route
    app.get('/', (req, res) => {
        res.json({ message: 'Welcome to the API' });
    });

    const PORT = process.env.PORT || 8000;
    app.listen(PORT, () => {
        console.log(`Worker ${process.pid} started - Server running on port ${PORT}`);
    });
}


Here the primary process forks one worker process per available CPU core.
When we run the application, we can see:

Figure: Node.js processes running
We can see it uses all the cores of the server.
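
One caveat about fault tolerance: the exit handler in the cluster code forks a replacement immediately, so a worker that crashes during startup would be restarted in a tight loop. Below is a minimal sketch of throttling restarts. The shouldRestart helper and its limits (5 restarts per 60 seconds) are hypothetical choices, not part of the original project or of Node.js.

```javascript
// Decide whether a crashed worker may be replaced right now.
// `restartTimes` holds timestamps (ms) of earlier restarts. The default
// limits are hypothetical, chosen only for this illustration.
function shouldRestart(restartTimes, now, maxRestarts = 5, windowMs = 60000) {
  const recent = restartTimes.filter((t) => now - t < windowMs);
  return recent.length < maxRestarts;
}

// Sketch of wiring it into the primary process (not executed here):
//
//   const restartTimes = [];
//   cluster.on('exit', (worker, code, signal) => {
//     const now = Date.now();
//     if (shouldRestart(restartTimes, now)) {
//       restartTimes.push(now);
//       cluster.fork();
//     } else {
//       console.error('Too many worker restarts, not forking again');
//     }
//   });

const now = 100000;
console.log(shouldRestart([10000, 95000], now));                      // true
console.log(shouldRestart([96000, 97000, 98000, 99000, 99500], now)); // false
```

Whether to stop forking, back off, or alert someone once the limit is hit depends on the deployment. The point is only that an unconditional fork can hide a crash loop.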

Now let's run the same loadtest again. The result looks like this:

Figure: loadtest result with the Node.js cluster

We can see it got a 100% success rate, which is crazy! The mean latency also dropped from 97.7 ms to 9.9 ms, roughly a tenfold improvement, and 99% of the requests got a successful response in under 200 ms, where it took about 1.5 s previously.

Find the whole codebase in this repo: [https://github.com/fahadfahim13/node-scale.git](https://github.com/fahadfahim13/node-scale.git)
