DEV Community

Pratham
How Node.js Handles Multiple Requests with a Single Thread

One chef, one kitchen, a hundred orders — and nobody's food is late.


If I told you a restaurant had only one chef handling a hundred orders simultaneously — and every order came out on time — you'd think I was lying. One person can't cook a hundred dishes at once. That's physically impossible.

But what if the chef wasn't doing all the cooking? What if the chef's job was to manage the kitchen — start dishes, delegate tasks to ovens and timers, and plate food when it's ready? Suddenly, one chef handling a hundred orders doesn't sound crazy. It sounds efficient.

That's exactly how Node.js works. One thread. Thousands of requests. No thread-per-request overhead. And it works because Node.js doesn't try to do everything itself — it delegates and coordinates.

This was one of those concepts in the ChaiCode Web Dev Cohort 2026 that seemed counterintuitive at first but made perfect sense once I saw it in action. Let me break it down.


Thread vs Process — The Basics

Before we dive in, let's clarify two terms you'll see everywhere:

Process — an entire running program. When you run node app.js, that's a process. It has its own memory, its own resources.

Thread — a unit of execution within a process. A process can have one thread (single-threaded) or many threads (multi-threaded).

Process (your Node.js app)
┌──────────────────────────────┐
│                              │
│   Thread (main)              │
│   → Your JavaScript runs     │
│     here. ONE at a time.     │
│                              │
│   Memory, variables, code    │
│   — all belong to this       │
│     process                  │
│                              │
└──────────────────────────────┘

Traditional server stacks (a Java servlet container, Apache serving PHP) typically dedicate a thread or process to each incoming request. Node.js uses one thread for all requests. That difference changes everything.


The Chef-Handling-Orders Analogy

Let me build this analogy fully because it maps perfectly to how Node.js works.

Traditional Server = One Chef Per Order

Order 1 → Chef 1 starts cooking → WAITS for oven → plates food → serves
Order 2 → Chef 2 starts cooking → WAITS for oven → plates food → serves
Order 3 → Chef 3 starts cooking → WAITS for oven → plates food → serves
Order 4 → No available chef! → WAIT IN LINE

Each chef handles ONE order from start to finish.
Most of the time, chefs are STANDING AROUND waiting for ovens.
More orders = more chefs needed = more expensive.

Node.js = One Head Chef + Kitchen Helpers

Order 1 → Head Chef preps ingredients → puts in Oven A → MOVES ON
Order 2 → Head Chef preps ingredients → puts in Oven B → MOVES ON
Order 3 → Head Chef preps ingredients → starts a timer → MOVES ON
Order 4 → Head Chef preps ingredients → puts in Oven C → MOVES ON

  *DING!* Oven B ready → Head Chef plates Order 2 → serves
  *DING!* Timer done    → Head Chef plates Order 3 → serves
  *DING!* Oven A ready  → Head Chef plates Order 1 → serves
  *DING!* Oven C ready  → Head Chef plates Order 4 → serves

One chef. Many orders. The ovens (background workers) do the slow work.
The chef NEVER stands around waiting. Always prepping or plating.

In this analogy:

| Kitchen | Node.js |
| --- | --- |
| Head Chef | Main thread (your JavaScript) |
| Ovens, timers, mixers | Background workers (libuv threads) |
| Order queue | Task queue (callbacks waiting) |
| Kitchen bell (DING!) | Event (I/O operation completed) |
| Plating food | Running the callback |

Single-Threaded Nature of Node.js

Let's be precise about what "single-threaded" means in Node.js:

Your JavaScript code runs on a single thread. That's it. One line at a time. One function at a time. One callback at a time. There is no second thread running your JavaScript concurrently.

// These run ONE AT A TIME, never in parallel
app.get("/user", (req, res) => {
  // Handle user request
});

app.get("/products", (req, res) => {
  // Handle product request
});

// If both requests arrive simultaneously,
// Node.js processes their callbacks ONE AFTER THE OTHER
// But so fast that it feels simultaneous to the clients

What "Single-Threaded" Does NOT Mean

It does NOT mean Node.js only has one thread total. Internally, libuv (the C library that handles I/O) maintains a thread pool (default: 4 threads, configurable via the UV_THREADPOOL_SIZE environment variable) for operations that can't be done asynchronously at the OS level:

Node.js internals:

  ┌─────────────────────────┐
  │  Main Thread            │  ← Your JavaScript
  │  (single, event loop)   │
  └───────────┬─────────────┘
              │ delegates I/O to:
  ┌───────────┴─────────────────────────────┐
  │        libuv Thread Pool                │
  │  ┌──────┐ ┌──────┐ ┌──────┐ ┌──────┐   │
  │  │ T1   │ │ T2   │ │ T3   │ │ T4   │   │
  │  │(file)│ │(DNS) │ │(file)│ │(idle)│   │
  │  └──────┘ └──────┘ └──────┘ └──────┘   │
  │                                         │
  │  + OS-level async (network I/O)         │
  └─────────────────────────────────────────┘

Your code is single-threaded. The system underneath is not. That's the secret.


How Multiple Client Requests Are Handled

Let's trace exactly what happens when three requests arrive at a Node.js server:

const express = require("express");
const app = express();

app.get("/api/user", async (req, res) => {
  // `db` is assumed here: any async database client configured elsewhere
  const user = await db.query("SELECT * FROM users WHERE id = 1");
  res.json(user);
});

app.listen(3000);

Three clients request /api/user at the same time:

Time 0ms — Client A requests /api/user
─────────────────────────────────────────
  Main Thread:
    1. Receive request A
    2. Start database query for A → delegated to system
    3. Don't wait! Thread is FREE for the next request.

Time 1ms — Client B requests /api/user
─────────────────────────────────────────
  Main Thread:
    4. Receive request B
    5. Start database query for B → delegated to system
    6. Don't wait! Thread is FREE again.

Time 2ms — Client C requests /api/user
─────────────────────────────────────────
  Main Thread:
    7. Receive request C
    8. Start database query for C → delegated to system
    9. Don't wait! Thread is FREE.

Time 50ms — Database responds with A's data
─────────────────────────────────────────
  Event loop:
    10. Callback for A enters the queue
    11. Main thread picks it up → sends response to Client A

Time 52ms — Database responds with C's data
─────────────────────────────────────────
  Event loop:
    12. Callback for C enters the queue
    13. Main thread picks it up → sends response to Client C

Time 55ms — Database responds with B's data
─────────────────────────────────────────
  Event loop:
    14. Callback for B enters the queue
    15. Main thread picks it up → sends response to Client B

Total: All 3 clients got responses in ~55ms
       (NOT 150ms, which is what sequential would take)

Single Thread Handling Multiple Requests — Visual

Main Thread Timeline:
──────────────────────────────────────────────────────────

  0ms    1ms    2ms         50ms   52ms   55ms
  │      │      │            │      │      │
  ▼      ▼      ▼            ▼      ▼      ▼
 [A]    [B]    [C]   idle   [A✓]   [C✓]   [B✓]
 start  start  start  ...   respond respond respond
 query  query  query        to A    to C    to B

  ├──────────────────────────┤
   Thread is FREE during      
   this entire time.          
   Accepting new requests!    

Background Workers Timeline:
──────────────────────────────────────────────────────────

  Worker 1: ████████████████████████████ A's DB query (50ms)
  Worker 2: █████████████████████████████████ B's DB query (55ms)
  Worker 3: ██████████████████████████████ C's DB query (52ms)

  Workers handle the slow I/O in parallel.
  Main thread stays fast and responsive.

Event Loop + Worker Thread Interaction Flow

Here's the complete interaction between all the parts:

┌────────────┐
│   Client   │ ──→ HTTP Request arrives
└─────┬──────┘
      │
      ↓
┌──────────────────────────────────────────────────────┐
│                    MAIN THREAD                        │
│                                                      │
│  1. Parse the request                                │
│  2. Run your route handler (JavaScript)              │
│  3. Hit an I/O operation (db.query, fs.read, etc.)   │
│  4. Delegate it → hand off to libuv                  │
│  5. Move on to next request (DON'T WAIT)             │
│                                                      │
└──────────────────┬───────────────────────────────────┘
                   │
                   │ delegate I/O
                   ↓
┌──────────────────────────────────────────────────────┐
│              LIBUV (Background)                       │
│                                                      │
│  Thread Pool          │    OS Async I/O              │
│  ┌────┐ ┌────┐        │    ┌──────────────┐          │
│  │ T1 │ │ T2 │        │    │ Network I/O  │          │
│  │file│ │ DNS│        │    │ (handled by  │          │
│  └────┘ └────┘        │    │  OS kernel)  │          │
│  ┌────┐ ┌────┐        │    └──────────────┘          │
│  │ T3 │ │ T4 │        │                              │
│  │file│ │idle│        │                              │
│  └────┘ └────┘        │                              │
│                        │                              │
│  When done → push callback to queue                  │
└──────────────────┬───────────────────────────────────┘
                   │
                   │ callback ready
                   ↓
┌──────────────────────────────────────────────────────┐
│                  EVENT LOOP                           │
│                                                      │
│  "Main thread free? → Run this callback."            │
│                                                      │
└──────────────────┬───────────────────────────────────┘
                   │
                   ↓
┌──────────────────────────────────────────────────────┐
│                  MAIN THREAD (again)                  │
│                                                      │
│  6. Run the callback with the I/O result             │
│  7. Send HTTP response to client                     │
│                                                      │
└──────────────────────────────────────────────────────┘

The main thread starts the work and finishes the work. The slow part in the middle is handled by someone else entirely.


Concurrency, Not Parallelism

This distinction is crucial and comes up in interviews constantly.

Parallelism: Multiple things happening at the exact same time on different cores/threads.

Concurrency: Multiple things being managed at the same time, but not necessarily running simultaneously.

Node.js achieves concurrency with its JavaScript code. It manages thousands of requests at once, switching between them so fast that clients never notice.

PARALLELISM (multi-threaded Java server):

  Core 1: ████████████ Request A processing
  Core 2: ████████████ Request B processing
  Core 3: ████████████ Request C processing
  Core 4: ████████████ Request D processing

  4 things happening LITERALLY at the same time.

CONCURRENCY (Node.js):

  Main Thread:
  [A][B][C][D]...[A✓][C✓][B✓][D✓]
   ↑  ↑  ↑  ↑     ↑   ↑   ↑   ↑
  start each    respond as results
  request       arrive from workers

  1 thread juggling many tasks.
  Appears simultaneous to clients.

However, the background workers (libuv thread pool, OS async I/O) do work in parallel. So Node.js uses parallelism for I/O — just not for your JavaScript code.


Why Node.js Scales Well

Let's quantify the difference:

Memory Comparison

| Concurrent Requests | Traditional (thread-per-request) | Node.js (event loop) |
| --- | --- | --- |
| 100 | 100 threads × 2MB = 200MB | 1 thread + callbacks = ~20MB |
| 1,000 | 1,000 threads = 2GB | Same 1 thread = ~30MB |
| 10,000 | 10,000 threads = 20GB 💥 | Same 1 thread = ~50MB |
| 100,000 | Impossible without clustering | Possible with tuning |

Startup Cost

  • Thread creation: ~1ms per thread + memory allocation
  • Node.js callback: ~microseconds, negligible memory

Context Switching

Multi-threaded servers spend significant CPU time switching between threads. Node.js has no such overhead for your JavaScript — there is only one thread, so there is nothing to switch between.

The Real-World Impact

This is why:

  • LinkedIn went from 30 servers (Ruby) to 3 servers (Node.js)
  • PayPal doubled requests per second after switching from Java
  • Walmart handled 500 million Black Friday page views without downtime

When Single-Threaded Becomes a Problem

The single-threaded model has one major weakness: CPU-intensive tasks block the event loop.

// ❌ This blocks EVERYTHING for ~3 seconds
app.get("/heavy", (req, res) => {
  // CPU-intensive computation
  let sum = 0;
  for (let i = 0; i < 5_000_000_000; i++) {
    sum += i;
  }
  res.json({ sum });
});

// While /heavy is computing, NO other request can be handled.
// The event loop is STUCK.

Solutions

  1. Worker Threads — offload CPU work to a separate thread:
const { Worker } = require("worker_threads");

app.get("/heavy", (req, res) => {
  const worker = new Worker("./heavy-computation.js");
  worker.on("message", (result) => {
    res.json({ sum: result });
  });
});
// Main thread stays free!
  2. Cluster Mode — run multiple Node.js processes:
const cluster = require("cluster");
const os = require("os");

if (cluster.isPrimary) {
  // Fork one worker per CPU core
  const numCPUs = os.cpus().length;
  for (let i = 0; i < numCPUs; i++) {
    cluster.fork();
  }
} else {
  // Each worker runs the server
  app.listen(3000);
}
  3. Break up computation — use setImmediate to yield back to the event loop between chunks.
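That last option can be sketched with a made-up `sumInChunks` helper. Between chunks, setImmediate hands control back to the event loop, so pending I/O callbacks (other requests!) get a chance to run:

```javascript
// Hypothetical helper: sum 0..total-1 in chunks, yielding between chunks
function sumInChunks(total, chunkSize, callback) {
  let sum = 0;
  let i = 0;

  function doChunk() {
    const end = Math.min(i + chunkSize, total);
    for (; i < end; i++) sum += i;

    if (i < total) {
      setImmediate(doChunk); // yield: other callbacks can run here
    } else {
      callback(sum);
    }
  }

  doChunk();
}

sumInChunks(1_000_000, 100_000, (sum) => {
  console.log("sum:", sum); // the event loop stayed responsive throughout
});
```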

Let's Practice: Hands-On Assignment

Part 1: See Concurrency in Action

const http = require("http");

const server = http.createServer((req, res) => {
  const requestId = Date.now();
  console.log(`[${requestId}] Request received`);

  // Simulate database query (100ms)
  setTimeout(() => {
    console.log(`[${requestId}] Response sent`);
    res.end(`Request ${requestId} handled!\n`);
  }, 100);
});

server.listen(3000, () => {
  console.log("Server on http://localhost:3000");
  console.log("Open multiple browser tabs at once!");
});

Open 5 tabs simultaneously — all 5 responses arrive in ~100ms, not 500ms.

Part 2: Experience Event Loop Blocking

const http = require("http");

const server = http.createServer((req, res) => {
  if (req.url === "/fast") {
    res.end("Fast response!\n");
  }

  if (req.url === "/slow") {
    // Block the event loop for 5 seconds
    const end = Date.now() + 5000;
    while (Date.now() < end) {}
    res.end("Slow response (blocked for 5 seconds)\n");
  }
});

server.listen(3000, () => {
  console.log("Try /fast and then /slow");
  console.log("While /slow runs, /fast won't respond either!");
});

Visit /slow, then quickly try /fast in another tab. Notice how /fast is also blocked — proof that one CPU-heavy task blocks the entire thread.

Part 3: Non-Blocking Alternative

const http = require("http");

const server = http.createServer((req, res) => {
  if (req.url === "/fast") {
    res.end("Fast response!\n");
  }

  if (req.url === "/slow") {
    // Non-blocking delay — doesn't block the event loop!
    setTimeout(() => {
      res.end("Slow response (waited 5 seconds, but didn't block!)\n");
    }, 5000);
  }
});

server.listen(3000, () => {
  console.log("Try /slow, then /fast — /fast responds instantly!");
});

Now /fast responds immediately even while /slow is waiting. The event loop stays free.


Key Takeaways

  1. Node.js runs your JavaScript on one thread, but delegates I/O operations to background workers (libuv thread pool + OS async I/O). It's not truly "one thread does everything."
  2. The chef analogy: one head chef manages many orders by delegating cooking to ovens (background workers) and plating food when the kitchen bell rings (callbacks).
  3. Node.js achieves concurrency, not parallelism — it manages many tasks simultaneously by switching between them efficiently, not by running them at the same time.
  4. Node.js scales well because it avoids thread-per-request overhead — 10,000 connections need ~50MB instead of ~20GB.
  5. The weakness is CPU-intensive tasks — they block the single thread. Use Worker Threads or Cluster mode to solve this.

Wrapping Up

The single-threaded model sounds like a limitation, but it's actually Node.js's superpower. By not creating threads for every request, Node.js avoids the memory overhead, the context switching cost, and the complexity of multi-threaded programming. Instead, one thread orchestrates everything — starting I/O, moving on, handling callbacks — and the result is a server that scales to thousands of concurrent connections with minimal resources.

I'm learning all of this through the ChaiCode Web Dev Cohort 2026 under Hitesh Chaudhary and Piyush Garg. Understanding how Node.js handles multiple requests was the moment where its architecture stopped being abstract theory and became something I could explain in an interview. If this concept clicks for you, you're in a really strong position.

Connect with me on LinkedIn or visit PrathamDEV.in. More articles on the way.

Happy coding! 🚀


Written by Pratham Bhardwaj | Web Dev Cohort 2026, ChaiCode
