Hello readers 👋, welcome to the 6th blog of our Node.js journey!
In the last post, we explored the crucial difference between blocking and non-blocking code. We saw that a single blocking call can freeze an entire server, while non-blocking code keeps it responsive. Today, we tackle the question that naturally follows: if Node.js runs JavaScript on just one thread, how on earth does it handle thousands of requests at the same time without falling apart?
It sounds impossible. A single checkout line in a supermarket can only process one customer at a time. Yet Node.js servers routinely manage tens of thousands of concurrent connections. The answer lies in a brilliant combination of the event loop, non-blocking I/O, and background workers. Let's break it down.
The single-threaded nature of Node.js
First, let's be clear: Node.js executes your JavaScript code in a single main thread. That means when you write:
console.log("Step 1");
console.log("Step 2");
These two lines never run simultaneously. Step 1 finishes before Step 2 starts. This single-threaded design simplifies your code because you don't have to worry about two pieces of code modifying the same variable at the same time.
But a traditional server that uses one thread per request (like many Java or PHP setups) would quickly run out of threads under high load. Each thread consumes memory, and context switching between hundreds of threads wastes CPU. Node.js takes a totally different path. It keeps the single thread but never lets it sit idle.
The magic comes from how it delegates work outside that thread. The main thread is like a project manager who never does heavy lifting; they hand tasks off to specialists and only get involved when results are ready.
The event loop: the master coordinator
The event loop is the core mechanism that enables this pattern. It's a continuously running loop that watches two things: the call stack (where functions execute) and the task queue (where callbacks wait). When the stack is empty, the loop picks the next callback from the queue and pushes it onto the stack for execution.
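The stack-then-queue rule can be seen in a tiny sketch: synchronous code always runs to completion on the call stack first, and queued callbacks run only afterwards, even a timer with a 0ms delay.

```javascript
// Synchronous pushes run on the call stack immediately; the timer
// callback is queued and runs only once the stack is empty.
const order = [];

order.push("sync 1");                        // runs on the stack now

setTimeout(() => order.push("callback"), 0); // queued for the event loop

order.push("sync 2");                        // still synchronous

// By the time the event loop runs the timer callback, both synchronous
// pushes have finished: order becomes ["sync 1", "sync 2", "callback"].
```

Even with a delay of 0, the callback cannot jump the line: it waits in the queue until the stack is clear.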
Think of the event loop as a restaurant's head chef. She can only cook one dish at a time (single thread). But she doesn't peel potatoes or wait for water to boil. She delegates those tasks to assistants (background workers). While they work, she moves on to the next order. When an assistant finishes, they call out "Done!" and the chef briefly pauses, plates the component, and continues.
In Node.js, this "delegation" happens whenever you call an asynchronous function like fs.readFile, http.get, or a database query. The main thread dispatches the I/O operation, registers a callback, and moves on to process the next request immediately.
Delegating tasks to background workers
Where does the actual I/O happen? Node.js uses two main mechanisms:
Operating system kernel: For network I/O, the OS provides non-blocking system calls (like epoll on Linux, kqueue on macOS). The main thread registers a socket with the kernel and asks to be notified when data arrives. The kernel handles the waiting, and the main thread carries on with other work. When data is ready, a callback is queued.
The libuv thread pool: For operations that don't have non-blocking OS support (like file system calls or some DNS lookups), libuv, the library that powers the event loop, maintains a pool of background threads. When you call fs.readFile, libuv grabs a thread from the pool, reads the file content on that thread, and then, once done, pushes the callback into the event loop's queue to be executed on the main thread.
In both cases, the main thread never blocks. It just schedules work and handles results. The heavy I/O happens elsewhere.
Handling multiple client requests: a step-by-step view
Imagine a Node.js HTTP server that logs the request, then responds after a short delay (simulating a database lookup). Here's the code:
const http = require("http");

http.createServer((req, res) => {
  console.log("Request received:", req.url);
  setTimeout(() => {
    res.writeHead(200, { "Content-Type": "text/plain" });
    res.end("Response for " + req.url);
    console.log("Response sent:", req.url);
  }, 100);
}).listen(3000, () => {
  console.log("Server running on port 3000");
});
Now, open three tabs and quickly hit the server with different URLs. If Node.js were single-threaded in a blocking way, each request would need to finish before the next one even starts. But what actually happens is:
- Request A arrives. The main thread logs "Request received: /a" and calls setTimeout. This schedules a timer in libuv and immediately returns.
- The main thread is free. Request B arrives instantly. It logs "Request received: /b", sets another timer, and returns.
- Request C arrives, logs, and sets a timer.
- After about 100ms, the timers start firing. libuv queues the callbacks. The event loop picks them up one by one, and the main thread sends responses: "Response sent: /a", then "Response sent: /b", then "Response sent: /c".
All three requests were accepted and processed concurrently, using a single JavaScript thread that never waited for the 100ms delay. The 100ms was spent in the background, not on the main thread.
This is concurrency without parallelism.
Why Node.js scales so well
The secret to Node.js's scalability is that it doesn't dedicate a thread to each connection. In a traditional threaded server, if you have 10,000 concurrent connections, you might have 10,000 threads. Each thread consumes roughly 1 MB of stack space, so that's 10 GB of memory just for thread stacks. Plus, the OS scheduler wastes CPU switching among them.
In Node.js, the memory footprint per connection is minimal: a small amount of state to remember the request, and a callback function. The event loop efficiently manages all of them on a single thread. This allows a Node.js process to comfortably handle tens of thousands of open connections (like WebSocket sessions) on modest hardware.
Moreover, the event loop model naturally matches I/O heavy workloads. Most web servers spend the majority of their time waiting for the database, filesystem, or external APIs. Node.js turns that waiting into an opportunity to serve other requests.
Concurrency vs parallelism: a crucial distinction
It's vital to understand that Node.js provides concurrency, not parallelism, within a single process.
- Concurrency: Multiple tasks are in progress at the same time, but not necessarily executing simultaneously. The single thread switches between them so fast that it looks like they run together.
- Parallelism: Multiple tasks literally run at the same instant on different CPU cores.
Node.js is concurrent through the event loop. But you can achieve parallelism by running multiple Node.js processes (clustering) or using worker threads for CPU-intensive tasks. However, the default model, and the one we're discussing, is single-threaded concurrency.
The head chef analogy makes this clear. The chef is concurrent: she manages several dishes at once, keeping them all moving forward. But she only has two hands; she cannot physically chop and stir at the same instant. She is not parallel. If she needs to chop fifty onions (a CPU-heavy task), she would block the whole kitchen. So she hires an assistant (a worker thread) to do that in parallel. For most server tasks (I/O), the assistants are already built in (the OS and libuv).
Visualizing the single thread handling multiple requests
Here's a mental picture of the flow:
Time 0ms: Request A arrives → main thread logs, starts async I/O, moves on
Time 1ms: Request B arrives → main thread logs, starts async I/O, moves on
Time 2ms: Request C arrives → main thread logs, starts async I/O, moves on
...
Time 100ms: I/O for Request A completes → callback queued
I/O for Request B completes → callback queued
Time 101ms: Event loop picks A's callback → main thread sends response A
Time 102ms: Event loop picks B's callback → main thread sends response B
...
The main thread was blocked for only a tiny fraction of the total time (the logging and response sending). The actual waiting happened in the background, concurrently, for all requests.
Conclusion
Node.js handles multiple requests with a single thread by never waiting. It delegates I/O tasks to the OS kernel or a background thread pool, uses an event loop to manage callbacks, and keeps its main thread free to accept new work. This is a radical departure from the thread-per-connection model and is the reason Node.js excels at building fast, scalable network applications.
To recap:
- Node.js JavaScript runs on one thread, ensuring simplicity and avoiding synchronization bugs.
- The event loop orchestrates concurrency by continuously dispatching callbacks when the stack is empty.
- Time-consuming I/O is handed off to the OS (non-blocking) or to a libuv thread pool, never blocking the main thread.
- Multiple client requests are handled by interleaving their short, non-blocking pieces, giving the illusion of simultaneous processing.
- This model scales well because memory per connection is tiny, and CPU time isn't wasted on thread management.
- Node.js provides concurrency, not parallelism, on a single process. For true parallel computation, you can use clustering or worker threads.
Now that you understand this foundational concept, you're ready to build applications that can handle massive concurrency with ease. In the next post, we'll explore some of the most commonly used built-in modules that Node.js provides to make all this power accessible.
Hope you found this helpful! If you spot any mistakes or have suggestions, let me know. You can find me on LinkedIn and X, where I post more about web development.