Hey there! If you've ever used Node.js, you've probably heard the terms "non-blocking I/O," "single-threaded," or "event loop." We often use Node.js as a black box we put JavaScript in, and performance comes out. But what's really happening under the hood?
What is the event loop? How can Node be "single-threaded" but handle thousands of connections? What's V8, and how is it different from libuv?
If you're a developer looking to truly master Node.js, or just a curious mind who wants to peek behind the curtain, this guide is for you. We're not just scratching the surface. We're diving deep into Node.js's internal workings.
This is the ultimate guide to Node.js internals. Let's get started.
1. The V8 Engine
The V8 engine is the high-performance JavaScript and WebAssembly engine created by Google and used by both Chrome and Node.js. At its core, V8 is responsible for taking JavaScript source code, parsing it, converting it into an internal representation, and ultimately compiling it into highly optimized machine code that runs directly on the CPU. Unlike older JavaScript engines that relied heavily on slow interpreters, V8 uses a modern architecture consisting of a parser, an interpreter called Ignition, and an advanced optimizing compiler called TurboFan, which work together to provide fast startup times and aggressive runtime optimizations.
It manages memory through a generational garbage collector, uses hidden classes and inline caching to speed up property lookups, and applies several optimization heuristics at runtime based on how your code behaves. Understanding how V8 compiles, optimizes, and executes JavaScript is fundamental to understanding Node.js performance, because almost everything that happens in a Node application, including closures, async callbacks, microtasks, event loop execution, and memory allocation, ultimately runs inside the V8 execution environment.
Ignition Interpreter & TurboFan Compiler
In the old days, JavaScript was purely interpreted, which was slow. Then, compilers came along, which were faster but had a high startup cost. V8 uses a hybrid approach.
Ignition Interpreter
When your code first runs, it's fed to the Ignition interpreter. Ignition's job is to start executing the code as quickly as possible. It turns the code into bytecode, an intermediate, lower-level language closer to machine instructions but still cheap enough to produce for a quick startup. It doesn't waste time on optimization. While it's running, Ignition also gathers profiling data, like which functions are called often or which types of variables are used. This means your app starts fast, without waiting for all code to be compiled.
TurboFan Optimizing Compiler
When Ignition flags a piece of code as "hot" (e.g., a function that has been run a thousand times, think loops and frequently called functions), it passes that code and its profiling data to TurboFan. TurboFan is an optimizing compiler. It takes its time, studies the profiling data, and makes smart assumptions to generate hyper-optimized machine code, delivering the best of both worlds: immediate startup and high performance over time.
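Want to see this tiering happen? Here's a tiny sketch: run it with V8's --trace-opt flag (Node passes it straight through to V8) and you should see TurboFan pick up the hot function. The filename is just an example, and the exact log output varies across V8 versions.
// Run with: node --trace-opt hot.js (log format varies by V8 version)
function add(a, b) {
  return a + b; // always called with numbers, so type feedback stays stable
}

// Ignition interprets the early calls and records type feedback;
// once add() is "hot", TurboFan compiles it to optimized machine code.
for (let i = 0; i < 1e6; i++) {
  add(i, i + 1);
}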
Hidden Classes
JavaScript objects look simple on the surface: just key-value pairs. But underneath, V8 uses a concept called hidden classes (sometimes called "maps" or "shapes") to optimize property access speed.
When you create an object in JavaScript, like:
const user = {};
user.name = "Alice";
user.age = 30;
V8 doesn’t just store properties randomly. Instead, it creates a hidden class that acts as an internal blueprint describing the layout of the object’s properties in memory. Step-by-step:
- When you do const user = {}, V8 creates an initial hidden class (say C0) with no properties.
- When you add user.name = "Alice", it creates a new hidden class C1 that extends C0 by adding a property named name at a specific memory offset.
- When you add user.age = 30, it creates hidden class C2, with both name and age properties, each at fixed offsets.
Creating another object with the same property additions in the same order causes V8 to reuse the existing hidden classes, enabling fast, predictable property lookups based on known memory layouts.
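As a quick illustration (the C0/C1/C2 names are just labels from the walkthrough above, not real V8 identifiers), here's how two construction orders produce different hidden-class chains:
function makeUser(name, age) {
  const user = {};  // starts with hidden class C0
  user.name = name; // transition C0 -> C1
  user.age = age;   // transition C1 -> C2
  return user;
}

const a = makeUser('Alice', 30);
const b = makeUser('Bob', 25); // same order, so the C0 -> C1 -> C2 chain is reused

// Adding the same properties in a different order creates a
// separate transition chain, so this object has a different shape:
const c = {};
c.age = 40;
c.name = 'Carol';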
Inline Caching
Caching is the process of storing the result of an expensive operation (like a database query or complex calculation) so that future requests for that same data can be served instantly without repeating the work.
Inline Caching is a specific optimization technique where the engine stores the result of a property lookup directly within the compiled machine code at the "call site" (the specific line where the function is called), eliminating the need to look up where a property lives in memory every time.
In V8, inline caching acts as a turbocharger for hidden classes. When V8 accesses a property for the first time, it calculates the memory location and "patches" the code with a shortcut (a stub) pointing directly to that memory offset. If subsequent objects share the same hidden class (the call site stays monomorphic), V8 uses this shortcut to skip the lookup entirely, achieving near-native speeds. However, if a call site keeps seeing objects of different shapes (it becomes polymorphic, and eventually megamorphic), V8 is forced to abandon these optimized stubs and revert to slower, generic lookup methods.
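Here's a rough sketch of what that means in practice (the speed difference is real, though its exact size depends on the V8 version):
function getX(point) {
  return point.x; // this property access is an inline-cache "call site"
}

// Monomorphic: every object passed in shares one hidden class,
// so the cached offset shortcut keeps working.
for (let i = 0; i < 100000; i++) {
  getX({ x: i, y: i });
}

// Feeding the same call site different shapes makes it polymorphic,
// forcing V8 toward more generic (slower) lookups:
getX({ x: 1, y: 2, z: 3 }); // extra property, different hidden class
getX({ y: 2, x: 1 });       // different order, different hidden class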
Memory Layout
V8 manages memory by dividing it primarily into two areas: the stack and the heap, with the heap further subdivided for optimal garbage collection.
The stack is a small, managed memory region where V8 stores static data such as function call frames, primitive values and pointers to objects on the heap. This memory is tightly managed by the operating system and follows a Last-In-First-Out (LIFO) pattern, making it very fast for local variable access and function calls.
The heap, on the other hand, is where dynamic data lives: all JavaScript objects, arrays, functions, and closures are allocated here. Because JavaScript is dynamic and objects can change shape and size at runtime, the heap must be flexible and efficiently managed by the garbage collector to avoid memory leaks and fragmentation.
Inside the heap, V8 implements a generational memory model by dividing it into:
- New Space (Young Generation): This is a smaller area dedicated to newly created objects. Most of these objects are short-lived, such as temporary variables or intermediate results. Because many objects become unreachable quickly, V8 runs a fast, frequent garbage collection (called minor GC) here to reclaim memory efficiently.
- Old Space (Old Generation): Objects that survive several minor garbage collections in the new space are promoted to the old space. This area holds longer-lived objects like caches, application data structures, or closures persisting across many function calls. Garbage collection in old space is more comprehensive but less frequent, involving techniques like mark-sweep and mark-compact to optimize memory usage over time.
This separation allows V8 to optimize around common JavaScript object lifecycles: quick cleanup of transient objects in new space and thorough maintenance of persistent objects in old space, ensuring better performance and minimal interruption during program execution.
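You can watch the two generations at work with V8's --trace-gc flag (log format varies by version). Here's a minimal sketch that produces both short-lived and long-lived allocations; the filename is just an example:
// Run with: node --trace-gc gc-demo.js
// Expect frequent "Scavenge" lines (minor GC in new space) and
// occasional mark-sweep/mark-compact lines (major GC in old space).

// Short-lived: each iteration makes the previous object garbage.
let last;
for (let i = 0; i < 1e6; i++) {
  last = { value: i };
}

// Long-lived: retained objects survive minor GCs and get promoted.
const cache = [];
for (let i = 0; i < 1e5; i++) {
  cache.push({ value: i });
}

console.log(process.memoryUsage()); // rss, heapTotal, heapUsed, ...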
2. Node.js Architecture: More Than Just V8
Node.js is not V8. Node.js is a runtime that uses V8.
Node.js is designed for building scalable network applications by leveraging a lightweight, event-driven architecture. It runs on the V8 JavaScript engine, which compiles JavaScript code into fast machine code, enabling efficient execution on the server side.
The Single-Threaded Event Loop
Unlike traditional multi-threaded servers that spawn multiple threads per request, Node.js operates on a single main thread known as the event loop. This event loop continuously monitors an event queue where asynchronous events, such as incoming HTTP requests, timers, or I/O completions, are placed.
Because the event loop handles tasks one at a time and delegates blocking tasks to background threads, Node.js efficiently manages thousands of concurrent connections with low overhead.
libuv and Thread Pool for Async I/O
To handle blocking operations like file system access or database calls without blocking the event loop, Node.js uses libuv, a C library that provides an abstraction for asynchronous I/O.
Libuv maintains a thread pool (default size 4) that executes heavy, blocking operations in parallel. When these operations complete, their callbacks are queued back on the event loop, allowing your JavaScript code to continue processing.
This design keeps the main thread free and responsive, achieving non-blocking concurrency in a single-threaded execution environment.
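Here's a small sketch of the difference in practice, reading this very file both ways:
const fs = require('fs');

// Blocking: runs on the main thread; the event loop stalls until done.
const data = fs.readFileSync(__filename, 'utf8');
console.log('sync read finished, length:', data.length);

// Non-blocking: libuv performs the read on a pool thread, then queues
// the callback back onto the event loop when it's done.
fs.readFile(__filename, 'utf8', (err, contents) => {
  if (err) throw err;
  console.log('async read finished, length:', contents.length);
});

console.log('this line logs before the async read completes');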
Event-Driven Programming Model
Node.js popularized the event-driven style of programming, where events emitted by the system or users trigger asynchronously executed callback functions. Core Node modules and frameworks like Express.js make extensive use of this pattern to build real-time features such as websockets, APIs, and streaming data applications.
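At its simplest, the pattern looks like this with Node's built-in EventEmitter (the 'placed' event is just a made-up example):
const { EventEmitter } = require('events');

const orders = new EventEmitter();

// Register a listener: it runs every time 'placed' is emitted.
orders.on('placed', (id) => {
  console.log(`order ${id} received`);
});

orders.emit('placed', 42); // triggers the callback above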
3. Understanding the Node.js Event Loop
The event loop is the core mechanism that enables Node.js to handle asynchronous operations on a single thread. Instead of blocking the main thread waiting for operations to complete (like I/O), Node.js uses the event loop to schedule callbacks and manage concurrency efficiently.
Node.js has an event queue where callbacks from asynchronous operations are placed. The event loop continuously checks this queue and processes callbacks one by one, making Node.js non-blocking.
Here's a simple example demonstrating how asynchronous callbacks enter the event loop:
console.log('Start');
setTimeout(() => {
  console.log('Timeout callback');
}, 0);
console.log('End');
Output:
Start
End
Timeout callback
Even with a timeout of 0, the callback runs after the synchronous code because it waits for the event loop to pick it up.
The Six Phases of the Event Loop
The event loop executes in a cycle consisting of six distinct phases:
- Timers: executes callbacks scheduled by setTimeout() and setInterval() whose timer thresholds have elapsed.
- Pending Callbacks: executes I/O callbacks deferred to the next loop iteration, such as some TCP errors.
- Idle, Prepare: internal operations used by Node.js to prepare the event loop.
- Poll: retrieves new I/O events (like a network request completing) and executes their callbacks (like the (req, res) handler in an HTTP server). If the loop has nothing else to do, it will block and "poll" the OS here, waiting for new events to arrive.
- Check: executes callbacks scheduled by setImmediate().
- Close Callbacks: executes callbacks for close events, like a socket or handle being destroyed.
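One classic consequence of these phases: at the top level, the relative order of setTimeout(..., 0) and setImmediate() can vary between runs, but inside an I/O callback, setImmediate() always wins, because the check phase directly follows the poll phase. A quick sketch:
const fs = require('fs');

// Top level: order here is nondeterministic (it depends on how fast
// the loop starts relative to the 0ms timer threshold).
setTimeout(() => console.log('timers phase'), 0);
setImmediate(() => console.log('check phase'));

// Inside an I/O callback (poll phase), check always comes before
// the next timers phase, so 'immediate' reliably logs first:
fs.readFile(__filename, () => {
  setTimeout(() => console.log('timeout'), 0);
  setImmediate(() => console.log('immediate'));
});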
Macrotasks vs. Microtasks
This is where most people get confused. The 6 phases above handle Macrotasks. A setTimeout callback is a macrotask. A setImmediate callback is a macrotask. An I/O callback is a macrotask.
Microtasks are different. They live in their own queues and have higher priority. They are:
- process.nextTick() callbacks
- Promise callbacks (.then(), .catch(), .finally(), and await)
Here is the golden rule:
After every Macrotask, and before the Event Loop moves to the next phase, it will completely drain the Microtask queue.
Let's trace:
- Event Loop enters the timers phase.
- It finds one setTimeout callback (a Macrotask) and executes it.
- STOP! Before moving to the pending callbacks phase, the loop checks the Microtask queue.
- It finds 10 Promise .then() callbacks. It runs all 10.
- The Microtask queue is now empty.
- Now the loop moves on to the pending callbacks phase.
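You can observe this priority directly, including the detail that process.nextTick() callbacks run even before promise microtasks:
console.log('sync');

setTimeout(() => console.log('macrotask: timeout'), 0);

Promise.resolve().then(() => console.log('microtask: promise'));
process.nextTick(() => console.log('microtask: nextTick'));

// Output:
// sync
// microtask: nextTick
// microtask: promise
// macrotask: timeout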
How Promises "Jump" Phases
This Microtask behavior is why promises can "jump" the line.
const fs = require('fs');

// Macrotask (Poll Phase)
fs.readFile('file.txt', () => {
  console.log('1. I/O');

  // Macrotask (Check Phase)
  setImmediate(() => console.log('2. Immediate'));

  // Microtask (Promise)
  Promise.resolve().then(() => console.log('3. Promise'));
});

// Macrotask (Timer Phase)
setTimeout(() => console.log('4. Timeout'), 0);
Output:
4. Timeout
1. I/O
3. Promise
2. Immediate
Why?
- The setTimeout (timer) and fs.readFile (I/O) are initiated.
- The loop hits the timers phase. It finds the setTimeout callback and executes it. 4. Timeout is logged.
- The loop moves through pending, idle, and into the poll phase.
- It finds the completed fs.readFile callback (a Macrotask). It executes it. 1. I/O is logged.
- Inside that callback, a setImmediate (Macrotask for check phase) and a Promise.resolve (Microtask) are queued.
- STOP! The fs.readFile macrotask is done. The loop must drain the microtask queue before moving to the check phase.
- It finds the promise callback and executes it. 3. Promise is logged.
- The microtask queue is empty.
- The loop moves to the check phase. It finds the setImmediate callback and executes it. 2. Immediate is logged.
Avoiding Event Loop Starvation:
Heavy use of microtasks (e.g., using too many process.nextTick() calls) can starve the event loop, preventing it from moving to the I/O phases. This blocks I/O and timers from executing, causing application delays.
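Here's a deliberately pathological sketch (don't ship this) that shows total starvation: each nextTick schedules another, so the loop never reaches the timers or poll phases.
// The microtask queue never drains, so the event loop never advances
// and the timer below never fires.
function spin() {
  process.nextTick(spin);
}
spin();

setTimeout(() => console.log('you will never see this'), 100);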
The event loop orchestrates the asynchronous behavior that powers Node.js's concurrency under the hood. Understanding its phases, microtask vs macrotask queues, and nuances like setImmediate() helps build highly performant and scalable applications.
4. libuv Internals
So, libuv gives us the Event Loop. But how does it actually handle I/O? How does it wait for 10,000 network connections at once without blocking?
Event Demultiplexer
This is the core. Libuv doesn't check every socket one by one; that would be slow (it's essentially what the older select system call does). Instead, it uses the most efficient mechanism available on the host OS:
- epoll on Linux
- kqueue on macOS and BSD
- IOCP (I/O Completion Ports) on Windows
It gives the OS kernel a list of all the sockets and files it cares about and says, "Hey, I'm going to sleep. Wake me up only when one of these has data to read, is ready to write, or has an error."
This Event Demultiplexer is a single C function call (like epoll_wait) that efficiently waits for any event to happen. When it returns, it gives libuv a list of only the events that are ready. This is why Node can handle immense I/O concurrency with a single thread.
Handles & Requests
Inside libuv, everything is one of two things:
- Handles: These represent long-lived objects that can perform operations. A TCP server (net.Server) is a handle. A timer (setTimeout) is a handle.
- Requests: These represent short-lived, one-off operations. A fs.readFile operation is a request. A dns.lookup is a request.
When you call fs.readFile, Node's C++ bindings create a file-system request object (uv_fs_t in libuv), give it your callback, and hand it to libuv.
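A quick way to feel the difference from JavaScript (port 0 just asks the OS for any free port):
const net = require('net');
const fs = require('fs');

// A TCP server is a long-lived handle: it keeps the process alive
// until it is explicitly closed.
const server = net.createServer().listen(0, () => {
  console.log('handle active on port', server.address().port);
  server.close(); // releasing the handle lets the process exit
});

// fs.readFile is a short-lived request: do the work, fire the
// callback once, and it's gone.
fs.readFile(__filename, () => console.log('request completed'));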
Queues
Libuv maintains all the queues for the Event Loop phases. When the Event Demultiplexer says, "Socket 5 has data," libuv finds the handle for Socket 5, executes its C-level read operation, and then queues the JavaScript callback (with the data) to be run in the poll phase.
The Thread Pool: Handling the Heavy Lifting
Your JavaScript code and the Event Loop run on a single main thread. But Node.js itself (and libuv) is not single-threaded.
Libuv maintains a Thread Pool (by default, 4 threads) to handle operations that are unavoidably blocking or CPU-intensive.
Why 4 Threads?
It's just a default. It was a good "guess" that works for most 4-core CPUs. You can change this by setting the UV_THREADPOOL_SIZE environment variable before your Node process starts.
UV_THREADPOOL_SIZE=8 node my_app.js
Which Operations Use the Thread Pool?
This is a critical distinction.
- Network I/O does NOT use the thread pool. Modern OS-level APIs (epoll, kqueue) are already non-blocking. Libuv handles network I/O on the main thread via the Event Demultiplexer.
- Blocking I/O and CPU-Bound tasks DO use the thread pool. This includes:
- Asynchronous fs module operations (e.g., fs.readFile, fs.stat), because file system access is (on most platforms) a blocking OS call.
- Several CPU-intensive crypto functions (e.g., crypto.pbkdf2, crypto.randomBytes).
- dns.lookup (but not dns.resolve, which is network-based).
- The async zlib methods for compression/decompression.
Performance Considerations
Imagine you have a 4-core server (default 4 threads) and you get 5 requests at once to hash a password using crypto.pbkdf2.
- The first 4 requests will be dispatched to the 4 threads in the pool.
- The 5th request must wait for one of the first 4 to finish before it can even start.
- While this is happening, any fs.readFile calls will also have to wait for a free thread.
This is a bottleneck. Your event loop might be free, but your thread pool is saturated. Increasing the pool size might help, but the real, modern solution for CPU-bound work is to use Worker Threads (which are separate Node.js runtimes, not from the libuv thread pool) to run your JS in parallel.
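You can reproduce the bottleneck in a few lines (the timings are illustrative and depend on your hardware and the iteration count):
const crypto = require('crypto');

const start = Date.now();
for (let i = 1; i <= 5; i++) {
  crypto.pbkdf2('password', 'salt', 300000, 64, 'sha512', (err) => {
    if (err) throw err;
    console.log(`hash ${i} done after ${Date.now() - start}ms`);
  });
}
// Typical result: hashes 1-4 finish around the same time, while hash 5
// takes roughly twice as long because it waited for a free pool thread.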
Worker threads:
These are separate JavaScript threads exposed explicitly via the Node.js worker_threads module. Unlike the libuv thread pool, worker threads allow running JavaScript code in parallel threads managed by the user. Each worker has its own event loop and memory space, communicating with the main thread via messaging.
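A minimal single-file sketch (the same script runs as both the main thread and the worker; fib() is just a stand-in for any CPU-heavy work):
const {
  Worker, isMainThread, parentPort, workerData,
} = require('worker_threads');

if (isMainThread) {
  // Main thread: spawn a worker running this same file.
  const worker = new Worker(__filename, { workerData: 35 });
  worker.on('message', (result) => console.log('fib(35) =', result));
  console.log('main thread stays free while the worker crunches');
} else {
  // Worker thread: its own V8 isolate, event loop, and memory.
  const fib = (n) => (n < 2 ? n : fib(n - 1) + fib(n - 2));
  parentPort.postMessage(fib(workerData));
}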
Never block the event loop with heavy computation. Delegate it! Use the thread pool or proper worker threads. Always measure and tune thread pool size based on workload.
How Everything Works Together in Node.js
JavaScript Execution with the V8 Engine
Your Node.js application code is executed by the V8 engine, which compiles JavaScript into fast machine code, enabling efficient synchronous execution on the main thread.
Single-Threaded Event Loop Orchestrates Concurrency
The main thread runs the event loop, which continuously polls the event queue for incoming asynchronous events and callbacks to execute one by one.
Delegation to libuv for Asynchronous I/O
When your code makes asynchronous calls like file system access, network requests, or timers, Node.js delegates these to libuv. libuv either registers them with the OS's native asynchronous APIs (epoll, kqueue, IOCP) or pushes blocking tasks to its internal thread pool.
libuv Thread Pool for Blocking Operations
For operations that cannot be performed asynchronously by the OS, libuv uses a configurable thread pool (default 4 threads) to run these tasks in parallel without blocking the main event loop.
Callback Queueing and Event Loop Processing
Once libuv completes I/O or blocking tasks, it queues the associated callbacks back onto the event loop's queue for execution on the main thread.
Worker Threads for Parallel JavaScript Execution
Separately, Node.js supports worker threads that run JavaScript code in parallel, each with its own event loop and memory. These are created explicitly via the worker_threads module for CPU-intensive or parallelizable workloads.
Event-Driven Architecture Enables Responsive Apps
Events emitted by the system or users trigger registered callbacks, facilitating non-blocking, reactive application design suited for real-time scenarios.
JS code → V8 (executes sync JS)
↓
Async API call
↓
libuv registers I/O or delegates to thread pool
↓
Blocking tasks → libuv thread pool threads
↓
I/O or task completion signals libuv
↓
Callbacks queued in event loop
↓
Main thread runs callbacks via event loop
Each component seamlessly collaborates to maximize Node.js scalability and responsiveness, handling thousands of concurrent operations with minimal overhead.
Conclusion
Node.js combines the fast V8 engine, an efficient single-threaded event loop, and libuv’s async I/O with a thread pool to handle blocking tasks smoothly. Worker threads enable parallel JavaScript execution for CPU-heavy jobs. This architecture lets Node.js scale easily and stay responsive, making it ideal for real-time and I/O-intensive applications.
Understanding this helps developers write better-performing, scalable apps that fully leverage Node.js’s strengths.
If you found this helpful, don’t forget to like, comment, and follow to see more blogs. Let’s keep learning together. Happy Coding!