Proficiency with the Node.js API can get you going fast, but a deep understanding of the memory footprint of Node.js programs can take you further.

Let's kick things off by taking a peek at our memory usage with `process.memoryUsage()`, updating every second:
```javascript
setInterval(() => {
  console.log('Memory Usage:', process.memoryUsage());
}, 1000);
```
Since the output is in bytes, it's not user-friendly. Let's spruce it up by formatting the memory usage into MB:
```javascript
const convertToMB = value => {
  return (value / 1024 / 1024).toFixed(2) + ' MB';
};

function formatMemoryUsageInMB(memUsage) {
  return {
    rss: convertToMB(memUsage.rss),
    heapTotal: convertToMB(memUsage.heapTotal),
    heapUsed: convertToMB(memUsage.heapUsed),
    external: convertToMB(memUsage.external)
  };
}

const logInterval = setInterval(() => {
  const memoryUsageMB = formatMemoryUsageInMB(process.memoryUsage());
  console.log(`Memory Usage (MB):`, memoryUsageMB);
}, 1000);
```
Now, we can get the following output every second:
```
Memory Usage (MB): {
  rss: '30.96 MB', // The actual OS memory used by the entire program, including code, data, shared libraries, etc.
  heapTotal: '6.13 MB', // The memory area occupied by JS objects, arrays, etc., dynamically allocated by Node.js
  // V8 divides the heap into young and old generations for different garbage collection strategies
  heapUsed: '5.17 MB',
  external: '0.39 MB'
}
Memory Usage (MB): {
  rss: '31.36 MB',
  heapTotal: '6.13 MB',
  heapUsed: '5.23 MB',
  external: '0.41 MB'
}
```
We all know that the V8 engine's memory usage is limited, not only by the OS's memory-management and resource-allocation policies but also by V8's own settings.

Using `os.freemem()`, we can see how much free memory the OS has, but that doesn't mean it's all up for grabs for a Node.js program:

```javascript
const os = require('os');

console.log('Free memory:', os.freemem());
```
On 64-bit systems, V8's default maximum old-space size in Node.js has historically been around 1.4 GB. This means that even if your OS has more memory available, V8 won't automatically use more than this limit.
Tip: This limit can be changed by setting environment variables or passing flags when starting Node.js. For example, if you want V8 to use a larger heap, you can use the `--max-old-space-size` option:

```shell
node --max-old-space-size=4096 your_script.js
```
This value should be tuned to your actual deployment. For instance, a single machine with plenty of memory and a fleet of small-memory machines deployed in a distributed setup will call for very different settings.
Let's run a test by stuffing an array with data indefinitely until memory overflows and see when it happens.
```javascript
const array = [];

while (true) {
  for (let i = 0; i < 100000; i++) {
    array.push(i);
  }
  const memoryUsageMB = formatMemoryUsageInMB(process.memoryUsage());
  console.log(`Memory Usage (MB):`, memoryUsageMB);
}
```
This is what we get when we run the program directly. After adding data for a while, the program crashes.
```
Memory Usage (MB): {
  rss: '2283.64 MB',
  heapTotal: '2279.48 MB',
  heapUsed: '2248.73 MB',
  external: '0.40 MB'
}
Memory Usage (MB): {
  rss: '2283.64 MB',
  heapTotal: '2279.48 MB',
  heapUsed: '2248.74 MB',
  external: '0.40 MB'
}

#
# Fatal error in , line 0
# Fatal JavaScript invalid size error 169220804
#
#
#
#FailureMessage Object: 0x7ff7b0ef8070
```
Confused? Isn't the limit 1.4 GB? Why is it using over 2 GB? Actually, the 1.4 GB figure is a historical V8 limit that applied to early V8 versions and certain configurations. In modern Node.js and V8, the default heap limit is adjusted automatically based on system resources, and in some cases far more than 1.4 GB may be used, especially when dealing with large data sets or running memory-intensive operations.
When we set the memory limit to 512 MB, it overflows when rss hits around 996 MB.
```
Memory Usage (MB): {
  rss: '996.22 MB',
  heapTotal: '993.22 MB',
  heapUsed: '962.08 MB',
  external: '0.40 MB'
}
Memory Usage (MB): {
  rss: '996.23 MB',
  heapTotal: '993.22 MB',
  heapUsed: '962.09 MB',
  external: '0.40 MB'
}

<--- Last few GCs --->
[22540:0x7fd27684d000]     1680 ms: Mark-sweep 643.0 (674.4) -> 386.8 (419.4) MB, 172.2 / 0.0 ms  (average mu = 0.708, current mu = 0.668) allocation failure; scavenge might not succeed
[22540:0x7fd27684d000]     2448 ms: Mark-sweep 962.1 (993.2) -> 578.1 (610.7) MB, 240.7 / 0.0 ms  (average mu = 0.695, current mu = 0.687) allocation failure; scavenge might not succeed

<--- JS stacktrace --->

FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory
```
In summary, to be more precise, Node.js's memory limit refers to the heap memory limit, which is the maximum memory that can be occupied by JS objects, arrays, etc., allocated by V8.
Does the size of the heap memory determine how much memory a Node.js process can occupy? No! Keep reading.
Can I Put a 3GB File into Node.js Memory?
We saw in the test that the array could only grow a bit past 2 GB before the program crashed. So, if I have a 3 GB file, can't I load it into Node.js memory all at once?

You can!

We saw an `external` field in `process.memoryUsage()`'s output: memory occupied by the Node.js process but not allocated by V8. As long as the 3 GB file goes there, V8's heap limit doesn't apply. How? Use `Buffer`. Buffer is implemented as a C++ extension module of Node.js, and it allocates memory through C++ rather than as JS objects and data.
Here's a demo:
```javascript
setTimeout(() => {
  let buffer = Buffer.alloc(1024 * 1024 * 3000); // ~3 GB
}, 3000);
```
Even after allocating 3 GB this way, our program keeps running smoothly, and the Node.js process ends up occupying over 5 GB of memory. That's because this external memory is not limited by Node.js but by the operating system's limits on memory allocated to the process. So don't go wild: even Buffers can run out of memory. The right way to handle large data is still Streams.
In Node.js, the lifecycle of a Buffer's underlying memory is tied to its JavaScript handle object. When the JavaScript reference to a Buffer is dropped, the V8 garbage collector marks the handle as collectible, but the Buffer's underlying memory is not released immediately. Typically it is freed when the C++ extension's destructor runs (for example, as part of Node.js's garbage collection process), and that may not be fully synchronized with V8's garbage collection.
```
Memory Usage (MB): {
  rss: '2392.73 MB',
  heapTotal: '2392.57 MB',
  heapUsed: '2359.93 MB',
  external: '3000.41 MB'
}
Memory Usage (MB): {
  rss: '2392.75 MB',
  heapTotal: '2392.57 MB',
  heapUsed: '2359.94 MB',
  external: '3000.41 MB'
}
Memory Usage (MB): {
  rss: '2392.75 MB',
  heapTotal: '2392.57 MB',
  heapUsed: '2359.94 MB',
  external: '3000.41 MB'
}
```
In summary: Node.js memory usage consists of JS heap memory (governed by V8 and its garbage collector) plus memory allocated natively by C++ (such as Buffers).
Why Is the Heap Memory Segregated into New and Old Generations?
The generational garbage collection strategy is highly prevalent in modern programming language implementations: similar generational collectors can be found in Ruby, .NET, and Java. Garbage collection often causes a "stop the world" pause, which inevitably impacts program performance, and the generational design exists precisely to optimize for that.
- Divergent object lifespans: during development, a significant portion of variables are temporary, serving specific local computations. Such short-lived objects are best handled by Minor GC, that is, new-generation GC. Objects in the new generation are collected primarily via the Scavenge algorithm, which bisects that memory into two halves, From and To (a classic space-for-time tradeoff; thanks to the objects' short survival time, these spaces don't need to consume much memory).
When memory is allocated, it takes place within From. During garbage collection, the live objects in From are inspected and copied to To, followed by the release of non-live objects. In the subsequent round of collection, the live objects in To are replicated to From, at which point To morphs into From and vice versa. With each garbage collection cycle, From and To are swapped. This algorithm replicates only live objects during the copying process and thereby averts the generation of memory fragments.
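The From/To copying described above can be modeled with a deliberately simplified toy. Real Scavenge operates on raw semispace memory, not JavaScript arrays; the `scavenge` function and `isLive` predicate are our own names for illustration:

```javascript
// Toy model of Scavenge: copy live objects From -> To, abandon the rest.
function scavenge(fromSpace, isLive) {
  const toSpace = [];
  for (const obj of fromSpace) {
    if (isLive(obj)) toSpace.push(obj); // live objects are copied to To
    // dead objects are simply left behind and freed when From is reset
  }
  return toSpace; // after the swap, To becomes the new From
}

let fromSpace = [
  { id: 1, live: true },
  { id: 2, live: false },
  { id: 3, live: true }
];
fromSpace = scavenge(fromSpace, o => o.live);
console.log(fromSpace.map(o => o.id)); // survivors only: [ 1, 3 ]
```

Because survivors are copied contiguously into To, the algorithm leaves no gaps behind, which is why Scavenge avoids fragmentation.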
So, how is the liveness of a variable determined? Reachability analysis comes into play. Consider the following objects as an example:
- `globalObject`: the global object.
- `obj1`: an object directly referenced by `globalObject`.
- `obj2`: an object referenced by `obj1`.
- `obj3`: an isolated object without any references from other objects.

In the context of reachability analysis:

- `globalObject`, being a root object, is inherently reachable.
- `obj1`, being referenced by `globalObject`, is also reachable.
- `obj2`, being referenced by `obj1`, is reachable as well.
- In contrast, `obj3`, lacking any reference path from the root object or other reachable objects, is judged unreachable and thus eligible for recycling.
Admittedly, reference counting can serve as an auxiliary means. Nevertheless, in the presence of circular references, it fails to accurately ascertain the true liveness of objects.
In the old generation memory, objects are generally less active. However, when the old generation memory becomes full, it triggers the cleanup of the old generation memory (Major GC) through the Mark-Sweep algorithm.
The Mark-Sweep algorithm comprises two phases: marking and sweeping. In the marking phase, the V8 engine traverses all objects in the heap and tags the live ones. In the sweeping phase, only the unmarked objects are cleared. The merit of this algorithm is that the sweeping phase consumes relatively less time since the proportion of dead objects in the old generation is relatively small. However, its drawback is that it only clears without compacting, which may result in a discontinuous memory space, making it inconvenient to allocate memory for large objects.
This shortcoming gives rise to memory fragmentation, necessitating the employment of another algorithm, Mark-Compact. This algorithm shifts all live objects to one end and then eradicates the invalid memory space on the right side of the boundary in one fell swoop, thereby obtaining a complete and continuous available memory space. It resolves the memory fragmentation issue that might be caused by the Mark-Sweep algorithm, albeit at the cost of consuming more time in moving a large number of live objects.
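The compaction idea can be illustrated with a toy that slides live slots to one end of an array standing in for the heap (`markCompact` is our own sketch, with `null` marking objects that died in the mark phase):

```javascript
// Toy Mark-Compact: slide live objects left so free space is one contiguous block.
function markCompact(heap) {
  let free = 0; // boundary between compacted live objects and free space
  for (let i = 0; i < heap.length; i++) {
    if (heap[i] !== null) {   // "marked" live object
      heap[free++] = heap[i]; // move it left, next to the previous survivor
    }
  }
  heap.length = free; // everything past the boundary is reclaimed at once
  return heap;
}

console.log(markCompact(['a', null, 'b', null, 'c'])); // [ 'a', 'b', 'c' ]
```

After the slide, all free space sits beyond one boundary, so a large allocation no longer fails for lack of a contiguous hole; the cost is the extra work of moving every survivor.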
If you find this post useful, please give it a thumbs up. :D