
FEConf


A Guide to Debugging Memory Leaks in SSR Environments (Part 2)

This article summarizes the talk A Guide to Debugging Memory Leaks in SSR Environments (Node.js) presented at FEConf 2023. The content will be published in two parts.

In Part 1, we explored what memory leaks are and how to detect them using monitoring tools.

In Part 2, we’ll walk through the process of debugging an actual memory leak and discuss how to resolve it.

All images in this article are from the presentation slides of the same title; therefore, separate attributions are not provided.
You can download the presentation materials from the FEConf 2023 website.

‘Debugging Memory Leaks in SSR Environments (Node.js)’ presented at FEConf2023 / Jihye Park, Frontend Engineer at Toss Place

In this article, we'll get hands-on with debugging and fixing the memory leak issues we covered in Part 1.

Solving Memory Leaks

In our previous elevator analogy, we looked at two ways to solve memory leaks.

  1. Increase heap memory, or
  2. Debug to find the culprit of the memory leak.

Let's take a closer look at these two methods.

Increasing Heap Memory

First, let's explore the method of increasing memory. The thought process of 'We're running out of memory, so let's just add more' is pretty natural. However, will simply increasing heap memory resolve the memory leak?

Image description

Unfortunately, no. Even if you bump up the heap memory, this code will keep leaking memory and eventually bring down your server. Why is that? To understand why, you need to know how Node.js's V8 engine manages memory. The V8 engine uses an algorithm called 'Mark and Sweep' to manage memory effectively: it marks what's in use and sweeps away (cleans up) what's not.

Image description

Data types like arrays, objects, and functions all get their memory allocated from the heap. For simplicity, let's refer to all of these as 'objects'. The garbage collector recursively checks from the root whether these objects are being used. It then collects unnecessary objects that are no longer in use to free up memory space. This is the 'Mark and Sweep' algorithm.

But if an object is being referenced from somewhere—basically if something is still 'holding onto' it—it stays in heap memory. What happens if objects persist like this? To understand this, we need to know about heap memory.
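To make reachability concrete, here's a tiny sketch (illustrative only, not from the talk):

```ts
// Reachable from the root (module scope): the Mark phase marks it,
// so it survives every collection while the module stays loaded.
const cache = { entries: new Array(1_000).fill("kept") };

function handler(): number {
  // Only reachable during this call: once handler() returns,
  // nothing references it any more, so the Sweep phase can reclaim it.
  const temp = new Array(1_000).fill("temporary");
  return temp.length;
}

console.log(handler(), cache.entries.length);
```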

Heap Memory

Node.js's V8 engine divides heap memory into regions to manage it more effectively. Below is a simplified view of the garbage collector's structure that illustrates the V8 engine's memory lifecycle.

Image description

The heap is primarily divided into the Young Generation and the Old Generation, and the garbage collectors are correspondingly split into the Minor GC (Scavenger) and the Major GC (Mark-Sweep & Mark-Compact). A newly declared object is typically allocated in an area of the Young Generation called the 'nursery'. If it survives one garbage collection cycle, it moves to the 'intermediate' (survivor) space, which is still part of the Young Generation. If it survives yet another cycle, the object is finally promoted to the Old Generation. According to V8's documentation, very few objects actually make it this far.
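If you want to see these spaces from inside a running process, Node.js exposes them through the built-in v8 module. A quick sketch:

```ts
import v8 from "node:v8";

// Print how much of each V8 heap space is currently in use.
// 'new_space' roughly corresponds to the Young Generation described above,
// and 'old_space' to the Old Generation.
for (const space of v8.getHeapSpaceStatistics()) {
  const usedMB = (space.space_used_size / 1024 / 1024).toFixed(2);
  console.log(`${space.space_name}: ${usedMB} MB used`);
}
```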

So, what happens if more objects survive and accumulate in this Old Generation area? The V8 engine operates the application by managing these two generations. However, since heap memory has a limited capacity, it will eventually fill up, causing the server to crash.

Let's revisit our previous example code.

Image description

The listItems array, being declared as a global variable, isn't collected by the garbage collector and ends up residing in the Old Generation. At first it occupies only a small area. Then, each time the server receives a request, the loop runs a million times and the length of listItems grows. As this repeats, the array takes up more and more of the Old Generation, until eventually the heap memory is completely full. Then, the server crashes.
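The exact example code only appears in the slides, but the pattern described above looks roughly like this (an Express-style handler is assumed purely for illustration):

```ts
import express from "express";

const app = express();

// Module-scope array: always reachable from the root, so the GC never frees it.
const listItems: number[] = [];

app.get("/", (_req, res) => {
  // Each request pushes a million entries that are never removed,
  // so heap usage keeps growing until the process crashes.
  for (let i = 0; i < 1_000_000; i++) {
    listItems.push(i);
  }
  res.send(`listItems length: ${listItems.length}`);
});

app.listen(3000);
```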

The example above is simple, but when you run into this kind of problem in real code, will merely increasing heap memory solve it? If you search online, you'll easily find answers suggesting that you increase max-old-space-size, like the one shown below.

Image description

Here, max-old-space-size is an option to adjust the capacity of Node.js's Old Generation. In other words, since most objects causing memory leaks reside in the Old Generation, you'll often find advice to increase the Old Generation's capacity, based on the garbage collector structure explained earlier.
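For reference, the flag is passed when starting Node, and you can verify the new limit from inside the process with the built-in v8 module (heap_size_limit is reported in bytes):

```ts
// Start the server with a larger Old Generation, e.g. a 4 GB limit:
//   node --max-old-space-size=4096 index.js
import v8 from "node:v8";

// After the flag above, this should report roughly 4096 MB.
const limitMB = v8.getHeapStatistics().heap_size_limit / 1024 / 1024;
console.log(`heap_size_limit: ${limitMB.toFixed(0)} MB`);
```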

Image description

There are various other causes of memory leaks besides the global variables mentioned above. Common examples include setTimeout or setInterval timers that are never cleared, and closures. With closures, variables declared in an outer execution context can remain referenced by an inner function long after that context has finished, which can keep a significant amount of heap memory allocated.
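As a rough illustration of those two patterns (the names and sizes are made up for the example):

```ts
// 1) A timer that is never cleared keeps its callback, and everything the
//    callback closes over, alive for the lifetime of the process.
function startLeakyTimer() {
  const hugeBuffer = new Array(1_000_000).fill("data");
  setInterval(() => {
    // The closure references hugeBuffer, so it can never be collected
    // unless the interval is cleared with clearInterval().
    console.log(hugeBuffer.length);
  }, 1_000);
}

// 2) A closure that escapes its scope keeps the captured variable alive.
function makeLogger() {
  const history: string[] = [];
  return (msg: string) => {
    history.push(msg); // grows for as long as the returned function is retained
    console.log(msg);
  };
}
```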

In all of these scenarios the amount of heap memory required can keep growing without bound, so indiscriminately increasing heap memory is not always a real solution.

Debugging Memory Leaks

Debugging Method

Now, let's explore solutions through debugging. If you run Node.js using the inspect option, like node --inspect index.js, and open your browser's developer tools, you'll see a green Node.js icon. I primarily use Chrome's inspect menu (accessible via chrome://inspect). This menu lists currently running local servers, and you can select the desired server to open its inspect window.

Image description

When you open the inspect window for debugging, you'll see a screen like the one below. In the Memory panel there is a profiling record button (the circle icon). To measure memory usage over a specific period, you need to start and stop a recording, and this button is what lets you do that.

Image description

Next to it is a 'clear all profiles' button (a circle with a line through it), which deletes all recorded profiling results, and below that is the list of completed profiles.

Image description

Finally, there's a trash can icon button, which manually triggers the garbage collector. Typically, before starting memory profiling, you trigger the garbage collector to stabilize the memory state and then begin profiling.

Image description

Next, let's look at the most important area. Chrome supports three profiling types in the Memory tab:

First, Heap snapshot. This type records the heap memory usage at a specific moment. If you select this type, the Take snapshot button below becomes active. Clicking it captures the heap memory state at that instant. If you have code that you've significantly improved in terms of memory or performance, you can record snapshots before and after the change to compare the two. This comes in handy when you have a pretty good idea where the leak might be happening and want to zero in on that specific area.
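As a side note, if clicking the button is inconvenient (for example, on a remote server), Node.js can also write a snapshot from code with v8.writeHeapSnapshot(); the resulting .heapsnapshot file can then be loaded back into the Memory tab:

```ts
import v8 from "node:v8";

// Writes a .heapsnapshot file into the current working directory and
// returns the generated filename.
const file = v8.writeHeapSnapshot();
console.log(`Heap snapshot written to ${file}`);
```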

Image description

Second is Allocation instrumentation on timeline. This is a very useful and frequently used feature. It periodically records heap memory and shows how much heap memory is being used over time as a graph during the recording. When you suspect a memory leak and start debugging, you can use this timeline to observe how memory usage changes over time. It's often difficult to pinpoint where a memory leak is occurring, and this feature is very helpful in such cases.

The final type is Allocation sampling. This is similar to the second one but is mainly used when you need to record for a much longer duration. Recording every moment can cause overhead, so this method debugs using sampled information over a longer period. When you press the record button, it might not look like much is happening, but it is recording. When you stop recording, it shows the sampled information.

We've briefly looked at these three debugging methods. You can choose the type that best suits your situation, but the second type (Allocation instrumentation on timeline) is generally used most often. When you encounter a memory leak error, starting your debugging with the second type will likely help you identify the problem area quickly.

Finding the Memory Leak Culprit

When you start debugging using the second method described earlier (Allocation instrumentation on timeline), a graph appears at the top. This graph shows how much heap memory Node.js is using while requests are being processed.

Image description

The height of the graph at any point represents the total amount of heap memory allocated at that moment. Grey areas indicate memory that has been reclaimed by the garbage collector, while blue areas represent memory still occupying the heap. So if a lot of blue persists, it can indicate a significant leak; if most of the graph turns grey, memory is being reclaimed properly.

You can also drag to select a specific section of the graph and view only that interval. It's a good idea to examine the entire range first and then zoom in on areas where the blue graph persists. You may find the same objects appearing again and again across those intervals, like an intersection of the suspicious sections. Focusing your debugging on those objects can save a lot of time.

The graph makes it easy to see when a memory leak is occurring, but it doesn't tell you who the culprit is. There's a lot to know to find the culprit, but the following concepts are essential:

Image description

These are Shallow Size and Retained Size. Shallow Size is the size of the object itself in bytes. Retained Size is the total size of memory that would be freed if the object itself were deleted, including all objects it exclusively references (and so on, recursively). Additionally, there's a metric called Distance, which indicates how far an object is from the garbage collector's root. A larger distance value can suggest a higher likelihood of being part of a memory leak. It's more of a supplementary indicator rather than a precise debugging metric, but it's useful for quick reference.

In our previous example code, the globally declared listItems array was referenced within a function. The garbage collector couldn't reclaim this variable, so it continued to occupy heap memory. The function itself is simple, but within the actual execution context, this variable can grow to a very large size, requiring a significant amount of heap memory. In other words, its Retained Size can be much larger than its Shallow Size. You should focus your debugging on objects or variables where the Retained Size is significantly larger than the Shallow Size.
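To make the distinction concrete, here's a small sketch (the names are illustrative):

```ts
// `holder` is a tiny object (small Shallow Size), but it is the only thing
// referencing its big array, so deleting it would free the array as well:
// its Retained Size is roughly the size of the whole array.
const holder = { bigArray: new Array(1_000_000).fill("y") };

// `alias` also points at a big array, but that array is reachable through
// `shared` as well, so deleting `alias` would free almost nothing; its
// Retained Size stays close to its Shallow Size.
const shared = new Array(1_000_000).fill("x");
const alias = { ref: shared };

console.log(holder.bigArray.length, alias.ref.length);
```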

Image description

In the inspect menu for the code we wrote earlier, if you sort by Retained Size in descending order, you can see the memoryLeakFunction I created. Select this object, and in the Retainers tab below, you can see the chain of references that keeps this object allocated in heap memory. You can click on filenames to see which file and what code is involved. You can also identify paths to specific libraries you're using.

Image description

Scrolling down in the Retainers view or the object list, you'll see listItems. Looking at its details, you can see that its Retained Size is much larger than its Shallow Size. It can be a tedious process, but by finding and fixing these objects one by one to bring those numbers down, you can track down the culprit of the memory leak and resolve it.

The image below is an inspect screen from my actual experience. After filtering out unnecessary parts, I found the section shown. Usually, you'd see a path like node_modules, but in my case, since I manage Node packages with Yarn Berry, you can see a .yarn path. I was able to identify which file in this path was causing the issue, confirmed the memory leak originated there, fixed it, and deployed the changes, which resolved the memory leak.

Image description

Before the fix, we had that sawtooth pattern I mentioned earlier, but after deployment, you can see it flattened out into a nice, stable line. It has maintained a state without memory leaks since then.

Image description

using - A Keyword to Ease the Pain

Lastly, there's a keyword I'd like to introduce: using. If you've worked with C#, this will look familiar. Python has something similar too (the with statement). Many of you might have seen it with the recent announcement of TypeScript 5.2, but it's not actually a new concept.

Image description

It's an existing concept in C#, and it has already reached Stage 3 in JavaScript's TC39 (the group that manages JavaScript standards), so we might see it in native JavaScript soon.

Simply put, if you declare a variable with using instead of var, let, or const, the object's [Symbol.dispose]() method (or, with await using, its [Symbol.asyncDispose]() method) is called at the end of the variable's scope, allowing for cleanup. You can remove event listeners you registered, release database connections, or manage the lifecycle of resources like streams that need to be opened and then closed.
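For example, a timer could be wrapped so that it cleans itself up when the variable goes out of scope. A sketch, assuming TypeScript 5.2+ with the esnext.disposable lib (or a Symbol.dispose polyfill):

```ts
function startPolling(fn: () => void, ms: number): Disposable {
  const id = setInterval(fn, ms);
  return {
    [Symbol.dispose]() {
      // Runs automatically when the using-declared variable leaves its scope.
      clearInterval(id);
    },
  };
}

function handleRequest() {
  using poller = startPolling(() => console.log("tick"), 1_000);
  // ... do some work while polling ...
} // poller[Symbol.dispose]() is called here, so the interval never leaks
```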

Image description

If we modify the previous example code to use using, it looks like this. We declare the variable with using instead of const, and for using to work, the object needs to implement the Disposable interface (i.e., have a [Symbol.dispose]() method). The function body stays the same, and the cleanup logic is added in the dispose method.
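The slide itself isn't reproduced here, but the pattern is roughly the following (a sketch, not the presenter's exact code):

```ts
function createListItems() {
  const items: number[] = [];
  return {
    items,
    [Symbol.dispose]() {
      // Cleanup added via the dispose method: drop the data so the GC can reclaim it.
      items.length = 0;
    },
  };
}

function handleRequest() {
  using list = createListItems(); // `using` instead of `const`
  for (let i = 0; i < 1_000_000; i++) {
    list.items.push(i);
  }
  // ... use list.items to render the response ...
} // list[Symbol.dispose]() runs here, so nothing accumulates between requests
```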

Image description

If you run the modified code, unlike before, heap memory usage remains stable. By applying using in this way, we can write code that avoids this kind of memory leak.

Wrapping Up

Let me conclude by summarizing what we've discussed.

  1. Debugging approaches for server and client environments,
  2. How to use the inspect option in a server environment and profile heap memory usage with the allocation timeline,
  3. How to find the objects causing a memory leak by comparing Shallow Size and Retained Size, and
  4. How the upcoming using keyword (explicit resource management) may help prevent memory leaks in the first place.

Image description

I hope that when you run into memory leaks in your day-to-day work, these techniques will help you track them down, figure out what's causing them, and make your apps run better.

Thank you.
