DEV Community

JillThornhill

Beyond Leak Suspects: Advanced Techniques for Deep Dive Memory Leak Analysis

Memory leaks can cause havoc in production. Even if they don’t crash the program, they can drastically degrade response times and overall performance.

Most memory leaks can be found fairly easily with a good heap dump analyzer, since the tool instantly highlights memory leak suspects. This is a great starting point, but to troubleshoot elusive issues, we need to go a little deeper.

This article discusses what we can do if the leak suspect chart doesn’t give us the answers we need.

What is a Memory Leak, and What is a Heap Dump?

A memory leak occurs when unnecessary objects remain in memory, in spite of the best efforts of the garbage collector (GC). The diagram below shows the difference between a healthy GC memory pattern, and a pattern indicating a memory leak.

Fig: Healthy GC vs Memory Leak Pattern

In the first chart, GC is consistently bringing down memory usage. In the second, although GC temporarily reduces memory congestion, memory usage is building up over time, until the system becomes unresponsive and may finally crash.
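The difference between the two patterns can be expressed as a simple check on heap usage sampled right after each GC cycle. The following is a minimal sketch, not a production detector: the HeapTrend class, the sample values and the 20% threshold are all assumptions made for illustration.

```java
import java.util.Arrays;

/** Hypothetical helper: flags a rising post-GC heap floor in a series of samples. */
final class HeapTrend {

    /**
     * Returns true when the average post-GC heap usage in the second half of
     * the samples exceeds the first-half average by more than the given
     * fraction (for example 0.2 for 20%).
     */
    static boolean floorIsRising(long[] postGcUsedMb, double threshold) {
        if (postGcUsedMb.length < 4) {
            return false; // too few samples to draw any conclusion
        }
        int mid = postGcUsedMb.length / 2;
        double early = Arrays.stream(postGcUsedMb, 0, mid).average().orElse(0);
        double late = Arrays.stream(postGcUsedMb, mid, postGcUsedMb.length).average().orElse(0);
        return late > early * (1 + threshold);
    }

    public static void main(String[] args) {
        long[] healthy = {210, 205, 215, 208, 212, 207}; // made-up values, in MB
        long[] leaking = {210, 260, 330, 410, 480, 560}; // made-up values, in MB
        System.out.println("healthy rising? " + floorIsRising(healthy, 0.2));
        System.out.println("leaking rising? " + floorIsRising(leaking, 0.2));
    }
}
```

In the healthy series, GC keeps returning memory usage to roughly the same level. In the leaking series, each cycle leaves more behind than the last.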

A heap dump is a snapshot of heap memory at a given moment. Since it’s usually a very large binary file, we need tools to analyze it for memory leaks. In this article, we’ll be using the popular HeapHero to illustrate this.
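One way to capture a heap dump from inside a running application is through the HotSpot diagnostic MXBean. The sketch below assumes a HotSpot-based JDK; the HeapDumper class and the app-heap.hprof file name are illustrative choices, not part of HeapHero.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper that writes a heap dump of the current JVM. */
final class HeapDumper {

    /**
     * Writes an .hprof heap dump to the given path.
     * With liveOnly=true, only objects that are still reachable are dumped.
     */
    static Path dump(Path target, boolean liveOnly) throws IOException {
        // dumpHeap refuses to overwrite an existing file, so remove any old dump first.
        Files.deleteIfExists(target);
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        bean.dumpHeap(target.toString(), liveOnly);
        return target;
    }

    public static void main(String[] args) throws IOException {
        Path file = dump(Paths.get("app-heap.hprof"), true);
        System.out.println("Wrote " + file + " (" + Files.size(file) + " bytes)");
    }
}
```

The resulting .hprof file can then be loaded into a heap dump analyzer.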

The Object Reference Tree

The best tools identify the largest objects in memory as leak suspects, because the cause of a leak is almost always connected to one of the top three or four biggest objects. Unfortunately, knowing which objects are largest doesn’t always tell us why they are holding so much memory.

An object reference tree is a tree structure that records, for each object in the heap, which objects refer to it (its parents) and which objects it refers to (its children), as illustrated in the diagram below.

Fig: Object Reference Tree

We can see that Object A holds references to Objects B, C and D; Object B holds references to Objects E and F; and Object D holds a reference to Object G.

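The Shallow Heap is the space occupied by the object on its own. The Retained Heap is the total heap space that would be freed if the object were garbage collected: the object itself plus everything that is kept alive only through it, which in this tree means the object and its descendants.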

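If Object B is identified as a leak suspect, the problem may actually be found amongst its children, E and F. We can determine whether the problem is likely to be found in the object itself, or in its children, by comparing the size of the shallow heap to the retained heap. If the shallow heap is only a tiny fraction of the retained heap, the memory is being held by the children rather than by the object itself.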
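The comparison can be illustrated with a toy model of the tree above. This is a minimal sketch under simplifying assumptions: the HeapNode class is hypothetical, the byte sizes are made up, and each object has exactly one parent, so retained size is simply the sum over the subtree. Real analyzers work on the full object graph.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a heap object; all sizes are made-up illustration values. */
final class HeapNode {
    final String name;
    final long shallowBytes;
    final List<HeapNode> children = new ArrayList<>();

    HeapNode(String name, long shallowBytes, HeapNode... kids) {
        this.name = name;
        this.shallowBytes = shallowBytes;
        for (HeapNode kid : kids) {
            children.add(kid);
        }
    }

    /** Retained size in this simplified tree: own size plus everything below it. */
    long retainedBytes() {
        long total = shallowBytes;
        for (HeapNode child : children) {
            total += child.retainedBytes();
        }
        return total;
    }

    /** Fraction of the retained size that is held by the object itself. */
    double shallowShare() {
        return (double) shallowBytes / retainedBytes();
    }

    public static void main(String[] args) {
        HeapNode e = new HeapNode("E", 48);
        HeapNode f = new HeapNode("F", 900_000); // the child that actually holds the memory
        HeapNode b = new HeapNode("B", 32, e, f);
        HeapNode c = new HeapNode("C", 64);
        HeapNode d = new HeapNode("D", 40, new HeapNode("G", 128));
        HeapNode a = new HeapNode("A", 24, b, c, d);

        System.out.println("B shallow=" + b.shallowBytes + " retained=" + b.retainedBytes());
        System.out.println("B shallow share=" + b.shallowShare());
        System.out.println("A retained=" + a.retainedBytes());
    }
}
```

Here B is tiny on its own but retains almost all of the memory, so the next step would be to look at its children, in this case F.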

The problem could also be that its parent, Object A, is not releasing it once it’s no longer needed. To find the leak, we need to browse up and down the object reference tree. Fortunately, with HeapHero, this is simple.

For a demonstration, see the video Finding Memory Leaks with HeapHero. We can move up and down the object tree by requesting either incoming or outgoing references, as shown in the screenshot below.

Fig: Scrolling Through the Object Reference Tree
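The case where a parent fails to release a child often looks like a long-lived collection that is only ever added to. The following is a hedged sketch of that pattern; the SessionRegistry class and its methods are hypothetical and are not taken from any particular application.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical registry that illustrates a parent keeping its children alive. */
final class SessionRegistry {
    // Every entry in this list stays reachable for as long as the registry is reachable.
    private final List<byte[]> sessions = new ArrayList<>();

    void open(byte[] sessionData) {
        sessions.add(sessionData);
    }

    /** If callers forget this step, finished sessions can never be garbage collected. */
    void close(byte[] sessionData) {
        sessions.remove(sessionData);
    }

    int liveSessions() {
        return sessions.size();
    }

    public static void main(String[] args) {
        SessionRegistry registry = new SessionRegistry();
        byte[] session = new byte[1024];
        registry.open(session);
        System.out.println("after open: " + registry.liveSessions());
        registry.close(session);
        System.out.println("after close: " + registry.liveSessions());
    }
}
```

In a heap dump, the session data would show up as children of the list. Following the incoming references upwards leads to the registry that is holding on to them.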

Analyzing a Heap Dump for Memory Leaks: Other Considerations

We need to bear in mind that the memory leak may be outside of heap memory. For example, if a bug causes a method that creates classes dynamically to be called in a loop, the JVM can run out of Metaspace and throw an OutOfMemoryError, even though the heap itself is healthy.
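Metaspace usage can be watched from inside the application through the standard memory pool MXBeans. The sketch below assumes a HotSpot JVM, where the pool is named "Metaspace"; the MetaspaceUsage class is a hypothetical helper, and other JVMs may name their pools differently.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

/** Hypothetical helper that reports how much Metaspace is currently in use. */
final class MetaspaceUsage {

    /** Returns the bytes used in the "Metaspace" pool, or -1 if no such pool is exposed. */
    static long usedBytes() {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            // The pool name is an assumption that holds on HotSpot-based JVMs.
            if (pool.getType() == MemoryType.NON_HEAP && "Metaspace".equals(pool.getName())) {
                MemoryUsage usage = pool.getUsage();
                return usage != null ? usage.getUsed() : -1;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println("Metaspace used: " + usedBytes() + " bytes");
    }
}
```

A value that keeps climbing while the heap stays flat suggests that classes are being loaded and never unloaded.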

The HeapHero report contains a class histogram, listing the number of instances of each class, as shown in the screenshot below.

Fig: Class Histogram

In this histogram, the class jdk.internal.ref.Cleaner has nearly four thousand instances, which could well be the symptom of a leak.
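The idea behind a class histogram is simply a count of instances grouped by class name. The following minimal sketch applies it to an ordinary list of objects; the ClassHistogram class is hypothetical, and a real histogram covers every object on the heap rather than a single collection.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper that counts objects per class, as a class histogram does. */
final class ClassHistogram {

    /** Returns a map from fully qualified class name to number of instances. */
    static Map<String, Integer> count(List<?> objects) {
        Map<String, Integer> histogram = new TreeMap<>();
        for (Object o : objects) {
            histogram.merge(o.getClass().getName(), 1, Integer::sum);
        }
        return histogram;
    }

    public static void main(String[] args) {
        List<Object> sample = Arrays.asList("a", "b", 1, 2.5, "c");
        count(sample).forEach((name, n) -> System.out.println(n + "\t" + name));
    }
}
```

For a live JVM, the JDK's own tooling can print a full histogram, for example with jcmd <pid> GC.class_histogram.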

Memory problems may not be caused by an actual leak at all, but simply by inefficient code. Duplicate strings, poorly-planned collections and boxed numbers can all waste memory and hurt performance. The HeapHero tool reports on any inefficiencies it detects. In the report below, the program had an excessive number of duplicate strings.

Fig: Duplicate Strings
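One common way to reduce duplicate strings is to route them through a canonicalizing map, so that equal values share a single instance. This is a sketch of the general technique, not a description of what HeapHero recommends for any specific program; the StringCanonicalizer class is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper that keeps one shared instance per distinct string value. */
final class StringCanonicalizer {
    private final Map<String, String> pool = new HashMap<>();

    /** Returns the first instance seen with this content, so later duplicates can be collected. */
    String canonical(String value) {
        String existing = pool.putIfAbsent(value, value);
        return existing != null ? existing : value;
    }

    int distinct() {
        return pool.size();
    }

    public static void main(String[] args) {
        StringCanonicalizer canonicalizer = new StringCanonicalizer();
        // new String(...) forces separate objects, as strings parsed from input would be.
        String first = canonicalizer.canonical(new String("GET"));
        String second = canonicalizer.canonical(new String("GET"));
        System.out.println("same instance? " + (first == second));
        System.out.println("distinct values: " + canonicalizer.distinct());
    }
}
```

The pool itself holds references, so it should only be used for values that repeat often and live as long as the pool does. Otherwise it becomes a leak of its own.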

Finally, the problem may simply be that the heap or the GC is incorrectly configured. If we’re satisfied that the program code is not the cause of the problem, we can consider increasing the heap size, or using a different GC algorithm.
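Before changing any settings, it helps to confirm what the JVM is actually running with. The sketch below reads the maximum heap size and the active collectors through standard management APIs; the JvmMemoryConfig class is a hypothetical helper, and the collector names it prints vary by JVM and version.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that reports the heap limit and collectors of the current JVM. */
final class JvmMemoryConfig {

    /** Maximum heap the JVM will attempt to use, in bytes. */
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    /** Names of the garbage collectors registered in this JVM. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapBytes() / (1024 * 1024) + " MB");
        System.out.println("Collectors: " + collectorNames());
    }
}
```

On HotSpot, the heap limit is typically set with -Xmx, and the collector is chosen with a flag such as -XX:+UseG1GC.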

Conclusion

When analyzing a heap dump for memory leaks, HeapHero offers so much more than just identifying leak suspects. We can interactively browse the object reference tree, see ML suggestions for improvement, investigate the class histogram and check for wasted memory.

Troubleshooting obscure memory leaks need no longer be a frustrating, time-consuming chore.