DEV Community

JillThornhill

Avoiding "Out of Memory" Errors: Strategies for Efficient Heap Dump Analysis

Most of us have seen this unwelcome exception in Java: java.lang.OutOfMemoryError. In testing, this is annoying. In production, it can be disastrous.

In live systems, we’re likely to see degraded performance followed by an OutOfMemoryError crash. The repercussions can include unhappy customers, overwhelmed support staff, loss of revenue, and shouting managers. This is not anyone’s idea of a happy day.
Can OutOfMemoryErrors be avoided? How?

This article answers those questions.

Too many administrators just increase the heap size haphazardly. Instead, we’ll look at how to use a memory analyzer throughout the project lifecycle to make informed decisions.

Why Shouldn’t We Just Increase Heap Size?

Here are a few good reasons why simply increasing the heap size is not a good idea.

  • If we have a memory leak, increasing the size will be at best a very temporary solution, since memory usage will eventually exceed the new limit.

  • Not all memory problems relate to the heap: OutOfMemoryError can also be thrown for Metaspace, thread stacks, or native memory, none of which the -Xmx heap setting controls.

  • Increasing the heap size may result in depriving other areas of the JVM, or even the device as a whole, of memory.

  • Several other factors, such as the choice of garbage collection algorithm, could be the root cause of the problem. Increasing the heap size would not solve these problems.
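The first point is easy to see in code. Below is a minimal sketch (the class and sizes are invented for illustration) of the classic leak pattern: a static cache that is only ever added to. No -Xmx value survives this indefinitely.

```java
import java.util.ArrayList;
import java.util.List;

// A classic leak: entries go into a static cache and are never evicted,
// so every "request" permanently grows the heap. A bigger -Xmx only
// postpones the eventual OutOfMemoryError.
public class LeakyCache {
    static final List<byte[]> CACHE = new ArrayList<>();

    static void handleRequest() {
        CACHE.add(new byte[64 * 1024]);  // 64 KB retained forever per request
    }

    public static void main(String[] args) {
        for (int i = 0; i < 100; i++) handleRequest();
        System.out.println("retained entries: " + CACHE.size());
    }
}
```

In a heap dump analyzer, this shows up as a single object (the ArrayList) dominating the retained-size view, which is exactly the signal we look for next.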

So, what’s the solution?

Firstly, if the heap runs out of memory, we need to use a heap dump analyzer such as HeapHero or Eclipse MAT to examine the heap and discover the cause of the problem. (Running the JVM with -XX:+HeapDumpOnOutOfMemoryError ensures a dump is captured automatically at the moment of failure.) Only then can we figure out how to solve the real problem and prevent it from recurring.

But more importantly, we need to proactively prevent memory problems by making performance analysis an integral part of each phase of the software lifecycle.

Using Memory Analyzers Throughout the Project Lifecycle

Don’t wait until it breaks.

Let’s see how we can use memory analyzers and other diagnostic tools to make sure we never encounter another java.lang.OutOfMemoryError in production.

1. The Planning Stage

Innovative ideas are great. However, all great inventors build prototypes, because it’s seldom possible to predict how things will actually work in real life.

It’s good to get into the habit of building a small trial program to see how much memory we might need to implement our designs. This is especially important if we’re designing for small devices, or for containers. Also, if we’re designing for the cloud, memory usage has a direct impact on monthly running costs.

This is where tools like HeapHero come in. While the program is running, we can capture a heap dump using standard JDK tools such as jmap or jcmd.

We can then explore the dump using a heap dump analyzer such as HeapHero to see exactly which objects are consuming memory.
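If we prefer to trigger the dump from inside the application rather than from the command line, HotSpot JVMs also expose a diagnostic MXBean for it. A minimal sketch follows; the file name is arbitrary, and note that recent JDKs require the ".hprof" extension and refuse to overwrite an existing file:

```java
import java.io.File;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import com.sun.management.HotSpotDiagnosticMXBean;

public class HeapDumper {
    // Write a heap dump to the given path; returns false if it could not
    // be written. HotSpot-specific API: the path must end in ".hprof" on
    // recent JDKs, and the file must not already exist.
    static boolean dump(String path, boolean liveObjectsOnly) {
        try {
            HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            bean.dumpHeap(path, liveObjectsOnly);
            return true;
        } catch (IOException | RuntimeException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String path = "app-heap.hprof";
        new File(path).delete();  // dumpHeap refuses to overwrite
        System.out.println("dump written: " + dump(path, true));
    }
}
```

Passing true for liveObjectsOnly forces a full GC first, so the dump contains only reachable objects.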

This takes the guesswork out of design, and lets us evaluate different strategies and make trade-offs as needed. It also gives us a good idea of how much memory the final solution may require, so we can accurately specify the hardware and carry out a cost-benefit analysis.

2. The Development Stage

At the development stage, we can use a memory analyzer in many ways.

  • Comparing different coding techniques for efficiency.

  • Checking for memory wastage.

  • Confirming that actual memory requirements match the original hardware specifications.

  • Providing memory usage statistics in order to accurately configure the heap size at the testing stage.
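For the first of these, comparing coding techniques, even a rough Runtime-based harness gives a useful first impression before reaching for a full analyzer. The sketch below is illustrative (Runtime figures are approximate; a heap dump gives exact retained sizes) and compares a primitive array with its boxed equivalent:

```java
// Rough comparison harness. Runtime figures are approximate: serious
// analysis should use a heap dump, but this is enough to rank techniques.
public class MemoryCompare {
    static Object sink;  // keeps measured allocations reachable

    static long usedMemory() {
        Runtime rt = Runtime.getRuntime();
        System.gc();  // best-effort request, not a guarantee
        return rt.totalMemory() - rt.freeMemory();
    }

    // Approximate extra heap consumed by the given allocation.
    static long measure(Runnable allocation) {
        long before = usedMemory();
        allocation.run();
        return usedMemory() - before;
    }

    public static void main(String[] args) {
        final int n = 1_000_000;
        long primCost = measure(() -> sink = new long[n]);  // ~8 MB of packed data
        sink = null;
        long boxedCost = measure(() -> {
            Long[] a = new Long[n];
            for (int i = 0; i < n; i++) a[i] = (long) (i + 1000);  // beyond the Long cache
            sink = a;
        });
        System.out.printf("long[]: ~%d KB, Long[]: ~%d KB%n",
                primCost / 1024, boxedCost / 1024);
        sink = null;
    }
}
```

The boxed version typically costs several times more, because each Long is a separate heap object with its own header, reached through a reference.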

3. The Testing Stage

At each phase of testing, we should use a memory analyzer to confirm that memory usage is reasonable and that we have no memory leaks, and to gather more accurate memory usage statistics.

When testing the system as a whole, particularly in performance labs and stress tests, we should also monitor garbage collection (GC) behavior.

Efficient GC is critical for preventing memory problems. We can use a GC log analysis tool such as GCeasy to make sure the GC is doing its job, and that memory is staying well within configuration limits.

We can also recognize memory leaks and other potential memory issues by analyzing GC patterns. For example, the image below compares the pattern of a healthy application to one with a memory leak. In the memory leak pattern, although GC is clearing some memory on each cycle, it never manages to fully clear it back to the same level. If the memory leak is not resolved, usage will keep growing until we run out of heap memory.

Fig: Memory Usage Graphs from GCeasy: Healthy Application vs Memory Leak
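The rising-floor signature in that graph can be expressed as a simple heuristic. The sketch below is purely illustrative (real tools such as GCeasy apply far more sophisticated analysis); it flags a series of post-GC heap lows that rises on almost every cycle:

```java
// Leak heuristic: in a healthy app, heap usage after each full GC returns
// to a similar floor; in a leaking app, that floor keeps rising.
public class LeakPattern {
    // True if the post-GC heap lows (in MB) rise on nearly every cycle.
    static boolean looksLikeLeak(double[] postGcLowsMb) {
        int rising = 0;
        for (int i = 1; i < postGcLowsMb.length; i++) {
            if (postGcLowsMb[i] > postGcLowsMb[i - 1]) rising++;
        }
        return rising >= postGcLowsMb.length - 2;
    }

    public static void main(String[] args) {
        double[] healthy = {210, 205, 212, 208, 211, 207};  // stable floor
        double[] leaking = {210, 240, 275, 320, 380, 450};  // rising floor
        System.out.println("healthy flagged: " + looksLikeLeak(healthy));
        System.out.println("leaking flagged: " + looksLikeLeak(leaking));
    }
}
```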

Before we begin performance lab testing, we should think about automating some aspects of memory monitoring, and setting alerts if any problems are detected. There are two ways we can do this:

  • Both HeapHero and GCeasy offer REST APIs that can be included in testing scripts to automatically analyze the heap dump and the GC logs. They return key information that can be used to flag any issues found.

  • We can use a monitoring and analysis system such as yCrash. This tool samples the JVM regularly to analyze micrometrics that predict any developing performance issues.

By including performance analysis in our testing, we can identify and fix any bottlenecks, and also provide detailed statistics that can be used for accurate configuration in the live system.

4. Preparing to Go Live

At this stage, we should use memory statistics produced during testing to accurately configure the heap.

We should also set in place monitoring procedures for the live system, so that we will be alerted if performance indicators begin to drop.

5. In Production

Monitoring heap usage and GC logs regularly can prevent problems from escalating to the point where performance is affected. System usage is likely to grow over time; if we see memory usage increasing, we can adjust the JVM configuration to cater for growth.

GC logging should always be enabled in production, as it adds very little overhead (on modern JVMs, -Xlog:gc*:file=gc.log). We can then periodically submit the logs to a GC log analyzer.

We could set up heap dump capture and analysis as a cron job that runs periodically. This shouldn’t run too often, because dumping the heap pauses the application and can be slow on large heaps, so it should be scheduled off-peak.

Alternatively, we can set up the yCrash monitor to continuously sample performance, and raise alerts as needed.

6. In CI/CD Pipelines

Incorporating the HeapHero and GCeasy REST APIs into the build pipeline is an effective way to prevent new versions from introducing memory issues.

We can use them to automatically compare performance indicators with previous builds, and fail the build if we detect unacceptable changes.

The indicators we can monitor include:

  • Object creation rate;

  • Heap size;

  • Class count;

  • Object count;

  • Throughput;

  • Latency;

  • Memory wastage percentage.
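As a sketch, such a gate can compare each indicator against the previous build’s baseline and fail on excessive growth. The metric names, numbers, and threshold below are invented for illustration and are not actual HeapHero or GCeasy API fields:

```java
import java.util.Map;

// Hypothetical regression gate for a CI/CD pipeline.
public class MemoryGate {
    // Fail if any indicator grew by more than the allowed percentage
    // relative to the previous build's baseline.
    static boolean passes(Map<String, Double> baseline,
                          Map<String, Double> current,
                          double maxIncreasePct) {
        for (Map.Entry<String, Double> e : baseline.entrySet()) {
            double base = e.getValue();
            double now = current.getOrDefault(e.getKey(), base);
            if (base > 0 && (now - base) / base * 100.0 > maxIncreasePct) {
                System.out.printf("Regression in %s: %.1f -> %.1f%n",
                        e.getKey(), base, now);
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, Double> baseline = Map.of("heapSizeMb", 512.0, "objectCreationRateMbPerSec", 40.0);
        Map<String, Double> current  = Map.of("heapSizeMb", 640.0, "objectCreationRateMbPerSec", 41.0);
        System.out.println(passes(baseline, current, 10.0) ? "build passes" : "build fails");
    }
}
```

In a real pipeline, a false result would translate into a non-zero exit code so the CI server marks the build as failed.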

Conclusion

We’ve looked at incorporating memory analyzers in all stages of the project lifecycle.

Using this strategy, we can either increase the heap size or optimize the code long before issues result in an OutOfMemoryError. Even if we haven’t done this with existing systems, it’s never too late to start. Including memory monitoring, both in production and in future development cycles, saves a lot of heartache.

Prevention is definitely better than cure.
