After running a Rails application in production for a while, you may encounter a phenomenon where memory usage grows unexpectedly large. In July 2025, I investigated the causes of this behavior and explored countermeasures. Here is a summary of my findings.
- The reason a Rails app "appears to keep consuming memory" is that glibc, which Ruby relies on, retains freed memory internally for future reuse rather than returning it to the OS. This is distinct from a typical memory leak.
- Starting with Ruby 3.3.0, you can optimize the heap by calling `Process.warmup` after the Rails application has finished booting. However, since this mechanism is intended to be executed at the completion of the application boot sequence, it is not easily applicable to reducing memory in long-running Rails applications.
- As a countermeasure that requires zero changes to product code, setting the environment variable `MALLOC_ARENA_MAX=2` remains effective. This prevents glibc from creating numerous arenas (memory pools) one after another, forcing memory reuse within existing pools and thereby preventing glibc from hoarding too much freed memory.
- Switching to the memory allocator jemalloc, which was previously recommended as an effective countermeasure, should now be avoided because jemalloc is no longer being maintained.
Why Does Memory Usage in Rails Apps Appear to Keep Growing?
Hongli Lai's article "What causes Ruby memory bloat?" covers this in detail.
According to that article, the reasons memory bloat "appears to occur" are as follows:
- The previously cited explanation of "heap page fragmentation on the Ruby side" was not, in fact, a major contributor to increased memory usage.
- The true cause was that "glibc's memory allocator, malloc, retains memory that Ruby has freed instead of returning it to the OS, holding onto it for future use." In particular, free pages that are not at the end of the heap are not returned to the OS, so unused memory continues to accumulate internally. From the OS's perspective, this makes it look like "Ruby keeps consuming memory."
- Calling glibc's `malloc_trim(0)` ensures that memory freed by Ruby is returned to the OS, effectively reducing the process's memory usage (RSS) as seen by the OS.
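To see this effect for yourself, `malloc_trim` can be invoked directly from Ruby through the standard library's Fiddle. This is a minimal sketch, assuming a Linux/glibc environment; on other platforms or allocators the `malloc_trim` symbol may not exist and the lookup will raise.

```ruby
require "fiddle"

# Sketch: call glibc's malloc_trim(0) from Ruby (Linux/glibc only).
malloc_trim = Fiddle::Function.new(
  Fiddle::Handle::DEFAULT["malloc_trim"], # raises Fiddle::DLError if absent
  [Fiddle::TYPE_SIZE_T],                  # pad argument; 0 = trim as much as possible
  Fiddle::TYPE_INT
)

# glibc returns 1 if some memory was released back to the OS, 0 otherwise.
released = malloc_trim.call(0)
puts "malloc_trim(0) => #{released}"
```

This is a diagnostic experiment, not something to sprinkle through production code; the caveats below explain why.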
However, there is an extremely important caveat here:
- Memory that Ruby has allocated and freed during processing is usually fragmented. Calling `malloc_trim(0)` does not resolve the fragmentation; it merely returns the fragmented regions to the OS as-is.
- Even though the memory is fragmented, it is still returned to the OS, so Ruby's memory usage (RSS) as seen by the OS does decrease. However, because other programs cannot allocate contiguous regions from fragmented free memory, an OOM (Out of Memory) error can occur even when there appears to be free memory available.
- Since returning fragmented memory to the OS does not make it easy to reuse effectively, malloc is designed to retain allocated but unused memory internally and reuse it, enabling stable allocation.
This is one of the reasons "Ruby has freed the memory, but malloc does not readily return it to the OS."1
What Is the Memory Bloat Countermeasure Code Introduced in Ruby 3.3.0?
Ruby 3.3.0 introduced the Process.warmup method. This method is intended to signal to the Ruby virtual machine from an application server or similar that "the application's startup sequence has completed, making this an optimal time to perform GC and memory optimization."2
When Process.warmup is called, the Ruby virtual machine performs the following optimization operations:
- Forces a major GC
- Compacts the heap
- Promotes all surviving objects to the old generation
- Pre-computes string coderanges (to speed up future string operations)
This cleans up objects and caches that were generated during application startup but are no longer needed, improving memory sharing efficiency in Copy-on-Write (CoW) environments.
Furthermore, since unnecessary objects have been garbage collected and the heap has been compacted, there is a high likelihood that fragmentation in the heap allocated by malloc has been reduced. This makes it an ideal time to call malloc_trim(0), and a patch that calls malloc_trim(0) internally within Process.warmup has been merged.
> Process.warmup: invoke `malloc_trim` if available (#8451)
>
> Similar to releasing free GC pages, releasing free malloc pages reduce the amount of page faults post fork.
>
> NB: Some popular allocators such as jemalloc don't implement it, so it's a noop for them.
An important point is that Process.warmup is not automatically called behind the scenes like GC. It is the kind of method that should be explicitly called at an appropriate time on the application server side when a major GC would be acceptable (e.g., before forking, before worker startup). Therefore, there may not always be an appropriate time to call it in long-running Rails applications.
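For a forking application server, the natural place to call it is just before workers are forked. Here is a hypothetical Puma cluster-mode sketch (the worker count is illustrative, and the `respond_to?` guard keeps it a no-op on Ruby earlier than 3.3):

```ruby
# config/puma.rb (illustrative sketch, not a recommended production config)
workers 2
preload_app!

before_fork do
  # End-of-boot signal: major GC, heap compaction, promotion of survivors
  # to the old generation, and (where available) malloc_trim(0).
  Process.warmup if Process.respond_to?(:warmup)
end
```

Because `preload_app!` loads the application before forking, the compacted, trimmed heap is what gets shared with workers via Copy-on-Write.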
Countermeasures for Memory Bloat in Long-Running Rails Apps
So how can you suppress memory bloat without using Process.warmup or malloc_trim(0)?
Online resources have recommended using jemalloc, a smarter memory allocator. However, jemalloc's repository was archived in June 2025, and ongoing maintenance cannot be expected. It is best to avoid adopting it for new projects.
As an alternative, setting the environment variable MALLOC_ARENA_MAX=2 remains effective. Here's why:
- It reduces the number of arenas (memory management regions) that glibc allocates. glibc's malloc allocates numerous arenas as needed to prevent contention when multiple threads request memory simultaneously (normally, on 64-bit systems, the upper limit is 8 times the number of vCPU cores on the machine).
- As described above, glibc's memory allocator is reluctant to return memory to the OS. Therefore, the more arenas there are, the more "unreturned free memory" accumulates internally.
- By limiting the number of arenas, you can reduce the total amount of memory that glibc does not return to the OS (at the cost of slightly increased contention for memory allocation among threads).
According to the following articles, setting MALLOC_ARENA_MAX=2 significantly reduces memory usage in exchange for a slight degradation in response time of a few percent.
https://www.speedshop.co/2017/12/04/malloc-doubles-ruby-memory.html
https://techracho.bpsinc.jp/hachi8833/2022_06_23/50109
Additionally, MALLOC_ARENA_MAX=2 is the default setting on Heroku, which suggests it is a relatively safe configuration.
https://devcenter.heroku.com/changelog-items/1683
Furthermore, the following reasoning supports why a value of 2 for MALLOC_ARENA_MAX is sufficient:
- Ruby has a GVL (Global VM Lock), which means only one thread can execute Ruby code at any given time. Therefore, even if the application spawns many threads, the number of threads actively running and allocating memory at any point is expected to be very small. Consequently, glibc does not need to maintain many arenas; a small number (around 2) sufficient to handle requests from active threads should be adequate. For this reason, setting `MALLOC_ARENA_MAX` to `2` causes virtually no operational issues in practice, while effectively minimizing the total amount of free memory hoarded across multiple arenas.
If you want to be thorough, you should measure memory usage and response time with each setting—unset, 2, 3, and 4—and determine the optimal value.
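A low-effort way to compare candidate values is to log the process's resident set size under load for each run. This is a Linux-only sketch (it parses `/proc/self/status`, which is an assumption about your platform); note that `MALLOC_ARENA_MAX` must be set before the Ruby process starts, e.g. `MALLOC_ARENA_MAX=2 bundle exec rails s`.

```ruby
# Sketch: report current RSS so runs started with different
# MALLOC_ARENA_MAX values can be compared. Linux-only.
def rss_kb
  File.read("/proc/self/status")[/^VmRSS:\s*(\d+)\s*kB/, 1].to_i
end

puts "MALLOC_ARENA_MAX=#{ENV.fetch('MALLOC_ARENA_MAX', '(unset)')} RSS=#{rss_kb} kB"
```

Pair this with your usual response-time metrics so you can see both sides of the trade-off for each setting.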
1. A proposal was made to "call `malloc_trim(0)` when a full GC is performed in Ruby to return memory to the OS," but it was not implemented because returning fragmented memory to the OS provides little benefit since the OS cannot effectively utilize it. Feature #15667: Introduce malloc_trim(0) in full gc cycles - Ruby - Ruby Issue Tracking System ↩
2. The background behind the introduction of `Process.warmup` is explained in Feature #18885: End of boot advisory API for RubyVM - Ruby - Ruby Issue Tracking System ↩