DEV Community

jvmind

Virtual Thread OOM – A Case Study in Missing Backpressure

TL;DR: A missing Semaphore in a virtual thread stress test led to a 1,076 MB/s allocation rate, 324 Full GCs that each released 0 bytes, and 150k new virtual threads in 1 second. Virtual threads are powerful, but submit() is non-blocking – you must manage the creation rate yourself.

The Incident

A stress test using JDK 26 virtual threads (Loom) ran for ~134 seconds before OOM. The application was using Executors.newVirtualThreadPerTaskExecutor() to submit short-lived tasks.

GC Report – What the Logs Showed

| Metric | Value | Severity |
|---|---|---|
| Heap Size | 2 GB (G1) | |
| Allocation Rate | 1,076 MB/s | 2 GB heap filled in <2 seconds |
| Total GC Events | 607 | ~4.5 GCs per second |
| Full GC Count | 324 (53%) | More Full GCs than Young GCs |
| Full GC Total Pause | 118,114 ms | 88% of runtime spent in Full GC |
| Application Throughput | ~10% | JVM barely ran the app |
| Top Full GC Releases | 2047MB → 2047MB | 0 bytes released |

The "0 bytes released" pattern across 324 Full GCs is the critical signal: everything in the heap was alive.
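
The derived figures in the report follow from the raw numbers. Here is a minimal sketch of that arithmetic, assuming the ~134 s runtime from the incident description and 2 GB = 2,048 MB. The class and method names are illustrative and not part of any tool.

```java
/** Recomputes the derived columns of the GC report from its raw values. */
final class GcReportMath {
    static double gcEventsPerSecond(int events, double runtimeSec) {
        return events / runtimeSec;
    }

    static double fullGcSharePercent(int fullGcs, int totalGcs) {
        return 100.0 * fullGcs / totalGcs;
    }

    static double pauseSharePercent(long pauseMs, double runtimeSec) {
        return 100.0 * pauseMs / (runtimeSec * 1000.0);
    }

    static double secondsToFillHeap(double heapMb, double allocMbPerSec) {
        return heapMb / allocMbPerSec;
    }

    public static void main(String[] args) {
        double runtimeSec = 134.0; // assumption: the "~134 seconds" quoted above
        System.out.printf("GC events/sec:      %.1f%n", gcEventsPerSecond(607, runtimeSec));
        System.out.printf("Full GC share:      %.0f%%%n", fullGcSharePercent(324, 607));
        System.out.printf("Time in Full GC:    %.0f%%%n", pauseSharePercent(118_114, runtimeSec));
        System.out.printf("Heap fill time (s): %.1f%n", secondsToFillHeap(2048, 1076));
    }
}
```

Running the same arithmetic against your own GC log is a quick way to check whether a report's "severity" column is internally consistent.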

Thread Dump Analysis

Two jstack dumps taken 1 second apart:

  • jstack-1: Virtual thread #3342482
  • jstack-2: Virtual thread #3492195
  • +150,000 virtual thread IDs in 1 second
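
The creation rate can be read straight off the two dumps. The sketch below assumes a dump line containing "virtual thread #<id>" (as in the carrier stack in this post; real jstack output may be formatted differently) and assumes thread IDs are handed out sequentially, so the ID delta approximates the number of threads created between dumps. All names are illustrative.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Estimates the thread creation rate from virtual thread IDs in two dumps. */
final class VirtualThreadIdDelta {
    // Assumed line shape: "... virtual thread #3342482 ..."
    private static final Pattern VTHREAD_ID =
            Pattern.compile("virtual thread #(\\d+)", Pattern.CASE_INSENSITIVE);

    static long parseId(String dumpLine) {
        Matcher m = VTHREAD_ID.matcher(dumpLine);
        if (!m.find()) {
            throw new IllegalArgumentException("no virtual thread id in: " + dumpLine);
        }
        return Long.parseLong(m.group(1));
    }

    /** IDs consumed per second between two dump lines taken secondsApart apart. */
    static double idsPerSecond(String earlier, String later, double secondsApart) {
        return (parseId(later) - parseId(earlier)) / secondsApart;
    }

    public static void main(String[] args) {
        double rate = idsPerSecond(
                "Carrying virtual thread #3342482",
                "Carrying virtual thread #3492195",
                1.0);
        System.out.printf("~%.0f new thread IDs per second%n", rate);
    }
}
```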

Carrier thread stack:

ForkJoinPool-1-worker-1 (daemon)
  → Carrying virtual thread #3342482 / #3492195
     at ConcurrentHashMap.sumCount / isEmpty
     at ThreadPerTaskExecutor.tryTerminate / taskComplete
     at VirtualThread.run

Thread state distribution: 64 total platform threads, 49 carrying virtual threads, 0 BLOCKED, 0 deadlocks – consistent with a virtual thread explosion.

The Offending Code

private void submitTasks(int threadCount, int workMs) {
    try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
        while (running) {  // ← no sleep, no rate limiting
            for (int i = 0; i < threadCount; i++) {  // ← 10,000 per round
                executor.submit(() -> {  // ← non-blocking, returns instantly
                    Thread.sleep(workMs);  // ← each thread lives ≥ workMs (10 ms in this test)
                    // do work
                    return null;  // makes the lambda a Callable, so the checked InterruptedException compiles
                });
            }
        }
    }
}

The cascade:

  1. ~100-200 loop iterations/sec × 10,000 threads = ~1-2 million submit() calls/sec
  2. Each virtual thread carries Continuation stack + ThreadLocal + task object
  3. Each thread lives at least 10ms → unbounded accumulation
  4. 2GB heap fills in <2 seconds
  5. Young GCs (205) release ~8MB avg → can't keep up
  6. G1 falls back to Full GC → 324 Full GCs, all release 0 bytes
  7. JVM spends 88% of time in GC → OOM
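
Step 1 of the cascade hinges on submit() returning before the task has done any work. The sketch below (JDK 21+, illustrative names) makes that visible: every task waits on a gate, yet the submit loop still runs to completion, leaving all submitted tasks outstanding at once.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

/** Shows that submit() does not wait for earlier tasks to finish. */
final class SubmitIsNonBlocking {
    /** Submits gated tasks; returns how many were unfinished when the loop ended. */
    static int outstandingAfterSubmitLoop(int tasks) {
        CountDownLatch gate = new CountDownLatch(1);
        AtomicInteger completed = new AtomicInteger();
        int outstanding;
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < tasks; i++) {
                executor.submit(() -> {
                    gate.await();               // no task can finish until the gate opens
                    completed.incrementAndGet();
                    return null;
                });
            }
            // The loop got here without ever blocking on a task.
            outstanding = tasks - completed.get();
            gate.countDown();                   // let the tasks finish
        }                                       // close() waits for all tasks
        return outstanding;
    }

    public static void main(String[] args) {
        System.out.println("outstanding after loop: " + outstandingAfterSubmitLoop(10_000));
    }
}
```

In the stress test nothing plays the role of the gate opening in time: the producer keeps looping, so the outstanding count only grows.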

Why Full GC Released 0 Bytes

Full GC traces:

[Full GC (G1 Evacuation Pause) 2047M->2047M, 0.362s]
[Full GC (G1 Evacuation Pause) 2047M->2047M, 0.371s]
[Full GC (G1 Evacuation Pause) 2047M->2047M, 0.358s]

All virtual threads and Continuations were alive – referenced by ForkJoinPool and the ThreadPerTaskExecutor. With all objects reachable from GC roots, the collector had nothing to reclaim.

The Fix – Adding Backpressure

Option 1: Semaphore (Recommended)

Semaphore sem = new Semaphore(maxPending);
while (running) {
    for (int i = 0; i < threadCount; i++) {
        sem.acquire();  // blocks when limit exceeded
        executor.submit(() -> {
            try { /* do work */ } finally { sem.release(); }
        });
    }
}
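
For a version you can run and measure, here is a self-contained sketch of the same idea (JDK 21+). The names totalTasks, maxPending, and workMs are illustrative parameters, and the in-flight counter exists only to demonstrate that the bound holds.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

/** Semaphore-based backpressure for a virtual-thread-per-task executor. */
final class BoundedSubmitter {
    /** Runs totalTasks tasks with at most maxPending in flight; returns the observed peak. */
    static int runBounded(int totalTasks, int maxPending, long workMs) {
        Semaphore permits = new Semaphore(maxPending);
        AtomicInteger inFlight = new AtomicInteger();
        int peak = 0;
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < totalTasks; i++) {
                permits.acquire();                    // the submitter blocks here when full
                peak = Math.max(peak, inFlight.incrementAndGet());
                try {
                    executor.submit(() -> {
                        try {
                            Thread.sleep(workMs);     // stand-in for real work
                            return null;
                        } finally {
                            inFlight.decrementAndGet();
                            permits.release();        // always hand the permit back
                        }
                    });
                } catch (RuntimeException e) {        // submit failed: undo the bookkeeping
                    inFlight.decrementAndGet();
                    permits.release();
                    throw e;
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a permit", e);
        }
        return peak;
    }

    public static void main(String[] args) {
        System.out.println("peak in flight: " + runBounded(1_000, 50, 5));
    }
}
```

The permit is acquired on the submitting thread and released inside the task, so the number of live virtual threads can never exceed maxPending regardless of how fast the producer loops.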

Option 2: Add Sleep in the Loop

while (running) {
    for (int i = 0; i < threadCount; i++) {
        executor.submit(...);
    }
    Thread.sleep(workMs);  // rate limit submissions
}

Option 3: Bounded Queue + CallerRunsPolicy

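int maxWorkers = 1_000;  // illustrative cap; tune for your workload
BlockingQueue<Runnable> queue = new ArrayBlockingQueue<>(10000);
ThreadPoolExecutor executor = new ThreadPoolExecutor(
    maxWorkers, maxWorkers,  // must be bounded: with Integer.MAX_VALUE the pool adds threads instead of rejecting
    60, TimeUnit.SECONDS,
    queue,
    Thread.ofVirtual().factory(),
    new ThreadPoolExecutor.CallerRunsPolicy()  // rejected tasks run on the submitter, which throttles it
);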
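
CallerRunsPolicy only kicks in when a task is rejected, and ThreadPoolExecutor rejects only when the queue is full and the pool is already at its maximum size. The sketch below (JDK 21+) uses deliberately tiny, illustrative limits so the effect is easy to observe: some tasks end up running on the submitting thread, which is what slows the producer down.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Demonstrates CallerRunsPolicy throttling with a bounded pool and a bounded queue. */
final class CallerRunsDemo {
    /** Returns {tasks that ran on the submitting thread, tasks completed in total}. */
    static int[] run(int totalTasks, int maxWorkers, int queueCapacity, long workMs) {
        Thread submitter = Thread.currentThread();
        AtomicInteger ranOnCaller = new AtomicInteger();
        AtomicInteger completed = new AtomicInteger();
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
                maxWorkers, maxWorkers, 60, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(queueCapacity),
                Thread.ofVirtual().factory(),
                new ThreadPoolExecutor.CallerRunsPolicy());
        try {
            for (int i = 0; i < totalTasks; i++) {
                executor.execute(() -> {
                    if (Thread.currentThread() == submitter) {
                        ranOnCaller.incrementAndGet();   // rejected task, run by the producer
                    }
                    try {
                        Thread.sleep(workMs);            // stand-in for real work
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                    completed.incrementAndGet();
                });
            }
        } finally {
            executor.shutdown();
            try {
                executor.awaitTermination(30, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return new int[] {ranOnCaller.get(), completed.get()};
    }

    public static void main(String[] args) {
        int[] result = run(20, 2, 2, 100);
        System.out.println("ran on caller: " + result[0] + ", completed: " + result[1]);
    }
}
```

One design caveat: this option pools virtual threads, which trades away the thread-per-task model; the Semaphore approach keeps that model and bounds concurrency directly.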

Key Lessons

  1. Virtual threads are not fire-and-forget – submit() is non-blocking by design, which means you must manage the creation rate
  2. Full GC releasing 0 bytes is a strong diagnostic signal – it means the live set fills the heap, which points to a leak, unbounded thread or task accumulation, or an undersized heap
  3. The virtual thread IDs in jstack don't lie – 150k new IDs in 1 second is a clear warning
  4. Backpressure is not optional – without it, any fast producer can overwhelm the JVM

Data Mapping – GC + Threads + Code

| GC/Thread Symptom | Root Cause in Code |
|---|---|
| Allocation rate 1,076 MB/s | while loop with no sleep + non-blocking submit() |
| 324 Full GCs, 0 bytes released | All virtual threads alive and reachable from GC roots |
| 150k new threads in 1 second | ~1M submit() calls per second |
| Throughput ~10% | 88% of time in Full GC |

Tool Note

Both analyses were performed using a JVM analysis tool I'm building – it parses GC logs, correlates with thread dumps, and extracts root cause patterns. The tool helped identify these issues in minutes rather than hours.
