TL;DR: A missing Semaphore in a virtual thread stress test led to 1,076 MB/s allocation rate, 324 Full GCs releasing 0 bytes, and 150k new virtual threads in 1 second. Virtual threads are powerful, but submit() is non-blocking – you must manage the creation rate.
The Incident
A stress test using JDK 26 virtual threads (Loom) ran for ~134 seconds before OOM. The application was using Executors.newVirtualThreadPerTaskExecutor() to submit short-lived tasks.
GC Report – What the Logs Showed
| Metric | Value | Severity |
|---|---|---|
| Heap Size | 2 GB (G1) | – |
| Allocation Rate | 1,076 MB/s | 2GB heap filled in <2 seconds |
| Total GC Events | 607 | ~4.5 GCs per second |
| Full GC Count | 324 (53%) | More Full GCs than Young GCs |
| Full GC Total Pause | 118,114 ms | 88% of runtime spent in Full GC |
| Application Throughput | ~10% | JVM barely ran the app |
| Top Full GC Releases | 2047MB → 2047MB | 0 bytes released |
The "0 bytes released" pattern across 324 Full GCs is the critical signal: everything in the heap was alive.
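The severity column can be re-derived from the raw numbers, which is a useful sanity check on any GC report. A minimal sketch, assuming the ~134 second runtime from the incident description and treating the 2 GB heap as 2048 MB; the class and method names are made up for illustration:

```java
// Sketch: re-derive the "Severity" column from the raw values in the table.
final class GcReportMath {
    /** Seconds needed to fill the heap at a steady allocation rate. */
    static double secondsToFill(double heapMb, double allocMbPerSec) {
        return heapMb / allocMbPerSec;
    }

    /** Fraction of wall-clock time spent paused, in the range 0..1. */
    static double pauseFraction(long pauseMs, double runtimeSec) {
        return pauseMs / (runtimeSec * 1000.0);
    }

    public static void main(String[] args) {
        double runtimeSec = 134.0; // approximate runtime before OOM (assumption: taken as exact)
        System.out.printf("heap fill time:  %.1f s%n", secondsToFill(2048, 1076));
        System.out.printf("GC events/sec:   %.1f%n", 607 / runtimeSec);
        System.out.printf("Full GC share:   %.0f%%%n", 100.0 * 324 / 607);
        System.out.printf("time in Full GC: %.0f%%%n", 100 * pauseFraction(118_114, runtimeSec));
    }
}
```

If the derived values disagree with the report, the log window and the reported runtime probably do not cover the same interval.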
Thread Dump Analysis
Two jstack dumps taken 1 second apart:
- jstack-1: Virtual thread #3342482
- jstack-2: Virtual thread #3492195
- +150,000 virtual thread IDs in 1 second
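The rate estimate is just the ID difference divided by the time between dumps. A small sketch with a hypothetical helper name; as far as I know thread IDs come from one JVM-wide counter shared with platform threads, and the two IDs are simply whichever virtual threads happened to be mounted, so treat the result as a rough upper estimate:

```java
// Sketch: estimate thread creation rate from IDs seen in two thread dumps.
final class ThreadIdDelta {
    /** Approximate number of threads created per second between two dumps. */
    static double creationRate(long firstId, long secondId, double secondsApart) {
        return (secondId - firstId) / secondsApart;
    }
}
```

For the two dumps above, `ThreadIdDelta.creationRate(3_342_482L, 3_492_195L, 1.0)` gives 149,713 threads per second, which is where the rounded 150k figure comes from.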
Carrier thread stack:
ForkJoinPool-1-worker-1 (daemon)
→ Carrying virtual thread #3342482 / #3492195
at ConcurrentHashMap.sumCount / isEmpty
at ThreadPerTaskExecutor.tryTerminate / taskComplete
at VirtualThread.run
Thread state distribution: 64 total platform threads, 49 carrying virtual threads, 0 BLOCKED, 0 deadlocks – consistent with a virtual thread explosion.
The Offending Code
private void submitTasks(int threadCount, int workMs) {
    try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
        while (running) {                               // ← no sleep, no rate limiting
            for (int i = 0; i < threadCount; i++) {     // ← 10,000 per round
                executor.submit(() -> {                 // ← non-blocking, returns instantly
                    try {
                        Thread.sleep(workMs);           // ← each thread lives ≥10ms
                        // do work
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
        }
    }
}
The cascade:
- ~100-200 loop iterations/sec × 10,000 threads = ~1-2 million submit() calls/sec
- Each virtual thread carries Continuation stack + ThreadLocal + task object
- Each thread lives at least 10ms → unbounded accumulation
- 2GB heap fills in <2 seconds
- Young GCs (205) release ~8MB avg → can't keep up
- G1 falls back to Full GC → 324 Full GCs, all release 0 bytes
- JVM spends 88% of time in GC → OOM
Why Full GC Released 0 Bytes
Full GC traces:
[Full GC (G1 Evacuation Pause) 2047M->2047M, 0.362s]
[Full GC (G1 Evacuation Pause) 2047M->2047M, 0.371s]
[Full GC (G1 Evacuation Pause) 2047M->2047M, 0.358s]
All virtual threads and Continuations were alive – referenced by ForkJoinPool and the ThreadPerTaskExecutor. With all objects reachable from GC roots, the collector had nothing to reclaim.
The Fix – Adding Backpressure
Option 1: Semaphore (Recommended)
Semaphore sem = new Semaphore(maxPending);
while (running) {
    for (int i = 0; i < threadCount; i++) {
        sem.acquire(); // blocks once maxPending tasks are in flight
        executor.submit(() -> {
            try { /* do work */ } finally { sem.release(); }
        });
    }
}
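To check that the semaphore really caps concurrency, here is a self-contained sketch that records the peak number of tasks in flight. The class name, the counters, and the use of `acquireUninterruptibly()` (chosen only to keep the method free of checked exceptions) are my assumptions, not part of the original stress test:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

final class BoundedSubmitter {
    /**
     * Submits totalTasks sleeping tasks with at most maxPending in flight.
     * Returns the highest number of tasks observed running at once.
     */
    static int run(int totalTasks, int maxPending, long workMs) {
        Semaphore sem = new Semaphore(maxPending);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < totalTasks; i++) {
                sem.acquireUninterruptibly(); // submitter blocks once maxPending tasks hold permits
                try {
                    executor.submit(() -> {
                        try {
                            int now = inFlight.incrementAndGet();
                            peak.accumulateAndGet(now, Math::max);
                            Thread.sleep(workMs); // stand-in for real work
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        } finally {
                            inFlight.decrementAndGet();
                            sem.release(); // always hand the permit back
                        }
                    });
                } catch (RuntimeException e) {
                    sem.release(); // submit() failed, so no task will release this permit
                    throw e;
                }
            }
        } // close() waits for all submitted tasks to finish
        return peak.get();
    }
}
```

Calling `BoundedSubmitter.run(200, 8, 5)` should never report a peak above 8, no matter how fast the loop spins. The extra release in the `catch` covers the case where `submit()` itself throws, which would otherwise leak a permit.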
Option 2: Add Sleep in the Loop
while (running) {
    for (int i = 0; i < threadCount; i++) {
        executor.submit(...);
    }
    Thread.sleep(workMs); // rate limit submissions
}
Option 3: Bounded Queue + CallerRunsPolicy
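BlockingQueue<Runnable> queue = new ArrayBlockingQueue<>(10000);
// The pool size must be bounded: with maximumPoolSize = Integer.MAX_VALUE a full queue
// just spawns more threads and CallerRunsPolicy never fires. 256 is an illustrative value.
ThreadPoolExecutor executor = new ThreadPoolExecutor(
    256, 256, 60, TimeUnit.SECONDS,
    queue,
    Thread.ofVirtual().factory(),
    new ThreadPoolExecutor.CallerRunsPolicy() // submitter runs the task itself when pool and queue are full
);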
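CallerRunsPolicy only kicks in when both the queue and the pool are full, so the throttling effect depends on both being bounded. A minimal sketch with deliberately tiny bounds that counts how many tasks end up on the submitting thread; the class name and sizes are illustrative assumptions:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class CallerRunsDemo {
    /** Returns how many of totalTasks ended up running on the submitting thread. */
    static int run(int totalTasks, int poolSize, int queueSize, long workMs) {
        Thread submitter = Thread.currentThread();
        AtomicInteger ranOnCaller = new AtomicInteger();
        // try-with-resources: close() shuts the pool down and waits for queued tasks
        try (ThreadPoolExecutor executor = new ThreadPoolExecutor(
                poolSize, poolSize, 60, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(queueSize),
                Thread.ofVirtual().factory(),
                new ThreadPoolExecutor.CallerRunsPolicy())) {
            for (int i = 0; i < totalTasks; i++) {
                executor.execute(() -> {
                    if (Thread.currentThread() == submitter) {
                        ranOnCaller.incrementAndGet(); // rejected task, executed by the caller
                    }
                    try {
                        Thread.sleep(workMs); // stand-in for real work
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
        }
        return ranOnCaller.get();
    }
}
```

With `CallerRunsDemo.run(50, 2, 2, 20)` the submitter quickly saturates two workers and two queue slots, then has to run tasks itself, which is exactly what slows the producer down. The trade-off compared with the semaphore is that the submitting thread is busy doing work instead of simply waiting.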
Key Lessons
- Virtual threads are not fire-and-forget – submit() is non-blocking by design, which means you must manage the creation rate
- Full GC releasing 0 bytes is a strong diagnostic signal – it means the live set itself fills the heap, typically from a thread explosion or some other massive live object graph
- The virtual thread IDs in jstack don't lie – 150k new IDs in 1 second is a clear warning
- Backpressure is not optional – without it, any fast producer can overwhelm the JVM
Data Mapping – GC + Threads + Code
| GC/Thread Symptom | Root Cause in Code |
|---|---|
| Allocation rate 1,076 MB/s | while loop with no sleep + non-blocking submit() |
| 324 Full GCs, 0 bytes released | All virtual threads alive, GC roots reachable |
| 150k new threads in 1 second | ~1M submit() calls per second |
| Throughput ~10% | 88% of time in Full GC |
Tool Note
Both analyses were performed using a JVM analysis tool I'm building – it parses GC logs, correlates with thread dumps, and extracts root cause patterns. The tool helped identify these issues in minutes rather than hours.