S M Tahosin

I Started 10,000 Java Threads. My Laptop Barely Noticed.

I expected the fans to spin.

I had just asked Java to start 10,000 tasks, give each task its own virtual
thread, and make every one wait for 100 milliseconds.

Instead, the program finished before I could move my hand away from Enter.

So I ran it again. Then three more times.

On my 12-logical-processor laptop, the median result looked like this:

Executor                              10,000 waiting tasks
Fixed pool of 200 platform threads    5,116 ms
One virtual thread per task           378 ms

That is 13.5x faster completion after changing the executor, not the task.

[Figure: Benchmark results comparing platform and virtual threads]

This is not proof that virtual threads make Java code 13.5x faster.

It is proof that I had been thinking about threads incorrectly.

Let us rebuild that mental model from the inside.

First, Make a Prediction

Each task does this:

Thread.sleep(Duration.ofMillis(100));

There are 10,000 tasks.

How long should the whole program take?

  • A: About 1,000 seconds, because 10,000 x 100 ms = 1,000 seconds
  • B: About 5 seconds, because 200 platform threads process the work in waves
  • C: Well under 1 second, because waiting virtual threads can step aside

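All three answers can be correct. A single thread gives you A, a fixed pool of
200 platform threads gives you B, and one virtual thread per task gives you C.
The executor decides which world you live in.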

The Old Mental Model

For most of Java's life, a Java thread was a thin wrapper around an operating
system thread.

That made threads useful, but expensive enough to treat as a limited resource.

If your server had a pool of 200 platform threads and all 200 were waiting for
a slow database, request 201 had to stand in line.

request -> platform thread -> OS thread -> wait
request -> platform thread -> OS thread -> wait
request ->       queue       ->          -> wait for a free thread

The code was blocked, but the operating system thread assigned to it was still
occupied.

Virtual threads break that one-to-one relationship.

[Figure: Platform threads compared with virtual threads]

A virtual thread is still a real java.lang.Thread.

The difference is that it does not permanently own an OS thread. The JVM
schedules many virtual threads onto a smaller number of platform threads,
called carrier threads.

You can see the distinction directly:

Thread platform = Thread.ofPlatform().start(
        () -> System.out.println(Thread.currentThread().isVirtual())
);

Thread virtual = Thread.ofVirtual().start(
        () -> System.out.println(Thread.currentThread().isVirtual())
);

platform.join();
virtual.join();

Output:

false
true

Same Thread API. Different scheduling model.

What Happens When a Virtual Thread Waits?

Imagine a virtual thread running on carrier thread 3.

It calls a supported blocking operation, such as Thread.sleep() or blocking
network I/O.

The JVM can:

  1. Pause the virtual thread.
  2. Unmount it from carrier thread 3.
  3. Use carrier thread 3 to run other virtual threads.
  4. Remount the original virtual thread when its wait is over.

[Figure: Timeline showing a virtual thread stepping aside while waiting]

The virtual thread did not make the database, network, or timer faster.

It stopped wasting a scarce carrier thread while waiting.

That sentence is the whole feature:

Virtual threads make waiting cheap. They do not make work cheap.
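
The pause, unmount, and remount steps can be loosely observed from inside a virtual thread. Here is a minimal sketch, not part of the original lab: the class name CarrierHop and the 50 ms sleep are my own choices. It relies on the fact that current JDKs usually include the carrier's name in a virtual thread's toString(), which is an implementation detail rather than a documented contract.

```java
import java.time.Duration;

final class CarrierHop {

    // Runs one virtual thread and records how it describes itself
    // before and after a blocking sleep.
    static String[] observe() {
        String[] seen = new String[2];

        Thread vt = Thread.ofVirtual().start(() -> {
            seen[0] = Thread.currentThread().toString();
            try {
                Thread.sleep(Duration.ofMillis(50)); // the JVM may unmount here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            seen[1] = Thread.currentThread().toString();
        });

        try {
            vt.join(); // join() makes the writes to `seen` visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        String[] seen = observe();
        System.out.println("before sleep: " + seen[0]);
        System.out.println("after sleep:  " + seen[1]);
    }
}
```

The carrier named after the sleep may or may not differ from the one named before it. With a single virtual thread it often stays the same; the point is only that nothing ties the virtual thread to that carrier while it waits.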

The Experiment

Here is the important part of the benchmark.

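// Excerpt from the benchmark class. It needs java.time.Duration,
// java.util.ArrayList, java.util.List, java.util.concurrent.ExecutorService,
// and java.util.concurrent.Future.
private static final int TASKS = 10_000;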
private static final Duration WAIT = Duration.ofMillis(100);

private static void run(ExecutorService executor) throws Exception {
    try (executor) {
        List<Future<Integer>> futures = new ArrayList<>(TASKS);

        for (int task = 0; task < TASKS; task++) {
            int taskId = task;

            futures.add(executor.submit(() -> {
                Thread.sleep(WAIT);
                return taskId;
            }));
        }

        for (Future<Integer> future : futures) {
            future.get();
        }
    }
}

I ran the same method with two executors:

run(Executors.newFixedThreadPool(200));

run(Executors.newVirtualThreadPerTaskExecutor());

The first executor lets at most 200 tasks wait at once.

The virtual-thread executor starts one virtual thread for every task. When the
tasks sleep, the JVM can unmount them and keep its carrier threads available.

That is why the fixed pool behaves roughly like this:

10,000 tasks / 200 threads = 50 waves
50 waves x 100 ms          = about 5 seconds

The virtual-thread version does not need 50 waves. Almost every task can begin,
sleep, and get out of the carriers' way.
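
The wave arithmetic generalizes to all three predictions from the start of the article. A small sketch, assuming tasks that do nothing but wait and ignoring scheduling overhead; WaveEstimate and estimate are illustrative names, not part of the lab:

```java
import java.time.Duration;

final class WaveEstimate {

    // Idealized completion time for tasks that only wait:
    // ceil(tasks / threads) waves, each lasting one wait.
    static Duration estimate(int tasks, int threads, Duration wait) {
        int waves = (tasks + threads - 1) / threads; // integer ceiling
        return wait.multipliedBy(waves);
    }

    public static void main(String[] args) {
        Duration wait = Duration.ofMillis(100);
        System.out.println(estimate(10_000, 1, wait).toMillis());      // 1000000
        System.out.println(estimate(10_000, 200, wait).toMillis());    // 5000
        System.out.println(estimate(10_000, 10_000, wait).toMillis()); // 100
    }
}
```

The last line treats one virtual thread per task as "as many threads as tasks". That is a floor, not a forecast: creating and scheduling 10,000 threads still costs something on top of the 100 ms wait.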

The measured medians from three runs were:

WAITING WORK
200 platform threads        5,116 ms
virtual thread per task       378 ms

CPU WORK
platform threads            2,387 ms
virtual threads             2,300 ms

The waiting result changed dramatically.

The CPU result did not.
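
The excerpt above does not show how the runs were timed. One way to do it, as a sketch under my own assumptions (TimedRun, timeMillis, and median are hypothetical names, and the real lab may measure differently), is to wrap executor creation, submission, and shutdown in a single wall-clock measurement and take the median of a few runs:

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Supplier;

final class TimedRun {

    // Submits `tasks` sleeping tasks to a fresh executor and returns
    // the elapsed wall-clock time in milliseconds.
    static long timeMillis(Supplier<ExecutorService> factory, int tasks, Duration wait) {
        long start = System.nanoTime();
        try (ExecutorService executor = factory.get()) {
            List<Future<?>> futures = new ArrayList<>(tasks);
            for (int i = 0; i < tasks; i++) {
                futures.add(executor.submit(() -> {
                    Thread.sleep(wait);
                    return null;
                }));
            }
            for (Future<?> future : futures) {
                future.get();
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    // Middle value of an odd number of samples.
    static long median(long[] samples) {
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        return sorted[sorted.length / 2];
    }

    public static void main(String[] args) {
        Duration wait = Duration.ofMillis(100);
        long[] pooled = new long[3];
        long[] virtual = new long[3];
        for (int run = 0; run < 3; run++) {
            pooled[run] = timeMillis(() -> Executors.newFixedThreadPool(200), 10_000, wait);
            virtual[run] = timeMillis(Executors::newVirtualThreadPerTaskExecutor, 10_000, wait);
        }
        System.out.println("fixed pool median (ms):     " + median(pooled));
        System.out.println("virtual threads median (ms): " + median(virtual));
    }
}
```

A median over a handful of runs smooths out JIT warm-up and background noise a little. It is still a toy measurement, not a substitute for a proper benchmark harness.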

The Benchmark Trap

Virtual threads are not tiny turbo buttons.

To test that, I also submitted 48 CPU-heavy tasks that counted primes up to
1,000,000.

Both executors finished in roughly the same time because my laptop still had
only 12 logical processors.

You can create one million virtual threads.

You cannot create one million CPU cores.

[Figure: Decision tree for choosing virtual threads]

Good virtual-thread workloads spend meaningful time waiting:

  • HTTP requests
  • database queries
  • many file operations, after profiling
  • message queues
  • remote API calls
  • many independent sleep() or timer waits

Poor candidates spend most of their time calculating:

  • image processing
  • video encoding
  • compression
  • machine-learning inference
  • large in-memory transformations
  • number crunching

For CPU-bound work, use bounded parallelism near the amount of CPU your machine
can actually execute.
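
As a sketch of what bounded parallelism can look like, here is a fixed pool sized to the available processors running prime-counting tasks. The trial-division counter is my own stand-in; the article does not show the lab's CPU task, so treat countPrimes and runBounded as assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class CpuBoundSketch {

    // Counts primes up to and including `limit` by trial division.
    // Pure computation: it never blocks, so there is nothing to unmount.
    static int countPrimes(int limit) {
        int count = 0;
        for (int n = 2; n <= limit; n++) {
            boolean prime = true;
            for (int d = 2; (long) d * d <= n; d++) {
                if (n % d == 0) {
                    prime = false;
                    break;
                }
            }
            if (prime) {
                count++;
            }
        }
        return count;
    }

    // Runs `tasks` copies of the computation on a pool sized to the
    // processors the JVM reports, and returns the sum of the results.
    static int runBounded(int tasks, int limit) {
        int cores = Runtime.getRuntime().availableProcessors();
        try (ExecutorService pool = Executors.newFixedThreadPool(cores)) {
            List<Future<Integer>> results = new ArrayList<>(tasks);
            for (int i = 0; i < tasks; i++) {
                results.add(pool.submit(() -> countPrimes(limit)));
            }
            int total = 0;
            for (Future<Integer> result : results) {
                total += result.get();
            }
            return total;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(runBounded(48, 1_000_000));
    }
}
```

Swapping the pool for a virtual-thread executor would not add cores. The same 48 computations would still compete for the same processors.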

The Simplest Useful Rule

When tasks mostly wait:

try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
    Future<String> user = executor.submit(() -> loadUser());
    Future<List<Order>> orders = executor.submit(() -> loadOrders());

    renderProfile(user.get(), orders.get());
}

This code is ordinary, blocking, and readable.

That is intentional.

For years, developers often had to choose between simple thread-per-request
code that did not scale and asynchronous code that scaled but split the
workflow across callbacks, futures, or reactive operators.

Virtual threads make the simple shape practical for many high-throughput
blocking applications.

They do not remove every concurrency problem. They remove one expensive
assumption: that every concurrent task needs its own OS thread.

Do Not Pool Virtual Threads

This feels wrong at first.

We learned to pool threads because platform threads were expensive. A pool
limited how many of those scarce threads existed.

Virtual threads are designed to be created per task.

So this is the normal pattern:

Executors.newVirtualThreadPerTaskExecutor();

Not this:

// Anti-pattern: a tiny pool of reusable virtual threads
Executors.newFixedThreadPool(20, Thread.ofVirtual().factory());

If you must limit access to something scarce, limit that thing.

Suppose a partner API permits only 20 concurrent requests:

Semaphore partnerApiSlots = new Semaphore(20);

String callPartnerApi() throws InterruptedException {
    partnerApiSlots.acquire();

    try {
        return makeBlockingHttpRequest();
    } finally {
        partnerApiSlots.release();
    }
}

[Figure: Many virtual threads passing through a semaphore before a partner API]

The executor can still create a virtual thread per task.

The semaphore protects the actual bottleneck.

This separation is useful far beyond virtual threads:

Concurrency is how much work can be in progress. Capacity is how much work a
dependency can safely accept.
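
To see that separation in action, here is a self-contained sketch that starts many virtual threads and records how many are inside the guarded section at once. The 5 ms sleep stands in for the blocking HTTP request, and the names SemaphoreLimitSketch and maxInFlight are hypothetical.

```java
import java.time.Duration;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

final class SemaphoreLimitSketch {

    // Starts `tasks` virtual threads. Each one holds a permit while it
    // "calls" the dependency. Returns the highest number of calls that
    // were observed in flight at the same time.
    static int maxInFlight(int tasks, int permits) {
        Semaphore slots = new Semaphore(permits);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();

        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < tasks; i++) {
                executor.submit(() -> {
                    slots.acquire();
                    try {
                        int now = inFlight.incrementAndGet();
                        peak.accumulateAndGet(now, Math::max);
                        Thread.sleep(Duration.ofMillis(5)); // stand-in for the HTTP call
                    } finally {
                        inFlight.decrementAndGet();
                        slots.release();
                    }
                    return null;
                });
            }
        } // close() waits for every submitted task to finish

        return peak.get();
    }

    public static void main(String[] args) {
        // 1,000 virtual threads exist, but the peak never exceeds 20.
        System.out.println("peak concurrent calls: " + maxInFlight(1_000, 20));
    }
}
```

All 1,000 tasks are in progress as far as the executor is concerned. Only 20 at a time are allowed to touch the scarce thing.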

The Quiet ThreadLocal Trap

Virtual threads support ThreadLocal, so request context such as a user ID or
trace ID can continue to work.

The dangerous pattern is using ThreadLocal as a tiny object pool:

private static final ThreadLocal<ExpensiveClient> CLIENT =
        ThreadLocal.withInitial(ExpensiveClient::new);

That may look efficient when 200 pooled platform threads reuse 200 clients.

With one virtual thread per task, it can quietly become thousands of expensive
clients that are barely reused.

Keep context in thread-local variables only when it truly belongs to the task.
Do not use them to cache heavy reusable objects per virtual thread.
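
The cost is easy to count. In this sketch ExpensiveClient is a hypothetical stand-in that only increments a counter when it is constructed, and clientsCreated is an illustrative helper, not something from the lab:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

final class ThreadLocalCostSketch {

    // Hypothetical heavy object. Here it only counts constructions.
    static final class ExpensiveClient {
        ExpensiveClient(AtomicInteger created) {
            created.incrementAndGet();
        }
    }

    // Runs `tasks` tasks that each read a ThreadLocal client and
    // returns how many clients had to be constructed.
    static int clientsCreated(Supplier<ExecutorService> factory, int tasks) {
        AtomicInteger created = new AtomicInteger();
        ThreadLocal<ExpensiveClient> client =
                ThreadLocal.withInitial(() -> new ExpensiveClient(created));

        try (ExecutorService executor = factory.get()) {
            for (int i = 0; i < tasks; i++) {
                executor.execute(() -> client.get());
            }
        } // close() waits for every task to finish

        return created.get();
    }

    public static void main(String[] args) {
        int pooled = clientsCreated(() -> Executors.newFixedThreadPool(200), 10_000);
        int virtual = clientsCreated(Executors::newVirtualThreadPerTaskExecutor, 10_000);
        System.out.println("clients with 200 pooled threads:   " + pooled);  // at most 200
        System.out.println("clients with virtual thread/task:  " + virtual); // one per task
    }
}
```

If the client is safe to share, a single shared instance (or an explicit, bounded pool of them) keeps the count independent of how many threads exist.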

You Can Observe Them

Virtual threads are invisible to the operating system because the OS sees
carrier threads, not every virtual thread.

The JDK understands them, though.

You can create a virtual-thread-aware dump with:

jcmd <pid> Thread.dump_to_file -format=json threads.json

That distinction matters during debugging. An OS dashboard may show a modest
thread count while the JVM is managing thousands of virtual threads.

The right question is not only "how many threads exist?"

It is "what are those threads waiting for?"
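
The same kind of dump can also be requested from inside the process. This is a sketch that assumes a HotSpot-based JDK 21 or later, where com.sun.management.HotSpotDiagnosticMXBean offers dumpThreads; the temp-directory location and the ThreadDumpSketch name are my own choices, and the method expects an absolute path to a file that does not exist yet.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import com.sun.management.HotSpotDiagnosticMXBean.ThreadDumpFormat;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class ThreadDumpSketch {

    // Writes a JSON thread dump while one virtual thread is sleeping
    // and returns the path of the file that was written.
    static Path dumpToJson() {
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            // Give the dump a waiting virtual thread to report.
            executor.submit(() -> {
                Thread.sleep(Duration.ofMillis(500));
                return null;
            });

            Path file = Files.createTempDirectory("vt-dump")
                    .resolve("threads.json")
                    .toAbsolutePath();

            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            bean.dumpThreads(file.toString(), ThreadDumpFormat.JSON);
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("thread dump written to " + dumpToJson());
    }
}
```

Opening the file and looking at what each thread is blocked on answers the second question better than any thread count.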

One Outdated Warning

You may have read this advice:

Never block inside synchronized code when using virtual threads, because it
pins the carrier thread.

That warning mattered when virtual threads became final in Java 21.

Java 24 changed the implementation through JEP 491. Virtual threads can now
release their carrier when blocking inside synchronized code in the normal
case.

Pinning has not vanished completely. Native and foreign-function calls can
still pin a virtual thread.

But the blanket "virtual threads and synchronized do not mix" rule is
outdated on modern JDKs.

This is one reason I ran the experiment on Java 25 LTS instead of repeating an
old Java 21 checklist.
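
If you want to check which behavior your own JDK has, a small probe is enough. This is a sketch with hypothetical names (SynchronizedSleepSketch, runMillis), and the expectations in the comments are a hypothesis to test on your machine, not measurements from the lab. Each task sleeps while holding its own monitor, so there is no lock contention, only blocking inside synchronized.

```java
import java.time.Duration;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class SynchronizedSleepSketch {

    // Runs `tasks` virtual threads that each sleep inside a synchronized
    // block on a private lock, and returns the elapsed time in milliseconds.
    static long runMillis(int tasks, Duration wait) {
        long start = System.nanoTime();
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < tasks; i++) {
                Object lock = new Object(); // one lock per task: no contention
                executor.submit(() -> {
                    synchronized (lock) {
                        Thread.sleep(wait); // blocking while holding a monitor
                    }
                    return null;
                });
            }
        } // close() waits for every task to finish
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        // Hypothesis: on JDK 24+ this stays close to one wait, while on
        // JDK 21 to 23 pinned carriers force the tasks through in waves.
        long elapsed = runMillis(2_000, Duration.ofMillis(100));
        System.out.println("JDK " + Runtime.version().feature() + ": " + elapsed + " ms");
    }
}
```

Running the same file on two JDK versions side by side is a quick way to see whether the old warning still applies to you.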

A Five-Minute Migration Checklist

Do not rewrite an application because virtual threads sound exciting.

Take one blocking workflow and inspect it.

  1. Confirm the workload waits. Look for database calls, HTTP calls, file access, queues, and sleeps.
  2. Replace the task executor. Try Executors.newVirtualThreadPerTaskExecutor().
  3. Keep downstream limits. Connection pools, API quotas, and rate limits still exist.
  4. Load test the real path. A sleep benchmark teaches the model, not your production capacity.
  5. Measure CPU and memory too. Cheap threads can still run expensive code or retain large objects.
  6. Check native integrations. Native calls are one of the remaining pinning cases.

The goal is not "use virtual threads everywhere."

The goal is "stop paying for idle OS threads where you do not need them."

The Mental Model I Am Keeping

Before this experiment, I thought:

More concurrent Java work requires a larger thread pool.

Now I think:

Waiting work wants cheap virtual threads. CPU work wants bounded
parallelism. Scarce dependencies want explicit limits.

That model is simple enough for a beginner and accurate enough to prevent a
surprising number of production mistakes.

The full runnable lab behind the numbers uses only the JDK. No framework, build
tool, or dependency is required.

Compile and run it with Java 25:

javac VirtualThreadsLab.java
java VirtualThreadsLab

Open the complete runnable VirtualThreadsLab.java

Virtual threads became final in Java 21. Java 25 is not required for the basic
API, but it gives us the current LTS behavior, including the post-Java-24
improvements discussed above.

What should I put through this lab next: a database connection pool, 10,000
real HTTP calls, or a ThreadLocal-heavy application?