Jagdish Salgotra

Java Virtual Threads in 2025: Scalable I/O Without Async Hell (and the Real Limits)

Note: This article uses Java 21 as the baseline. Virtual threads are stable in Java 21 (JEP 444), so no preview flags are needed.

Originally published on engnotes.dev: https://engnotes.dev/blog/project-loom/virtual-threads-revolution-part-1
This is a shortened version with the same core code and takeaways.

If you have worked on Java backends long enough, you have probably seen the same problem show up again and again. The business logic is not especially hard. The traffic is. Then the thread math starts. How many threads, how much memory, how much waiting, how much tuning before the whole thing starts feeling awkward.

That is the pain virtual threads address.

The Old Problem

Platform threads are expensive. Each one uses real OS resources and carries a meaningful memory cost. For blocking I/O workloads, that puts a ceiling on concurrency much earlier than most teams want.

This is the usual shape:

import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class PlatformThreadPoolServer {
    private static final int PORT = 8080;
    private static final int THREAD_POOL_SIZE = 20;
    private static final long BLOCKING_SIMULATION_TIME = 200;

    public static void main(String[] args) throws IOException {
        // A fixed pool: at most 20 requests can block at any one time.
        ExecutorService threadPoolExecutor = Executors.newFixedThreadPool(THREAD_POOL_SIZE);
        HttpServer server = HttpServer.create(new InetSocketAddress(PORT), 0);

        server.createContext("/api", exchange -> {
            threadPoolExecutor.submit(() -> {
                try {
                    Thread.sleep(BLOCKING_SIMULATION_TIME); // stand-in for blocking I/O
                    byte[] response = "Platform Thread Ok\n".getBytes(StandardCharsets.UTF_8);
                    exchange.sendResponseHeaders(200, response.length);
                    exchange.getResponseBody().write(response);
                    exchange.close();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } catch (Exception e) {
                    e.printStackTrace();
                }
            });
        });

        server.setExecutor(null);
        server.start();
    }
}

This works, until too many requests block at once. Then threads stay tied up, queues grow, and latency starts getting ugly.

Most of us worked around that with bigger pools, more tuning, or async/reactive code. Sometimes that was the right call. A lot of the time it was just the cost of an old model.

What Virtual Threads Change

The big change is not new syntax. The code still looks like normal Java.

import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class VirtualThreadPoolServer {
    private static final int PORT = 8080;
    private static final long BLOCKING_SIMULATION_TIME = 200;

    public static void main(String[] args) throws IOException {
        // One cheap virtual thread per task: no pool size to tune.
        ExecutorService loomExecutor = Executors.newVirtualThreadPerTaskExecutor();
        HttpServer server = HttpServer.create(new InetSocketAddress(PORT), 0);

        server.createContext("/api", exchange -> {
            loomExecutor.submit(() -> {
                try {
                    Thread.sleep(BLOCKING_SIMULATION_TIME); // parks, freeing the carrier
                    byte[] response = "Virtual Thread Ok\n".getBytes(StandardCharsets.UTF_8);
                    exchange.sendResponseHeaders(200, response.length);
                    exchange.getResponseBody().write(response);
                    exchange.close();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } catch (Exception e) {
                    e.printStackTrace();
                }
            });
        });

        server.setExecutor(null);
        server.start();
    }
}

That is why Loom feels so practical. The control flow stays direct. The code stays readable. But the runtime behavior changes in a way that matters a lot for I/O-heavy systems.

The Useful Mental Model

Virtual threads use M:N scheduling. Many virtual threads run on a smaller set of carrier threads, which are regular platform threads.

The important bit is this:

when a virtual thread blocks on I/O, it parks and releases its carrier thread. That carrier thread can immediately run something else.

That is the part that changes the tradeoff. Blocking no longer means burning an OS thread while you wait.
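You can see this parking behavior directly. Here is a minimal sketch (the class name `ParkDemo` is mine, not from the repo): 10,000 virtual threads each block for 200 ms, yet the whole batch finishes in roughly the time of a single sleep, because each blocked thread parks and hands its carrier back to the scheduler.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ParkDemo {
    public static void main(String[] args) {
        Instant start = Instant.now();
        // 10,000 virtual threads share a small set of carrier threads
        // (roughly one per CPU core by default).
        try (ExecutorService exec = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < 10_000; i++) {
                exec.submit(() -> {
                    try {
                        Thread.sleep(200); // parks the virtual thread, frees its carrier
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
        } // close() waits for all submitted tasks to finish
        long elapsed = Duration.between(start, Instant.now()).toMillis();
        // With 10,000 platform threads this would exhaust memory or take
        // far longer; with parking it finishes close to the 200 ms sleep.
        System.out.println("10000 blocking tasks in ~" + elapsed + " ms");
    }
}
```

Try running the same loop with `Executors.newFixedThreadPool(20)` instead: the batch takes on the order of 10,000 / 20 × 200 ms, which is the old ceiling in action.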

Where They Help

Virtual threads are a strong fit for:

  • web services handling lots of concurrent requests
  • database calls, file operations, and network I/O
  • task-oriented backend flows where blocking is the natural model
  • systems where async complexity has grown faster than the business logic
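For the fan-out case, a thread-per-task executor keeps the code shaped like the problem. A small sketch (`FanOutDemo` and `fetch` are illustrative stand-ins for real DB or HTTP calls): three blocking lookups run concurrently, so the batch takes roughly one lookup's latency instead of three.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class FanOutDemo {
    // Simulated blocking lookup: stands in for a DB query or HTTP call.
    static String fetch(String id) throws InterruptedException {
        Thread.sleep(100);
        return "result-" + id;
    }

    public static void main(String[] args) throws Exception {
        List<Callable<String>> tasks = List.of(
                () -> fetch("a"), () -> fetch("b"), () -> fetch("c"));
        try (ExecutorService exec = Executors.newVirtualThreadPerTaskExecutor()) {
            // One virtual thread per lookup; all three block at the same
            // time, so the batch takes ~100 ms rather than ~300 ms.
            List<Future<String>> results = exec.invokeAll(tasks);
            for (Future<String> f : results) {
                System.out.println(f.get()); // prints result-a, result-b, result-c
            }
        }
    }
}
```

No callbacks, no reactive operators: just `Callable`, `Future`, and plain blocking reads of the results.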

Where They Do Not Magically Help

Virtual threads improve request concurrency. They do not remove system limits.

  • database pools still cap throughput
  • external APIs still rate limit you
  • socket and file descriptor limits still matter
  • CPU-bound work usually does not get the same benefit
  • pinning can still hurt if you keep blocking work inside synchronized hot paths or long native calls

That part is worth being honest about. Loom fixes one painful layer of the problem. It does not make bottlenecks disappear.
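The pinning point deserves a concrete shape. In Java 21, blocking inside a `synchronized` block keeps the virtual thread pinned to its carrier, while `java.util.concurrent` locks let it park normally. A sketch under that assumption (`PinningDemo` is an illustrative name, not from the article's repo):

```java
import java.util.concurrent.locks.ReentrantLock;

public class PinningDemo {
    private final ReentrantLock lock = new ReentrantLock();

    // Pinned (Java 21): sleeping inside a synchronized method holds
    // the carrier thread for the whole 100 ms.
    synchronized void pinnedCall() throws InterruptedException {
        Thread.sleep(100);
    }

    // Not pinned: ReentrantLock cooperates with the scheduler, so the
    // virtual thread parks and releases its carrier while blocked.
    void friendlyCall() throws InterruptedException {
        lock.lock();
        try {
            Thread.sleep(100);
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) throws Exception {
        PinningDemo demo = new PinningDemo();
        // Run with -Djdk.tracePinnedThreads=full to get a stack trace
        // when pinnedCall() pins, and nothing for friendlyCall().
        Thread.ofVirtual().start(() -> {
            try { demo.pinnedCall(); } catch (InterruptedException ignored) { }
        }).join();
        Thread.ofVirtual().start(() -> {
            try { demo.friendlyCall(); } catch (InterruptedException ignored) { }
        }).join();
        System.out.println("done");
    }
}
```

One pinned call is harmless; a hot path full of them quietly shrinks your carrier pool back to platform-thread economics, which is exactly the failure mode worth profiling for.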

Key Takeaway

What I like about virtual threads is that they make Java feel a bit less like it is fighting the shape of backend work. For a lot of I/O-heavy services, the straightforward blocking model becomes practical again.

It is just a much better default than what we had before.

The full article, with all diagrams, larger code examples, a runnable repo, live NoteSensei chat (ask me anything about virtual threads), and a quiz to check your understanding:

https://engnotes.dev/blog/project-loom/virtual-threads-revolution-part-1
