TL;DR
This little game shows in practice that virtual threads are lightweight and that you can create many more virtual threads than physical ones. It is also easy to switch between the two.
Intro
Virtual threads, first previewed in Java 19, are production-ready as of Java 21. So, let's take a closer look and compare them to standard threads in terms of computation-intensive tasks: we will create The Game of Life, the Universe, and Everything.
I believe there are a lot of posts on Java virtual threads out there. So, this post is unlikely to bring any revolutionary ideas or insights. Both the source code and this post are written just to have some fun, to see virtual threads in action, and to check resource consumption compared to physical threads.
The full source code is available on GitHub.
Game of Life (the Problem)
Conway's Game of Life is a zero-player game with simple rules:
- Any live cell with two or three live neighbours survives.
- Any dead cell with three live neighbours becomes a live cell.
- All other live cells die in the next generation. Similarly, all other dead cells stay dead.
So, calculating the cell state in the next generation is fast and easy.
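To make the rules concrete, here is a minimal sketch of a single generation step on a plain boolean[][] board. The class and method signatures are illustrative and are not taken from the project's source; treating cells outside the board as dead is also an assumption.

```java
final class LifeRules {

    // Applies the three rules above to a single cell.
    static boolean nextState(boolean alive, int liveNeighbours) {
        if (alive) {
            return liveNeighbours == 2 || liveNeighbours == 3;
        }
        return liveNeighbours == 3;
    }

    // Counts live cells among the up-to-eight neighbours.
    // Assumption: cells outside the board are treated as dead.
    static int countLiveNeighbours(boolean[][] board, int row, int col) {
        int count = 0;
        for (int dr = -1; dr <= 1; dr++) {
            for (int dc = -1; dc <= 1; dc++) {
                if (dr == 0 && dc == 0) {
                    continue; // skip the cell itself
                }
                int r = row + dr;
                int c = col + dc;
                if (r >= 0 && r < board.length && c >= 0 && c < board[r].length && board[r][c]) {
                    count++;
                }
            }
        }
        return count;
    }

    // Builds the next generation without modifying the current one.
    static boolean[][] nextGeneration(boolean[][] board) {
        boolean[][] next = new boolean[board.length][];
        for (int r = 0; r < board.length; r++) {
            next[r] = new boolean[board[r].length];
            for (int c = 0; c < board[r].length; c++) {
                next[r][c] = nextState(board[r][c], countLiveNeighbours(board, r, c));
            }
        }
        return next;
    }
}
```

For example, a horizontal line of three live cells in the middle of a 3 x 3 board turns into a vertical line in the next generation.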
But real life is much more complex, and what if the cell state calculation takes time? Also, the Universe is big enough to hold a lot of cells.
And then, the next-generation calculation could take ages.
The Game of Life, the Universe, and Everything
So, let's slightly adjust the rules to real life. The three rules mentioned above stay the same. However, one limitation is added on top of them: cell calculation takes time. It can be as simple as pausing for a configured amount of time or as complex as treating a cell as a Universe on its own and calculating this whole Universe first.
We will call it The Game of Life, the Universe, and Everything.
Okay, we need to build the game? No problem, we will use Spring Boot and Swing!
Main Components
We will need the following main components.
Cell class that holds the current cell state and the future state that is updated during calculation:
public class Cell {
    private boolean isCurrentlyAlive;
    private boolean isFutureAlive;
}
LifeCalculator class that holds the cells and does the cell calculations according to the specified calculation strategy:
public abstract class LifeCalculator extends SwingWorker<Cell[][], Void> {
    private final Cell[][] board;
    private final CalculationStrategy calculationStrategy;

    @Override
    protected Cell[][] doInBackground() throws InterruptedException {
        // ...
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < columns; j++) {
                calculationStrategy.calculateCell(...);
            }
        }
        // ...
        return board;
    }
    // ...
}
CalculationStrategy interface that defines a cell calculation mechanism:
public interface CalculationStrategy {
    CompletableFuture<Void> calculateCell(Runnable runnable);
}
SingleThreadCalculationStrategy class that implements CalculationStrategy and simply calculates the cell state in the same thread:
public class SingleThreadCalculationStrategy implements CalculationStrategy {
    @Override
    public CompletableFuture<Void> calculateCell(Runnable runnable) {
        runnable.run();
        return CompletableFuture.completedFuture(null);
    }
}
ThreadPoolCalculationStrategy class that implements CalculationStrategy and calculates every cell in a separate thread:
public class ThreadPoolCalculationStrategy implements CalculationStrategy {
    private final ExecutorService executor;

    public ThreadPoolCalculationStrategy(ExecutorService executor) {
        this.executor = executor;
    }

    @Override
    public CompletableFuture<Void> calculateCell(Runnable runnable) {
        return CompletableFuture.supplyAsync(() -> {
            runnable.run();
            return null;
        }, executor);
    }
}
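Since every strategy returns a CompletableFuture, the caller needs some way to wait for the whole board. The snippets above elide that part, so the following is only a sketch of one possible approach: collect the futures and join on CompletableFuture.allOf. The BoardPassSketch class, the calculateBoard helper, and the fixed pool of 4 threads are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

final class BoardPassSketch {

    // Same shape as the CalculationStrategy interface above.
    interface CalculationStrategy {
        CompletableFuture<Void> calculateCell(Runnable runnable);
    }

    // Submits one task per cell, then blocks until every task has finished.
    // Returns the number of cells that were actually calculated.
    static int calculateBoard(CalculationStrategy strategy, int rows, int columns) {
        AtomicInteger calculatedCells = new AtomicInteger();
        List<CompletableFuture<Void>> futures = new ArrayList<>(rows * columns);
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < columns; j++) {
                // A real cell calculation would go here; the sketch only counts.
                futures.add(strategy.calculateCell(calculatedCells::incrementAndGet));
            }
        }
        CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
        return calculatedCells.get();
    }

    public static void main(String[] args) {
        CalculationStrategy sameThread = task -> {
            task.run();
            return CompletableFuture.completedFuture(null);
        };
        ExecutorService pool = Executors.newFixedThreadPool(4); // pool size is an assumption
        CalculationStrategy pooled = task -> CompletableFuture.runAsync(task, pool);
        try {
            System.out.println(calculateBoard(sameThread, 30, 55)); // prints 1650
            System.out.println(calculateBoard(pooled, 30, 55));     // prints 1650
        } finally {
            pool.shutdown();
        }
    }
}
```

The caller code is identical for both strategies; only the object passed in changes.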
Here is the diagram of the classes mentioned above:
As our limitation requires that cell calculation takes time, let's add a simple Thread.sleep during cell calculation:
calculationStrategy.calculateCell(() -> {
    updateFutureCellStatus(k, m, countLiveNeighbours(k, m));
    try {
        Thread.sleep(properties.calculation().cellDelay());
    } catch (InterruptedException e) {
        // ...
    }
});
The cell delay is configured via Spring profiles and application-xxx.yml files (see below).
Board Update Time
We also want to measure board update time to be able to compare different calculation strategies:
public abstract class LifeCalculator extends SwingWorker<Cell[][], Void> {
    // ...
    @Override
    protected Cell[][] doInBackground() throws InterruptedException {
        long start = System.nanoTime();
        // calculate the board
        boardUpdateTimeLogger.saveBoardUpdateMillis(Duration.ofNanos(System.nanoTime() - start).toMillis());
        // ...
        return board;
    }
    // ...
}
And then let's simply log the min, average, and max board update time every 5 mins:
@Component
public class BoardUpdateTimeLogger {
    private static final Logger LOG = LoggerFactory.getLogger(BoardUpdateTimeLogger.class);
    private volatile UpdatesHolder updatesHolder = new UpdatesHolder(Long.MAX_VALUE, Long.MIN_VALUE, 0, 0);

    public void saveBoardUpdateMillis(long millis) {
        long minMillis = ...;
        long maxMillis = ...;
        long totalMillis = ...;
        long updatesNumber = ...;
        updatesHolder = new UpdatesHolder(minMillis, maxMillis, totalMillis, updatesNumber);
    }

    @Scheduled(fixedDelay = 5, timeUnit = TimeUnit.MINUTES)
    public void logBoardUpdateTime() {
        var holder = updatesHolder;
        if (holder.updatesNumber() == 0) {
            return;
        }
        if (LOG.isInfoEnabled()) {
            LOG.info("Board updated {} time(s), update time: min={} ms, avg={} ms, max={} ms",
                    holder.updatesNumber(),
                    holder.minMillis(),
                    holder.totalMillis() / holder.updatesNumber(),
                    holder.maxMillis());
        }
    }
}
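The update logic is elided in the snippet above. As a rough illustration of how running min, max, and total values can be kept in an immutable holder, here is a standalone sketch. It is not the project's code: the UpdatesHolder record is reconstructed only from the accessor names used in logBoardUpdateTime, and it deliberately swaps the volatile field for an AtomicReference so that concurrent writers cannot lose an update.

```java
import java.util.concurrent.atomic.AtomicReference;

final class UpdateStatsSketch {

    // Component names mirror the accessors used in the logger above;
    // the record itself and the plus() helper are assumptions.
    record UpdatesHolder(long minMillis, long maxMillis, long totalMillis, long updatesNumber) {
        UpdatesHolder plus(long millis) {
            return new UpdatesHolder(
                    Math.min(minMillis, millis),
                    Math.max(maxMillis, millis),
                    totalMillis + millis,
                    updatesNumber + 1);
        }
    }

    private final AtomicReference<UpdatesHolder> holder =
            new AtomicReference<>(new UpdatesHolder(Long.MAX_VALUE, Long.MIN_VALUE, 0, 0));

    // Atomically replaces the holder with one that includes the new measurement.
    void saveBoardUpdateMillis(long millis) {
        holder.updateAndGet(current -> current.plus(millis));
    }

    // Returns a consistent snapshot of all four values.
    UpdatesHolder snapshot() {
        return holder.get();
    }
}
```

Because the holder is immutable, a reader always sees min, max, total, and count from the same update.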
UI
We are going to run the game on different board sizes:
- small: 30 x 55 = 1650 cells
- big: 320 x 840 = 268800 cells
Once the application is started, the board is randomly filled with dead and live cells.
Here is what the application UI looks like. (Do you like this interface from the 2000s? I'm not a UI genius as you can see)
Small board:
Big board:
Spring Profiles
Now, we want to run the game with several scenarios:
- by the calculation strategy:
  - single thread
  - thread pool:
    - physical threads
    - virtual threads
- by the board size:
  - small
  - big
- by cell calculation delay:
  - no delay
  - with delay
Let's create several Spring profiles and use their combinations to run different scenarios:
# | Active Profiles | Calculation Strategy | Board Size | Cell Calculation Delay |
---|---|---|---|---|
1 | single-thread, small | single thread | small | 0 |
2 | single-thread | single thread | big | 0 |
3 | single-thread, cell-delay | single thread | big | >0 |
4 | physical-threads, small | thread pool with physical threads | small | 0 |
5 | physical-threads | thread pool with physical threads | big | 0 |
6 | physical-threads, cell-delay | thread pool with physical threads | big | >0 |
7 | virtual-threads, small | thread pool with virtual threads | small | 0 |
8 | virtual-threads | thread pool with virtual threads | big | 0 |
9 | virtual-threads, cell-delay | thread pool with virtual threads | big | >0 |
Run 'Em All
Let's run each of the scenarios above.
I ran the application on a MacBook Pro with a 2.6 GHz 6-core Intel Core i7 and 16 GB RAM.
Once the application is up, simply press the Start button, sit back, and observe.
All the board update times and resource consumption results are aggregated in the next section for convenience.
1. Single thread, small board, no cell calculation delay
There are only 1650 cells to calculate, so the calculation of the board is fast and does not take many resources.
The board is updated 2769 times in 5 mins, update time statistics: min=1 ms, avg=2 ms, max=89 ms.
Approximate resource consumption: 58MB of heap, 0% CPU, and 32 live threads.
2. Single thread, big board, no cell calculation delay
On the big board, we have 268800 cells to calculate, which is about 163 times as many as on the small board. So it consumes more CPU and memory, and the whole board takes longer to calculate.
Now the board is updated 815 times in 5 mins, update time statistics: min=196 ms, avg=221 ms, max=452 ms.
Approximate resource consumption: 300MB of heap, 10% CPU, and 31 live threads.
3. Single thread, big board, with cell calculation delay
Now, with a one-second cell calculation delay and a single-thread calculation, nothing happens from the user's perspective when they press the Start button. This is because a single board calculation takes about 75 hours:
268800 cells * 1 sec = 268800 sec ~ 75 hours
Since the cell calculation delay is simply Thread.sleep, resource consumption is low: 98MB of heap, 0% CPU, and 23 live threads.
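The 75-hour estimate can be double-checked with java.time.Duration. This is just a small worked calculation; the class and method names are made up for the example.

```java
import java.time.Duration;

final class SequentialDelayEstimate {

    // Total time if every cell is calculated one after another with the given delay.
    static Duration boardTime(int rows, int columns, Duration cellDelay) {
        return cellDelay.multipliedBy((long) rows * columns);
    }

    public static void main(String[] args) {
        Duration total = boardTime(320, 840, Duration.ofSeconds(1));
        // prints "74 h 40 min", i.e. roughly 75 hours per board update
        System.out.printf("%d h %d min%n", total.toHours(), total.toMinutesPart());
    }
}
```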
4. Physical threads, small board, no cell calculation delay
Okay, let's try the calculation strategy that calculates every cell in a separate physical thread.
The board is updated 2071 times in 5 mins, update time statistics: min=6 ms, avg=43 ms, max=88 ms.
The approximate resources consumption is 69MB of heap, 10% CPU, and 100-1000 live threads.
As we can see, the number of live threads is far from stable. It may vary depending on the machine where the application is run and the other processes that take resources.
5. Physical threads, big board, no cell calculation delay
On the big board, it again takes more CPU and RAM.
The animation is way slower now than both with the single thread on the big board (2) and with the physical threads on the small board (4). So, from the user's perspective this is a loss.
The board is only updated 40 times in 5 mins, update time statistics: min=951 ms, avg=6686 ms, max=8559 ms.
One reason is that while the calculation itself is fast, a lot of threads are running and switching context between them takes time.
Approximate resource consumption: 285MB of heap, 30% CPU, and 130-250 live threads.
Note that calculating 268800 cells does not require the same number of threads. Since each calculation is fast, threads are quickly returned to the pool and reused. So, around 200 threads or even fewer are enough here.
6. Physical threads, big board, with cell calculation delay
Adding cell calculation delay to the physical thread calculation strategy causes problems. Since the calculation is slow now, when you press the Start button, a new thread is started in the pool for every cell. And then the application fails with an exception:
[17.873s][warning][os,thread] Failed to start thread "Unknown thread" - pthread_create failed (EAGAIN) for attributes: stacksize: 1024k, guardsize: 4k, detached.
[17.873s][warning][os,thread] Failed to start the native thread for java.lang.Thread "pool-2-thread-4042"
java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: unable to create native thread: possibly out of memory or process/resource limits reached
I admit that most likely this limit could be increased. However, 10 minutes of googling and trying different approaches did not bring results, so I just stopped digging.
Here is the resource consumption for this run, although the numbers are not really representative since an OutOfMemoryError was thrown: 390MB of heap, 2% CPU, and 4065 live threads.
7. Virtual threads, small board, no cell calculation delay
Now let's run the game with a pool of virtual threads instead of physical ones.
The number of threads is stable now compared to the similar physical threads scenario (4). However, the animation is almost at the same speed, so both are similar from the user's perspective.
The board is updated 2784 times in 5 mins, update time statistics: min=1 ms, avg=2 ms, max=93 ms.
Resource consumption is pretty low: 65MB of heap, 2% CPU, and 43 live threads.
8. Virtual threads, big board, no cell calculation delay
With the big board, the picture changes. Now we have high CPU and memory consumption compared to the scenarios we ran before: 250MB-2GB of heap and 55% CPU. The number of threads is still stable: 43 live threads.
The animation is way faster now than with physical threads and the big board (5).
The board is updated 685 times in 5 mins, update time statistics: min=242 ms, avg=289 ms, max=431 ms.
9. Virtual threads, big board, with cell calculation delay
By adding the cell calculation delay, we decrease CPU consumption a bit, but the overall picture stays the same: 500MB-2GB of heap, 30% CPU, and 43 live threads.
Note that this is the only configuration that can cope with the cell calculation delay. So, we cannot actually compare the animation speed to others.
The board is updated 164 times in 5 mins, update time statistics: min=1605 ms, avg=1662 ms, max=2240 ms.
Also note that about 1000 ms of each update time in these statistics is the cell calculation delay itself.
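The reason this scenario works at all is that a sleeping virtual thread does not hold on to an OS thread. The following standalone sketch, which requires Java 21, illustrates the idea outside the game: it starts one virtual thread per task, lets each of them sleep, and waits for all of them. The class name, the task count, and the delay are arbitrary choices for the example.

```java
import java.time.Duration;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

final class SleepingCellsSketch {

    // Runs the given number of sleeping tasks, one virtual thread each,
    // and returns how many of them finished.
    static int runSleepingTasks(int tasks, Duration delay) {
        AtomicInteger finished = new AtomicInteger();
        // Leaving the try block calls close(), which waits for all submitted tasks.
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < tasks; i++) {
                executor.execute(() -> {
                    try {
                        Thread.sleep(delay); // stands in for a slow cell calculation
                        finished.incrementAndGet();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
        }
        return finished.get();
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        int finished = runSleepingTasks(10_000, Duration.ofMillis(100));
        long elapsedMillis = Duration.ofNanos(System.nanoTime() - start).toMillis();
        System.out.println(finished + " tasks finished in " + elapsedMillis + " ms");
    }
}
```

Run sequentially, 10,000 tasks of 100 ms each would take about 1000 seconds; with one virtual thread per task the sleeps overlap, so the total should be far shorter.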
Results
Now, let's aggregate results and see if there is something useful.
Resource Consumption
Here are the resource consumption results aggregated in a table:
Calculation Strategy | Board Size | Cell Calc. Delay | Heap, MB | CPU, % | Live threads |
---|---|---|---|---|---|
single thread | small | 0 | 58 | 0 | 32 |
single thread | big | 0 | 300 | 10 | 31 |
single thread | big | >0 | 98 | 0 | 23 |
thread pool with physical threads | small | 0 | 69 | 10 | 100-1000 |
thread pool with physical threads | big | 0 | 285 | 30 | 130-250 |
thread pool with physical threads | big | >0 | - | - | - |
thread pool with virtual threads | small | 0 | 65 | 2 | 43 |
thread pool with virtual threads | big | 0 | 250-2000 | 55 | 43 |
thread pool with virtual threads | big | >0 | 500-2000 | 30 | 43 |
So, what can we see here?
- CPU usage for the small board is not high and the lowest for the single thread calculation strategy.
- CPU usage for the big board with no cell calculation delay is higher than for other scenarios, which is expected.
- Heap size is higher for big board scenarios, which is again expected.
- Number of live threads is stable for the single thread and virtual threads scenarios.
- Number of live threads is not stable for physical threads scenarios; it is highly dependent on other workloads running on the machine.
- There is a relatively low limit on how many physical threads a Java process can create.
Animation Speed
Now, let's compare the board update times, which affect the animation speed:
Calculation Strategy | Board Size | Cell Calc. Delay | Min, ms | Avg, ms | Max, ms |
---|---|---|---|---|---|
single thread | small | 0 | 1 | 2 | 89 |
single thread | big | 0 | 196 | 221 | 452 |
single thread | big | >0 | - | - | - |
thread pool with physical threads | small | 0 | 6 | 43 | 88 |
thread pool with physical threads | big | 0 | 951 | 6686 | 8559 |
thread pool with physical threads | big | >0 | - | - | - |
thread pool with virtual threads | small | 0 | 1 | 2 | 93 |
thread pool with virtual threads | big | 0 | 242 | 289 | 431 |
thread pool with virtual threads | big | >0 | 1605 | 1662 | 2240 |
Here is what we can see from the user (i.e., animation speed) perspective:
- Calculation of the small board is very fast for all strategies, although physical threads are about 20 times slower on average (43 ms versus 2 ms).
- Calculation of the big board with no cell calculation delay is 100+ times slower than the small board. However, it is still pretty fast for the single thread and virtual threads.
- Calculation of the big board with no cell calculation delay for physical threads is really slow: on average it is more than 20 times slower than for the other strategies (6686 ms versus 221 ms and 289 ms).
- Thread pool with virtual threads is the only strategy that can cope with the big board and cell calculation delay.
Conclusion
This little game shows in practice that you can create many more virtual threads than physical ones. Furthermore, virtual threads are more lightweight.
Also, migration from physical threads to virtual ones is as easy as just replacing the executor service.
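As a rough illustration of that swap, here is a standalone sketch that requires Java 21. The createExecutor factory and its boolean flag are hypothetical and simply stand in for whatever the active Spring profile selects; the use of a cached thread pool for physical threads is also an assumption.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class ExecutorSwapSketch {

    // Hypothetical factory: the flag stands in for the active profile.
    static ExecutorService createExecutor(boolean virtualThreads) {
        return virtualThreads
                ? Executors.newVirtualThreadPerTaskExecutor()
                : Executors.newCachedThreadPool(); // assumption: an unbounded platform-thread pool
    }

    // Runs one task on the executor and reports whether it ran on a virtual thread.
    static boolean ranOnVirtualThread(ExecutorService executor) {
        return CompletableFuture
                .supplyAsync(() -> Thread.currentThread().isVirtual(), executor)
                .join();
    }

    public static void main(String[] args) {
        for (boolean virtual : new boolean[] {false, true}) {
            try (ExecutorService executor = createExecutor(virtual)) {
                // prints "false" for the platform-thread pool, then "true" for virtual threads
                System.out.println(ranOnVirtualThread(executor));
            }
        }
    }
}
```

The code that submits the work does not change at all; only the ExecutorService handed to it differs.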
And remember: do not run the game at the beginning of the working day, it is so bingeworthy.
Dream your code, code your dream.