Stefano Fago

Concurrency techniques applicable in Java 17+

WARNING: This post was generated to support a human study using GenAI tools! The goal of this post is to support one of the most complex topics in programming languages; I hope it's helpful and useful!

A systematic, verified map of what you can actually use for concurrency in Java, from the 17 LTS baseline upward. Each entry states the JDK version where the feature became standard, what it is for, and — where useful — a link to an authoritative online resource.

Version convention. Java 17+ includes everything at or below 17. Rows tagged 21, 22, or 25 require that version or later; preview means the feature needs --enable-preview. Reference LTS releases: 17 (baseline), 21 (virtual threads), 25 (scoped values).

The single cross-cutting rule: do not share contended mutable state. Either confine it or make it immutable, shard it (for writes), or publish it immutably (for reads). Everything else is mechanics. Foundational reading: the Java Memory Model, covering happens-before, volatile, final field semantics, and safe publication (Doug Lea, "Using JDK 9 Memory Order Modes"; JSR-133 FAQ).


The three levels

  1. High — task/thread model: executors, virtual threads, CompletableFuture, parallel streams, structured concurrency.
  2. Medium — the concurrent collections and synchronizers of java.util.concurrent.
  3. Low — the Java Memory Model, atomics, VarHandle, lock-free code, cache-line padding, off-heap memory.

1. Task execution model (high level)

Technique | JDK | What it is for
Thread (platform), Runnable/Callable | 1.0/5 | the base unit of execution
ExecutorService, ThreadPoolExecutor, Executors | 5 | the standard abstraction for submitting and managing tasks (submit/invokeAll, Future). A bounded platform-thread pool caps concurrency and bounds resources (connections, memory). Still central after virtual threads: you drive virtual threads through an ExecutorService too (see the virtual-threads row)
ScheduledThreadPoolExecutor | 5 | delayed / periodic tasks
ForkJoinPool + RecursiveTask/Action, work-stealing, commonPool | 7 (commonPool: 8) | divide-and-conquer, CPU-bound parallelism
Future / FutureTask | 5 | handle to an async result; FutureTask wraps a Callable/Runnable as a cancellable, one-shot computation
CompletionService / ExecutorCompletionService | 5 | consume task results as they complete, not in submission order
CompletableFuture / CompletionStage | 8 | asynchronous composition, non-blocking pipelines
Parallel streams + Spliterator | 8 | declarative data parallelism (runs on the common ForkJoinPool)
Virtual Threads (Executors.newVirtualThreadPerTaskExecutor) | 21 | millions of lightweight threads for I/O-bound work; not for CPU-bound loops; watch out for pinning inside native frames and, before JDK 24 (JEP 491), inside synchronized blocks. Do not pool them: the per-task executor creates a fresh virtual thread per submitted task (no thread pool)
Structured Concurrency (StructuredTaskScope) | preview (needs JDK 21+; latest JEP 505 in 25) | treat several concurrent subtasks as one unit: errors and cancellation propagate, no thread leaks
ThreadLocal / InheritableThreadLocal | 1.2 | per-thread (per-task) context: request IDs, transaction/security context, reusable buffers. Works with virtual threads, but millions of threads holding large ThreadLocal values can exhaust memory; for immutable per-task context, prefer Scoped Values
Scoped Values (ScopedValue) | 25 (JEP 506; preview since 21) | immutable, bounded-lifetime alternative to ThreadLocal; the value is shared with child tasks forked in a StructuredTaskScope. Designed for virtual threads and structured concurrency

Key references:
JEP 444 — Virtual Threads and the Oracle Virtual Threads guide;
JEP 505 — Structured Concurrency;
JEP 506 — Scoped Values.
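
The CompletableFuture row describes asynchronous composition. A minimal sketch of what that can look like follows; fetchPrice and fetchDiscount are hypothetical stand-ins for two independent remote calls, and the fallback value of -1 is an arbitrary choice for illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class AsyncCompositionSketch {

    // Hypothetical stand-ins for two independent remote calls.
    static int fetchPrice(String item) { return item.length() * 10; }
    static int fetchDiscount(String item) { return item.isEmpty() ? 0 : 5; }

    // Starts both lookups on the given executor and combines their results
    // in a follow-up stage, so no thread blocks while waiting for either one.
    static CompletableFuture<Integer> finalPrice(String item, ExecutorService executor) {
        CompletableFuture<Integer> price =
                CompletableFuture.supplyAsync(() -> fetchPrice(item), executor);
        CompletableFuture<Integer> discount =
                CompletableFuture.supplyAsync(() -> fetchDiscount(item), executor);
        return price.thenCombine(discount, (p, d) -> p - d)
                    .exceptionally(ex -> -1); // arbitrary fallback if either stage fails
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newFixedThreadPool(2);
        try {
            // join() blocks only here, at the edge of the program.
            System.out.println(finalPrice("book", executor).join());
        } finally {
            executor.shutdown();
        }
    }
}
```

The composition itself never calls get() or join(); blocking is left to the outermost caller.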

Guidance. ExecutorService is the common way to submit and manage tasks — and it stays central after virtual threads, because it is the abstraction you use for both kinds of thread:

  • CPU-bound / divisible work → a bounded platform-thread pool (ThreadPoolExecutor, sized to the cores) or ForkJoinPool / parallel streams for divide-and-conquer.
  • Bounding a scarce resource (DB connections, sockets, memory) → a bounded platform-thread pool as an admission control/throttle.
  • Timed or periodic work → ScheduledThreadPoolExecutor.
  • Large numbers of blocking I/O tasks → virtual threads, driven through Executors.newVirtualThreadPerTaskExecutor() (an ExecutorService that makes one new virtual thread per task; you do not pool virtual threads), ideally with structured concurrency.
  • Composing asynchronous stages → CompletableFuture (or reactive Flow when you need back-pressure).

So: virtual threads change which threads an executor runs, not the fact that you still submit work to an ExecutorService. Pooling remains the right model for platform threads (to cap CPU parallelism and bound resources); it is the wrong model for virtual threads (create one per task). See the Oracle Virtual Threads guide and Executors.
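
A minimal sketch of that idea, written against ExecutorService only so that the caller picks the thread kind. The load method is a hypothetical blocking call, and the Semaphore is one assumed way to bound a scarce resource when the executor itself is unbounded. The code compiles on JDK 17; on JDK 21+ the same method can be handed Executors.newVirtualThreadPerTaskExecutor().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.Semaphore;

final class ExecutorChoiceSketch {

    // Hypothetical blocking call (stands in for an HTTP or database request).
    static String load(String id) throws InterruptedException {
        Thread.sleep(10);
        return "result-" + id;
    }

    // The caller decides which threads run the tasks; this method only sees
    // an ExecutorService. The semaphore caps how many tasks use the scarce
    // resource at once, independently of how many threads the executor has.
    static List<String> loadAll(ExecutorService executor, List<String> ids, int maxConcurrent)
            throws InterruptedException, ExecutionException {
        Semaphore permits = new Semaphore(maxConcurrent);
        List<Callable<String>> tasks = new ArrayList<>();
        for (String id : ids) {
            tasks.add(() -> {
                permits.acquire();
                try {
                    return load(id);
                } finally {
                    permits.release();
                }
            });
        }
        List<String> results = new ArrayList<>();
        // invokeAll waits for every task and returns futures in submission order.
        for (Future<String> future : executor.invokeAll(tasks)) {
            results.add(future.get());
        }
        return results;
    }

    public static void main(String[] args) throws Exception {
        // JDK 17: a bounded platform-thread pool.
        // JDK 21+: Executors.newVirtualThreadPerTaskExecutor() can be passed instead.
        ExecutorService executor = Executors.newFixedThreadPool(4);
        try {
            System.out.println(loadAll(executor, List.of("a", "b", "c"), 2));
        } finally {
            executor.shutdown();
        }
    }
}
```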

2. Coordination/synchronizers (medium level)

Technique | JDK | When to use
synchronized + wait/notify (intrinsic monitors) | 1.0 | simple mutual exclusion
ReentrantLock + Condition | 5 | explicit lock, optional fairness, multiple wait sets
ReentrantReadWriteLock | 5 | many readers, few writers (separate read/write locks)
StampedLock (optimistic reads / seqlock) | 8 | read-heavy state; readers usually take no lock and validate a version stamp, falling back to a read lock if a writer intervened. Not reentrant; in an optimistic section copy fields to locals before validate()
Semaphore | 5 | permits / bounding a resource
CountDownLatch | 5 | wait for N events, one-shot
CyclicBarrier | 5 | reusable N-party barrier
Phaser | 7 | dynamic, phased barrier (parties can register/deregister across phases)
Exchanger | 5 | two-party rendezvous that swaps objects
LockSupport (park/unpark) | 5 | the low-level blocking primitive underneath all of the above; rarely called directly
AbstractQueuedSynchronizer (AQS) / AbstractQueuedLongSynchronizer | 5 | the framework for building your own locks/synchronizers; it powers ReentrantLock, Semaphore, CountDownLatch, ReentrantReadWriteLock, ...

References: StampedLock, Phaser, Exchanger.
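
The optimistic-read rule from the StampedLock row (copy fields to locals, then validate) can be sketched as follows. The OptimisticPoint class is a hypothetical example modelled on the pattern shown in the StampedLock documentation, not code from this post.

```java
import java.util.concurrent.locks.StampedLock;

final class OptimisticPoint {
    private final StampedLock lock = new StampedLock();
    private double x;
    private double y;

    // Writers take the exclusive write lock, which invalidates outstanding stamps.
    void move(double dx, double dy) {
        long stamp = lock.writeLock();
        try {
            x += dx;
            y += dy;
        } finally {
            lock.unlockWrite(stamp);
        }
    }

    double distanceFromOrigin() {
        long stamp = lock.tryOptimisticRead();  // no lock taken, just a version stamp
        double cx = x;                          // copy fields to locals first
        double cy = y;
        if (!lock.validate(stamp)) {            // a writer intervened: the copies may be torn
            stamp = lock.readLock();            // fall back to a real read lock
            try {
                cx = x;
                cy = y;
            } finally {
                lock.unlockRead(stamp);
            }
        }
        return Math.hypot(cx, cy);              // compute only from the validated copies
    }
}
```

Only the local copies are used after validation; reading the fields again after validate() would reopen the race.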

3. Concurrent collections (medium level)

Structure | JDK | Characteristic
ConcurrentHashMap (+ compute/merge, bulk reductions) | 5/8 | scalable map, internal sharding, lock-free reads
ConcurrentSkipListMap/Set | 6 | sorted concurrent map/set
CopyOnWriteArrayList/Set | 5 | read-mostly; each write copies the backing array and publishes it; iterators see a stable snapshot (no ConcurrentModificationException, but not live updates)
ConcurrentLinkedQueue/Deque | 5/7 | non-blocking queue/deque; ConcurrentLinkedQueue uses the Michael–Scott algorithm (ConcurrentLinkedDeque uses a related but more complex scheme)
ArrayBlockingQueue | 5 | bounded, single lock
LinkedBlockingQueue | 5 | optionally bounded, two locks (put/take)
LinkedBlockingDeque (BlockingDeque) | 6 | optionally bounded blocking deque (insert/take at both ends)
PriorityBlockingQueue | 5 | unbounded, priority-ordered
DelayQueue | 5 | elements become available after a delay
SynchronousQueue | 5 | zero-capacity direct hand-off (dual data structure)
LinkedTransferQueue (TransferQueue) | 7 | a producer can block until a consumer takes the element (transfer)

Reference: java.util.concurrent package overview. When many threads access a map, ConcurrentHashMap is normally preferable to a synchronized HashMap, and ConcurrentSkipListMap to a synchronized TreeMap.
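
As an illustration of the compute/merge methods listed for ConcurrentHashMap, here is a minimal sketch of a concurrent word count. The class name, the pool size of 4, and the 10-second timeout are assumptions made for the example.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class WordCountSketch {

    // Counts words across chunks, one task per chunk, with no external locking.
    static Map<String, Integer> count(List<List<String>> chunks) throws InterruptedException {
        ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();
        ExecutorService executor = Executors.newFixedThreadPool(4);
        try {
            for (List<String> chunk : chunks) {
                executor.execute(() -> {
                    for (String word : chunk) {
                        // merge is atomic per key: insert 1, or add 1 to the current value.
                        counts.merge(word, 1, Integer::sum);
                    }
                });
            }
        } finally {
            executor.shutdown(); // already-submitted tasks still run to completion
        }
        if (!executor.awaitTermination(10, TimeUnit.SECONDS)) {
            throw new IllegalStateException("workers did not finish in time");
        }
        return counts;
    }
}
```

A separate get followed by put would lose updates under contention; merge keeps the read-modify-write inside the map.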

4. Atomics and lock-free (low level)

Technique | JDK | Use
AtomicInteger/Long/Boolean/Reference, Atomic*Array | 5 | CAS on a single cell (or array element)
compareAndSet / compareAndExchange / weakCompareAndSet | 5/9 | lock-free primitives (weak* may fail spuriously; use in loops)
LongAdder/DoubleAdder/LongAccumulator/DoubleAccumulator (Striped64) | 8 | high-contention counters: a base plus an array of @Contended cells, each thread hashed (by its ThreadLocalRandom probe) to a different cell, so increments hit different cache lines; sum()/get() adds base + cells and is not an atomic snapshot
AtomicStampedReference/AtomicMarkableReference | 5 | mitigate the ABA problem by pairing a stamp/mark with the reference
AtomicIntegerFieldUpdater/LongFieldUpdater/ReferenceFieldUpdater | 5 | reflective atomic updates to volatile fields; legacy, largely superseded by VarHandle
ThreadLocalRandom | 7 | per-thread pseudo-random generator that avoids the contention of a shared Random
VarHandle (plain/opaque / release-acquire / volatile modes + fences) | 9 | fine-grained memory ordering; the public, safe replacement for sun.misc.Unsafe. Prefer the weakest mode that is still correct (e.g. a single-writer cursor needs only release/acquire, not volatile)
@Contended + manual cache-line padding | 8 (internal API; sun.misc.Contended in 8, moved to jdk.internal.vm.annotation in 9) | isolate hot, concurrently-written fields onto their own 64-byte cache line to avoid false sharing. On 17+ it is jdk.internal.vm.annotation.Contended and needs --add-exports java.base/jdk.internal.vm.annotation=ALL-UNNAMED plus -XX:-RestrictContended; portable code often pads manually via inheritance instead
Off-heap MemorySegment/Arena (Foreign Function & Memory API) | 22 | flat, contiguous, GC-free layout with explicit alignment; atomic access requires aligned addresses

References: VarHandle and JEP 193; LongAdder and the Striped64 source; JEP 454: Foreign Function & Memory API; Baeldung, False Sharing and @Contended.
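
The VarHandle row notes that a single-writer cursor needs only release/acquire. A minimal sketch under that assumption: SingleWriterLog is a hypothetical fixed-capacity, append-only log in which exactly one thread calls append and any thread may read.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

final class SingleWriterLog {
    private static final VarHandle COUNT;
    static {
        try {
            COUNT = MethodHandles.lookup()
                    .findVarHandle(SingleWriterLog.class, "count", long.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    private final long[] slots;
    private long count; // number of published slots; written by one thread only

    SingleWriterLog(int capacity) {
        slots = new long[capacity];
    }

    // Must be called by the single writer thread only (assumption of this sketch).
    void append(long value) {
        long n = (long) COUNT.get(this);   // plain read: only this thread ever writes count
        if (n == slots.length) {
            throw new IllegalStateException("log is full");
        }
        slots[(int) n] = value;            // plain write of the payload
        COUNT.setRelease(this, n + 1);     // release: the payload is visible before the new count
    }

    // Safe from any thread: the acquire read pairs with setRelease in append.
    long size() {
        return (long) COUNT.getAcquire(this);
    }

    long get(long index) {
        if (index < 0 || index >= size()) { // size() acquires, so slots below it are visible
            throw new IndexOutOfBoundsException("index " + index);
        }
        return slots[(int) index];
    }
}
```

With more than one writer this is no longer correct; the count update would then need compareAndSet or a lock.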

5. Architectural patterns (the real "power tools")

  • Thread confinement/immutability / safe publication — avoid shared mutable state altogether: the cheapest concurrency is no shared writes.
  • Sharding/striping (per-writer decomposition) — instead of N writers contending on one structure (whose shared cursor ping-pongs a cache line between cores via the coherence protocol), give each writer its own single-writer structure and reconcile on the read side. LongAdder is this idea for a counter; per-producer SPSC ring buffers are it for queues. Background: Dmitry Vyukov's lock-free algorithms, JCTools, LMAX Disruptor.
  • Read-Copy-Update (RCU) / Copy-On-Write / snapshot — publish immutable versions via an atomic reference swap; readers are lock-free; old versions are reclaimed by the GC (in Java, the garbage collector subsumes RCU's hard part, safe memory reclamation). See LWN, "What is RCU, Fundamentally?".
  • Seqlock / optimistic reading — read without a lock, then validate a version counter and retry if a writer intervened (StampedLock.tryOptimisticRead). Great for a few fields read very often. Origin: Linux sequence locks.
  • Left-Right — keep two copies of a structure: readers always read one copy wait-free, while the writer updates the other, flips readers over, drains the old readers, then updates the first. Starvation-free writers, no GC dependency. See Ramalhete & Correia, "Left-Right" and this clear write-up.
  • Elimination + backoff — spread contention across multiple slots chosen by a per-thread probe (an "arena"), as in Exchanger and elimination-backoff stacks. See Scherer, Lea & Scott, "A Scalable Elimination-based Exchange Channel".
  • Producer/consumer, ring buffers, pipelines — bounded SPSC/MPSC queues and the Disruptor give back-pressure and low latency.
  • Reactive streams / back-pressure: java.util.concurrent.Flow (Publisher/Subscriber/Processor) and SubmissionPublisher (JDK 9) standardize asynchronous streams with flow control.
  • Safe Memory Reclamation (SMR) — for lock-free structures that recycle nodes off-heap (where the GC can't help), use Epoch-Based Reclamation or hazard pointers. See Hazard Pointers: Safe Memory Reclamation for Lock-Free Objects (Michael, 2004).
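
The copy-on-write / RCU pattern in the list above can be sketched with an AtomicReference holding an immutable map. SnapshotConfig is a hypothetical name, and the sketch assumes writes are rare, since every write copies the whole map.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

final class SnapshotConfig {
    // Always points at an immutable map; replaced as a whole, never mutated.
    private final AtomicReference<Map<String, String>> current =
            new AtomicReference<>(Map.of());

    // Readers: one volatile read, then lock-free use of a stable snapshot.
    Map<String, String> snapshot() {
        return current.get();
    }

    // Writers: copy, modify the copy, publish it with a CAS.
    // updateAndGet re-runs the function if another writer won the race,
    // so the function must be free of side effects.
    void put(String key, String value) {
        current.updateAndGet(old -> {
            Map<String, String> copy = new HashMap<>(old);
            copy.put(key, value);
            return Map.copyOf(copy); // immutable version to publish
        });
    }
}
```

A reader that already holds a snapshot keeps seeing the old version; once no reader references it, the garbage collector reclaims it.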

6. Testing and verification (not optional)

  • JMH (Java Microbenchmark Harness) — reliable performance measurement (forks, warmup, dead-code/constant-folding avoidance).
  • JOL (Java Object Layout) — inspect the real in-memory layout of objects (padding, false sharing).
  • jcstress (Java Concurrency Stress) — tests correctness by exploring thread interleavings; the only serious way to gain confidence that a lock-free algorithm is correct (stress tests and benchmarks do not prove correctness).

7. Decision table — "which one do I use?"

Problem | First choice | Alternatives / notes
Many I/O-bound tasks | Virtual Threads (21) via Executors.newVirtualThreadPerTaskExecutor() + structured concurrency | a bounded platform-thread pool if < 21
CPU-bound divisible work | ForkJoinPool / parallel streams, or a bounded ThreadPoolExecutor | size pools to the core count
Bound a scarce resource/throttle | a bounded ThreadPoolExecutor (or Semaphore) | don't pool virtual threads for this; bound with a semaphore instead
Compose async calls | CompletableFuture | reactive Flow when you need back-pressure
Per-task context (request id, security, txn) | ScopedValue (25) for immutable context; otherwise ThreadLocal | avoid large ThreadLocal values when running millions of virtual threads
Heavily contended counter | LongAdder | AtomicLong if contention is low or you need an exact snapshot
General concurrent map | ConcurrentHashMap | ConcurrentSkipListMap if you need ordering
Read-mostly config/table | Copy-On-Write / RCU (AtomicReference of an immutable value) or CopyOnWriteArrayList | Left-Right or seqlock for specific cases
A few fields read extremely often | StampedLock optimistic read (seqlock) | AtomicReference of an immutable record
Many producers → one consumer | per-writer SPSC decomposition (one ring per producer) | a shared MPSC queue / JCTools if you accept the contention
1↔1 hand-off with a swap | Exchanger | SynchronousQueue for hand-off without a swap
Producer/consumer queue | ArrayBlockingQueue / LinkedBlockingQueue | LinkedTransferQueue for synchronous hand-off
Squeezing the last 10% of latency | VarHandle + padding + off-heap | always measure with JMH, verify with jcstress
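
For the "heavily contended counter" row, a minimal sketch of LongAdder in use. The class name, thread count, and 10-second timeout are assumptions for illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.LongAdder;

final class HitCounterSketch {

    static long countHits(int threads, int hitsPerThread) throws InterruptedException {
        LongAdder hits = new LongAdder();
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            executor.execute(() -> {
                for (int i = 0; i < hitsPerThread; i++) {
                    hits.increment(); // under contention, threads update separate cells
                }
            });
        }
        executor.shutdown();
        if (!executor.awaitTermination(10, TimeUnit.SECONDS)) {
            throw new IllegalStateException("workers did not finish in time");
        }
        // sum() is exact here only because every writer has already finished;
        // while writers are still running it is a moving estimate, not a snapshot.
        return hits.sum();
    }
}
```

If the code needs to read the counter and act on the exact value atomically (for example, issuing unique IDs), AtomicLong is the safer choice, as the table notes.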

8. What is no longer the right choice

  • Vector, Hashtable, Collections.synchronizedXxx — a single global lock; superseded by the concurrent collections above.
  • sun.misc.Unsafe — replaced by VarHandle (memory ordering) and the Foreign Function & Memory API (off-heap).
  • Thread.stop/suspend/resume — deprecated/removed; unsafe by design.
  • finalize() for cleanup — use Arena, java.lang.ref.Cleaner, or try-with-resources.
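
A minimal sketch of the finalize() replacement named in the last bullet: try-with-resources for deterministic release, with java.lang.ref.Cleaner as a safety net. NativeBufferSketch is a hypothetical class, and the AtomicInteger stands in for a real resource so that the release can be observed.

```java
import java.lang.ref.Cleaner;
import java.util.concurrent.atomic.AtomicInteger;

final class NativeBufferSketch implements AutoCloseable {
    private static final Cleaner CLEANER = Cleaner.create();

    // The cleanup action must not reference the owning object,
    // otherwise the owner could never become unreachable.
    private static final class Release implements Runnable {
        private final AtomicInteger releases;

        Release(AtomicInteger releases) {
            this.releases = releases;
        }

        @Override
        public void run() {
            releases.incrementAndGet(); // stand-in for freeing the real resource
        }
    }

    private final Cleaner.Cleanable cleanable;

    NativeBufferSketch(AtomicInteger releases) {
        // Safety net: runs Release if the object is collected without close().
        this.cleanable = CLEANER.register(this, new Release(releases));
    }

    // Deterministic path: clean() runs the action at most once, even if called repeatedly.
    @Override
    public void close() {
        cleanable.clean();
    }
}
```

Typical use is try (NativeBufferSketch buffer = new NativeBufferSketch(counter)) { ... }, so the release happens at the end of the block rather than at some later garbage collection.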

Bibliography (complete, verified)

Every online resource cited in this document, deduplicated and grouped by topic. All URLs were checked.

Specifications & JEPs

Official API documentation & OpenJDK sources

Java Memory Model

Lock-free, queues & mechanical sympathy

Read-mostly: RCU, Left-Right, safe memory reclamation

Elimination/rendezvous (Exchanger family)

Tooling (measure & verify)

Books

  • Brian Goetz et al., Java Concurrency in Practice (Addison-Wesley, 2006) — the standard text on the fundamentals of the Java memory model and java.util.concurrent.
