Java Collections: The Two Features Nobody Talks About (But You Should)
Quick context (why I'm writing this)
I was pairing with a junior dev last week who swore that LinkedList was the go‑to for any queue‑like workload because “it’s a doubly linked list, so inserts are O(1)”. We ran a benchmark, and to his surprise ArrayDeque smoked it by a factor of ten. He stared at the screen, muttered “how is that even possible?” and spent the next hour digging into the Javadoc. That moment reminded me how easy it is to treat the Collections API as a grab‑bag of interchangeable tools, when in fact a few of its members hide surprising implementation details that can make or break performance and correctness.
The Insight
Two features that fly under the radar for most Java developers are:
- EnumSet: a bit‑set backed, zero‑overhead set for enum types
- CopyOnWriteArrayList: a thread‑safe list whose iterators never throw ConcurrentModificationException, but whose write operations are surprisingly costly
Both are textbook examples of “you think you know what you’re getting, but the reality is nuanced”. Miss the nuance and you either waste CPU cycles or introduce subtle bugs that only surface under load.
Why EnumSet matters
If you ever need a set of enum constants (think state machines, permission flags, or UI options), reaching for HashSet<SomeEnum> feels natural. But HashSet is backed by a HashMap, so every element costs an entry object, a hash computation, and possible collision handling. EnumSet sidesteps all of that by backing the set with a single primitive long (or a long[] for enums with more than 64 constants), where each bit corresponds to one constant's ordinal. The result? add, remove, and contains are simple bit operations, no object is allocated per element, and memory usage is often an order of magnitude lower.
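To make the bit‑per‑constant idea concrete, here is a minimal hand‑rolled sketch of the same technique. It is not the JDK's code; BitMaskSketch and its helper methods are hypothetical names used only for illustration, and the enum mirrors the Permission example used later in this post.

```java
import java.util.EnumSet;

// Illustrative only: a hand-rolled bit mask that mimics the idea behind EnumSet.
// BitMaskSketch and its helpers are hypothetical names, not JDK classes.
class BitMaskSketch {
    enum Permission { READ, WRITE, EXECUTE, DELETE }

    // Each constant owns the bit at position ordinal().
    static long add(long mask, Permission p) {
        return mask | (1L << p.ordinal());
    }

    static long remove(long mask, Permission p) {
        return mask & ~(1L << p.ordinal());
    }

    static boolean contains(long mask, Permission p) {
        return (mask & (1L << p.ordinal())) != 0;
    }

    public static void main(String[] args) {
        long mask = 0L;
        mask = add(mask, Permission.READ);
        mask = add(mask, Permission.EXECUTE);

        // The same membership expressed with the real EnumSet API.
        EnumSet<Permission> set = EnumSet.of(Permission.READ, Permission.EXECUTE);

        // Both representations should agree for every constant.
        for (Permission p : Permission.values()) {
            System.out.println(p + ": mask=" + contains(mask, p) + ", EnumSet=" + set.contains(p));
        }
    }
}
```

In real code you would use EnumSet directly; the point of the sketch is only that membership tests reduce to a shift and a mask, with no per‑element objects.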
Why CopyOnWriteArrayList matters
Concurrent modification exceptions are a rite of passage when you first start mutating a list while iterating over it. The usual fixes are to iterate over a copy (new ArrayList<>(original)) or to wrap the list with Collections.synchronizedList and hold its lock for the whole iteration (the wrapper alone does not make iteration safe). The first pays for a full copy on every iteration; the second introduces lock contention. CopyOnWriteArrayList offers a third way: its iterator works on a snapshot of the array taken at the moment the iterator was created, so you can safely iterate while other threads modify the list. The iterator never sees changes made after its creation, and it does not support remove(). The catch? Every write operation copies the entire underlying array, which is O(n) and allocates a new array each time. If writes are frequent, you’ll pay a heavy price; if reads vastly outnumber writes, it’s a win.
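The snapshot behaviour is easy to see even without threads. The sketch below (SnapshotIterationSketch and visitWhileAppending are hypothetical names) appends to a list in the middle of a for‑each loop: ArrayList's fail‑fast iterator throws, while CopyOnWriteArrayList keeps iterating over the array it started with.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Illustrative only: single-threaded demonstration of fail-fast vs. snapshot iteration.
class SnapshotIterationSketch {

    // Iterates the list and appends one element during the first pass of the loop.
    // Returns how many elements the loop visited, or -1 if the iterator failed fast.
    static int visitWhileAppending(List<String> list) {
        int visited = 0;
        try {
            for (String s : list) {
                if (visited == 0) {
                    list.add("added-during-iteration");
                }
                visited++;
            }
        } catch (ConcurrentModificationException e) {
            return -1;
        }
        return visited;
    }

    public static void main(String[] args) {
        List<String> plain = new ArrayList<>(List.of("a", "b", "c"));
        List<String> cow = new CopyOnWriteArrayList<>(List.of("a", "b", "c"));

        // ArrayList detects the structural change on the next call to next().
        System.out.println("ArrayList visited: " + visitWhileAppending(plain));

        // CopyOnWriteArrayList iterates the original three elements; the new one
        // is in the list afterwards but was never seen by the running iterator.
        System.out.println("CopyOnWriteArrayList visited: " + visitWhileAppending(cow)
                + ", size afterwards: " + cow.size());
    }
}
```

The same property is what makes iteration safe when the modification comes from another thread instead of from inside the loop.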
How (with code)
EnumSet – the gotcha
import java.util.EnumSet;

public class PermissionDemo {
    enum Permission { READ, WRITE, EXECUTE, DELETE }

    public static void main(String[] args) {
        // ❌ Common mistake: using HashSet
        // HashSet<Permission> perms = new HashSet<>();
        // perms.add(Permission.READ);
        // perms.add(Permission.WRITE);

        // ✅ Better: EnumSet gives you a compact, fast set
        EnumSet<Permission> perms = EnumSet.noneOf(Permission.class);
        perms.add(Permission.READ);
        perms.add(Permission.WRITE);

        // Usage is identical to any Set
        boolean canWrite = perms.contains(Permission.WRITE); // O(1) bit test, no extra objects
        System.out.println("canWrite = " + canWrite);
    }
}
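What you might miss: EnumSet.of(Permission.READ, Permission.WRITE) returns an EnumSet<Permission>, and perms.addAll(other) only compiles if other is a Collection<? extends Permission>; pass in a HashSet<String> by accident and you get a compile‑time error, which is exactly what you want. The declared type of the reference does not change the implementation: an EnumSet assigned to a Set<Permission> variable is still bit‑set backed. What does lose the specialized implementation is copying. new HashSet<>(perms) or stream().collect(Collectors.toSet()) hands you a general‑purpose set (toSet() makes no promise about the type it returns), and nothing converts it back automatically. Keep the value as an EnumSet end to end, and when you must rebuild one, use EnumSet.copyOf(...) or Collectors.toCollection(() -> EnumSet.noneOf(Permission.class)). One more trap: EnumSet.copyOf(collection) throws IllegalArgumentException when given an empty collection that is not itself an EnumSet, because it cannot infer the enum type.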
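If you are unsure which implementation a reference actually points at, an instanceof check settles it. The following is a small sketch under the assumption that you want to keep an EnumSet through a copy or a stream pipeline; EnumSetCopySketch, copyToHashSet, and withoutDelete are hypothetical names.

```java
import java.util.EnumSet;
import java.util.HashSet;
import java.util.Set;
import java.util.stream.Collectors;

// Illustrative only: shows where the EnumSet implementation survives and where it is lost.
class EnumSetCopySketch {
    enum Permission { READ, WRITE, EXECUTE, DELETE }

    // Copying into a general-purpose set: the result is a HashSet, not an EnumSet.
    static Set<Permission> copyToHashSet(Set<Permission> source) {
        return new HashSet<>(source);
    }

    // Rebuilding an EnumSet from a stream by supplying the target collection explicitly.
    static Set<Permission> withoutDelete(Set<Permission> source) {
        return source.stream()
                .filter(p -> p != Permission.DELETE)
                .collect(Collectors.toCollection(() -> EnumSet.noneOf(Permission.class)));
    }

    public static void main(String[] args) {
        // Declared as Set<Permission>, but the object is still an EnumSet.
        Set<Permission> perms = EnumSet.of(Permission.READ, Permission.WRITE, Permission.DELETE);

        System.out.println("declared as Set:    " + (perms instanceof EnumSet));
        System.out.println("copied to HashSet:  " + (copyToHashSet(perms) instanceof EnumSet));
        System.out.println("collected via toCollection: " + (withoutDelete(perms) instanceof EnumSet));
    }
}
```

The design choice in withoutDelete is to name the target collection explicitly, so the result does not depend on whatever set type a default collector happens to produce.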
CopyOnWriteArrayList – the gotcha
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class EventProcessor {
    // Placeholder types so the example compiles; substitute your own.
    public static class Event {}
    public interface Listener { void onEvent(Event e); }

    // List of listeners that can be added/removed at runtime
    private final List<Listener> listeners = new CopyOnWriteArrayList<>();

    public void register(Listener l) {
        listeners.add(l); // O(n) copy under the hood
    }

    public void deregister(Listener l) {
        listeners.remove(l); // also O(n) copy
    }

    public void fireEvent(Event e) {
        // Iterators are safe even if listeners change concurrently
        for (Listener l : listeners) {
            l.onEvent(e); // No ConcurrentModificationException
        }
    }
}
What you might miss: The add and remove methods look cheap, but each call creates a brand‑new Object[] containing all current listeners. If you have a high‑throughput scenario where listeners are registered and deregistered thousands of times per second (think a Netty pipeline or a game engine), you’ll see GC pressure spike. The fix? Either batch changes (collect them, then apply once) or switch to a concurrent structure like ConcurrentLinkedQueue if you only need lock‑free adds/removes and can tolerate weakly consistent iteration.
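One way the "batch changes" idea could look is sketched below. This is an assumption‑laden illustration, not a drop‑in replacement: BatchedListenerRegistry is a hypothetical class, Runnable stands in for the Listener type, and registrations only become visible once flush() is called. It relies on addAll and removeAll each replacing the backing array at most once per call, rather than once per element, which is how current OpenJDK behaves as far as I know.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Illustrative only: queue listener changes and apply them in bulk.
// Runnable is a stand-in for a real listener interface.
class BatchedListenerRegistry {
    private final CopyOnWriteArrayList<Runnable> listeners = new CopyOnWriteArrayList<>();

    private final Object pendingLock = new Object();
    private List<Runnable> pendingAdds = new ArrayList<>();
    private List<Runnable> pendingRemoves = new ArrayList<>();

    // Cheap: appends to a small pending list instead of copying the listener array.
    void register(Runnable l) {
        synchronized (pendingLock) { pendingAdds.add(l); }
    }

    void deregister(Runnable l) {
        synchronized (pendingLock) { pendingRemoves.add(l); }
    }

    // Applies everything queued so far with one bulk add and one bulk remove.
    void flush() {
        List<Runnable> adds;
        List<Runnable> removes;
        synchronized (pendingLock) {
            adds = pendingAdds;
            removes = pendingRemoves;
            pendingAdds = new ArrayList<>();
            pendingRemoves = new ArrayList<>();
        }
        if (!adds.isEmpty()) listeners.addAll(adds);
        if (!removes.isEmpty()) listeners.removeAll(removes);
    }

    // Iterates a snapshot, exactly like the unbatched version.
    void fire() {
        for (Runnable l : listeners) l.run();
    }

    int size() {
        return listeners.size();
    }

    public static void main(String[] args) {
        BatchedListenerRegistry registry = new BatchedListenerRegistry();
        registry.register(() -> System.out.println("listener A"));
        registry.register(() -> System.out.println("listener B"));
        System.out.println("before flush: " + registry.size());
        registry.flush();
        System.out.println("after flush: " + registry.size());
        registry.fire();
    }
}
```

Two trade‑offs to keep in mind: a listener registered between flushes misses events fired in that window, and removeAll removes every occurrence of a listener, whereas remove(Object) removes only the first.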
Quick performance sketch (just to feel the difference)
import java.util.ArrayDeque;
import java.util.LinkedList;
import java.util.concurrent.TimeUnit;

public class QueueBench {
    static final int ITER = 1_000_000;

    // Enqueue ITER boxed integers, then drain the queue.
    static void bench(ArrayDeque<Integer> q) {
        for (int i = 0; i < ITER; i++) q.add(i);
        for (int i = 0; i < ITER; i++) q.poll();
    }

    static void bench(LinkedList<Integer> q) {
        for (int i = 0; i < ITER; i++) q.add(i);
        for (int i = 0; i < ITER; i++) q.poll();
    }

    public static void main(String[] args) {
        long t0 = System.nanoTime();
        bench(new ArrayDeque<>());
        System.out.println("ArrayDeque: " + TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - t0) + " ms");

        t0 = System.nanoTime();
        bench(new LinkedList<>());
        System.out.println("LinkedList: " + TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - t0) + " ms");
    }
}
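On my laptop the ArrayDeque version runs in ~30 ms while the LinkedList version takes ~260 ms. The gap is not algorithmic, since both structures offer O(1) add and poll; it comes from LinkedList allocating a node object per element and from its poor cache locality. Treat the numbers as a rough feel only: this is a single run with no JIT warm‑up, so use a harness such as JMH before drawing firm conclusions.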
Why This Matters
Understanding these quirks makes you a better coder in two concrete ways:
- You stop over‑engineering. When you know EnumSet exists, you won’t waste time writing a custom bit‑mask enum utility or worrying about the overhead of HashSet<Enum>. You reach for the right tool instantly.
- You make informed trade‑offs. With CopyOnWriteArrayList you can decide whether its read‑heavy, write‑rare profile matches your use case. If it doesn’t, you’ll look at ConcurrentHashMap keyed by listener IDs or a lock‑free queue instead of blindly copying the list and hoping for the best.
These aren’t academic curiosities; they’re the sort of details that show up in production profiling sessions, code review comments, and interview questions. Spotting them early saves you hours of debugging and gives you credibility when you explain why a certain collection was chosen.
Challenge
Take a look at your current codebase: find a place where you’re using HashSet<SomeEnum> or ArrayList<Thread> for a read‑mostly, write‑occasionally scenario. Replace it with EnumSet or CopyOnWriteArrayList (or decide why you shouldn’t). Then run a quick micro‑benchmark or, better yet, observe GC logs before and after. Drop your findings in the comments – I’m curious to see where these hidden gems make the biggest impact for you.