DEV Community

Tapas Pal
Tapas Pal

Posted on • Edited on

How ConcurrentHashMap works Internally

HashMap is non-thread-safe and optimized for single-threaded performance, while ConcurrentHashMap is designed for concurrent access using fine-grained synchronization and CAS operations. HashMap is preferable for local non-shared data, whereas ConcurrentHashMap is ideal for shared mutable state in multi-threaded applications.

What is ConcurrentHashMap?

ConcurrentHashMap is a thread-safe version of HashMap designed for high concurrency.

HashMap Problem

Suppose two threads update a normal HashMap simultaneously:

Map<String, Integer> map = new HashMap<>();
Thread-1: map.put("A", 1);
Thread-2: map.put("B", 2);
Enter fullscreen mode Exit fullscreen mode

Since HashMap is not thread-safe, data corruption or lost updates can occur.

How ConcurrentHashMap Solves It Java 7

Used Segment-based locking.

ConcurrentHashMap
 ├── Segment 0 (Lock)
 ├── Segment 1 (Lock)
 ├── Segment 2 (Lock)
 └── Segment 3 (Lock)
Enter fullscreen mode Exit fullscreen mode

Only the segment being modified gets locked.

Java 8+ (Current Implementation)

Segments were removed.
Uses:

  • CAS (Compare-And-Swap)
  • Synchronized locking on individual buckets when needed
  • Volatile reads
  • Fine-grained locking

This significantly improves performance.
Internal Structure
Imagine:

ConcurrentHashMap<Integer, String> map = new ConcurrentHashMap<>();
Enter fullscreen mode Exit fullscreen mode

Internally:

Index 0 -> Node
Index 1 -> Node
Index 2 -> Node
Index 3 -> Node
...
Enter fullscreen mode Exit fullscreen mode

Each bucket contains a linked list or red-black tree.
Bucket 5
A -> B -> C

Detailed Example
Let's assume:

map.put(1, "Java");
map.put(17, "Spring");
Enter fullscreen mode Exit fullscreen mode

Suppose both keys hash to Bucket 1.

Internal structure:

Bucket 1
1=Java -> 17=Spring
Enter fullscreen mode Exit fullscreen mode

Scenario - Two threads execute simultaneously:

Thread-1:
map.put(33, "Kafka");

Thread-2:
map.put(49, "AWS");
Enter fullscreen mode Exit fullscreen mode

Assume both keys also hash to Bucket 1.
Step 1: Calculate Bucket

Both threads compute:

index = hash(key) & (table.length - 1)
Enter fullscreen mode Exit fullscreen mode

Result: Bucket 1
Both want to modify the same bucket.

Step 2: First Thread Acquires Lock

Thread-1 reaches Bucket 1 first.
Internally:

synchronized(firstNode) {
    // insert new node
}

Enter fullscreen mode Exit fullscreen mode

Bucket:

1=Java -> 17=Spring
Enter fullscreen mode Exit fullscreen mode

Thread-1 locks the first node.
LOCKED by Thread-1

Step 3: Thread-2 Arrives
Thread-2 also wants Bucket 1.

map.put(49, "AWS");
Enter fullscreen mode Exit fullscreen mode

It tries to acquire the same bucket lock.
Bucket 1 locked --> Thread-2 waits.

Step 4: Thread-1 Updates
Thread-1 adds:

1=Java
   ↓
17=Spring
   ↓
33=Kafka
Enter fullscreen mode Exit fullscreen mode

Lock released.

Step 5: Thread-2 Continues

Now Thread-2 acquires the bucket lock.
Current bucket:

1=Java
   ↓
17=Spring
   ↓
33=Kafka
Enter fullscreen mode Exit fullscreen mode

Adds:

1=Java
   ↓
17=Spring
   ↓
33=Kafka
   ↓
49=AWS
Enter fullscreen mode Exit fullscreen mode

Lock released.

Important Observation
Only Bucket 1 was locked.

Other buckets remain available.

Bucket 0 -> Free
Bucket 2 -> Free
Bucket 3 -> Free
Bucket 4 -> Free
Enter fullscreen mode Exit fullscreen mode

If another thread writes to Bucket 4:

map.put(100, "Docker");
Enter fullscreen mode Exit fullscreen mode

it proceeds without waiting.
This is why ConcurrentHashMap performs much better than Hashtable.

CAS Optimization

For an empty bucket:

map.put(10, "Java");
Enter fullscreen mode Exit fullscreen mode

No lock is taken initially.
Instead it uses CAS:
compareAndSwap()

Pseudo-code:

if(bucket is empty) {
    CAS(null, newNode);
}
Enter fullscreen mode Exit fullscreen mode

CAS is a CPU-level atomic operation.

This avoids locking completely for many operations.
Read Operations - Reads usually do not require locking.

map.get(10);
Enter fullscreen mode Exit fullscreen mode

The thread:

  1. Computes hash
  2. Finds bucket
  3. Traverses nodes
  4. Returns value

No lock is acquired in most cases.
That's why reads are extremely fast.
---------------------------------------------------------------

What is the difference between segment-level locking and CAS-level locking?

For ConcurrentHashMap, the difference between segment-level locking (Java 7) and CAS + bucket-level locking (Java 8+) is a very common interview question.

1. Segment-Level Locking (Java 7)

In Java 7, the map was divided into multiple Segments.
Think of a segment as a mini hash table with its own lock.

ConcurrentHashMap

Segment 0 (Lock 0)
 ├─ Bucket 0
 ├─ Bucket 1
 └─ Bucket 2

Segment 1 (Lock 1)
 ├─ Bucket 3
 ├─ Bucket 4
 └─ Bucket 5

Segment 2 (Lock 2)
 ├─ Bucket 6
 ├─ Bucket 7
 └─ Bucket 8
Enter fullscreen mode Exit fullscreen mode

Example

map.put("A", 1);  // Segment 0
map.put("B", 2);  // Segment 0
map.put("C", 3);  // Segment 1
Enter fullscreen mode Exit fullscreen mode

Thread 1

map.put("A", 100);
Enter fullscreen mode Exit fullscreen mode

Locks: Segment 0
Thread 2
map.put("B", 200);
Also needs:
Segment 0
It must wait even if "A" and "B" are in different buckets inside Segment 0.

Thread 1 --> LOCK Segment 0
Thread 2 --> WAITING
Enter fullscreen mode Exit fullscreen mode

Thread 3

map.put("C", 300);
Enter fullscreen mode Exit fullscreen mode

Uses Segment 1.
Thread 3 --> LOCK Segment 1
Can run concurrently.

Limitation - A whole segment gets locked.

Even if there are 100 buckets inside that segment, only one writer can modify anything in that segment at a time.

2. CAS + Bucket-Level Locking (Java 8+)
Java 8 removed Segments completely.
Structure:

Bucket 0
Bucket 1
Bucket 2
Bucket 3
Bucket 4
Enter fullscreen mode Exit fullscreen mode

Each bucket can be independently synchronized when needed.

Case 1: Empty Bucket

Suppose:

map.put("Java", "Spring");
Enter fullscreen mode Exit fullscreen mode

Bucket 5 is empty. Instead of locking:

synchronized(...) {
   insert
}
Enter fullscreen mode Exit fullscreen mode

ConcurrentHashMap uses CAS.
Pseudo-code:

if (bucket == null) {
    CAS(null, newNode);
}
Enter fullscreen mode Exit fullscreen mode

What CAS Means
CAS = Compare And Swap

Expected value = null
New value = Node(Java)
Enter fullscreen mode Exit fullscreen mode

CPU executes atomically:

If current value == null
    insert node
Else
    fail
Enter fullscreen mode Exit fullscreen mode

No lock involved.
Example of CAS
Initial:

Bucket 5 = null
Thread 1
put("Java", value)
Enter fullscreen mode Exit fullscreen mode

CAS:

null -> Node(Java)
Enter fullscreen mode Exit fullscreen mode

Success.

Thread 2
At the same time:

put("Spring", value)
Enter fullscreen mode Exit fullscreen mode

CAS checks:

Expected = null
Actual   = Node(Java)
Enter fullscreen mode Exit fullscreen mode

Fails.
Thread 2 retries.
No lock was used.

Case 2: Bucket Already Has Data

Suppose: Bucket 5
Java -> Kafka

Now:

put("AWS", value)
Enter fullscreen mode Exit fullscreen mode

Needs linked-list modification.
ConcurrentHashMap locks only that bucket.
Internally similar to:

synchronized(firstNode) {
    // update linked list
}
Enter fullscreen mode Exit fullscreen mode

Comparison Example
Assume:

Segment 0
 ├ Bucket A
 ├ Bucket B
 └ Bucket C
Enter fullscreen mode Exit fullscreen mode

Java 7

Thread 1: put(keyA)
locks: Segment 0

Thread 2: put(keyB)
also needs: Segment 0
Enter fullscreen mode Exit fullscreen mode

Must wait.

Thread 1 ---> Running
Thread 2 ---> Waiting
Enter fullscreen mode Exit fullscreen mode

Java 8

Thread 1: put(keyA)
locks only: Bucket A

Thread 2: put(keyB)
locks only: Bucket B

Both proceed simultaneously.

Thread 1 ---> Running
Thread 2 ---> Running
Enter fullscreen mode Exit fullscreen mode

Much higher concurrency.

Top comments (0)