realNameHidden

Posted on Nov 4

How Does HashMap Handle Hash Collisions Internally

#java #interview #collection

Learn how HashMap handles hash collisions internally in Java using buckets, linked lists, and trees. Understand rehashing, examples, and best practices.

🧩 Introduction

Imagine a busy post office sorting thousands of letters into pigeonholes based on zip codes. Each pigeonhole represents a unique slot for a region. But what if two letters have the same zip code? They’ll go into the same slot — and that’s okay because the post office knows how to manage them efficiently.

This is exactly how HashMap in Java works internally.

When two different keys produce the same hash value, it’s called a hash collision. Understanding how Java’s HashMap handles these collisions is vital for writing efficient and reliable code, especially if you work with large datasets or performance-critical applications.

In this article, you’ll learn how HashMap manages collisions internally — from linked lists to balanced trees — with simple examples and analogies.

⚙️ Core Concepts

Before diving into collisions, let’s understand the basic structure of a HashMap.

1. How a HashMap Stores Data

A HashMap stores key-value pairs in buckets (think of them as pigeonholes).
When you insert a key-value pair:

The key’s hashCode() method is called.
The hash value is processed using internal logic (bitwise operations).
The resulting index determines which bucket the entry goes into.

If two keys generate the same bucket index, a collision occurs.

2. What is a Hash Collision?

A hash collision happens when two keys produce the same hash and therefore map to the same bucket index.

Example:

"AB".hashCode() == "BA".hashCode() // Hypothetical example

Even though the keys are different, they land in the same bucket — creating a collision.

Collisions are inevitable because the hash space is finite but the number of possible keys is infinite.

3. How HashMap Handles Collisions (Java 8 and Above)

Java’s HashMap uses a combination of Linked Lists and Red-Black Trees to efficiently handle collisions.

🧱 Step 1: Linked List Chaining

When a bucket collision occurs:

The new key-value pair is added to the linked list inside that bucket.
The list stores all entries that share the same hash index.

This approach is called chaining.

However, if too many collisions occur, this list can grow long — degrading performance to O(n) for lookups.

🌳 Step 2: Treeify for Better Performance

To solve this, Java 8 introduced treeification:

If a bucket contains more than 8 entries, the linked list is converted into a Red-Black Tree.
Tree operations like search, insert, and delete then work in O(log n) time.

When the number of entries in a bucket drops below 6, the tree converts back into a linked list — a process called untreeification.

This balance ensures that HashMap remains both fast and memory-efficient.

4. Visual Analogy

Think of buckets like parking lots:

Normally, each car (key) gets its own spot (bucket).
If two cars accidentally get assigned the same spot (collision), they queue up (linked list).
If the queue gets too long, the system builds a multi-level parking structure (tree) to organize them efficiently.

💻 Code Examples (Java 21)

Example 1: Demonstrating Hash Collisions

import java.util.HashMap;

public class HashCollisionExample {
    public static void main(String[] args) {
        HashMap<Key, String> map = new HashMap<>();

        // Two different keys but same hashCode (intentional collision)
        Key key1 = new Key(1);
        Key key2 = new Key(2);

        map.put(key1, "Value for Key1");
        map.put(key2, "Value for Key2");

        // Retrieve both values
        System.out.println(map.get(key1)); // Value for Key1
        System.out.println(map.get(key2)); // Value for Key2
    }
}

// Custom Key class with same hashCode for demonstration
class Key {
    private int id;

    public Key(int id) {
        this.id = id;
    }

    @Override
    public int hashCode() {
        return 100; // Force hash collision
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (!(obj instanceof Key)) return false;
        return this.id == ((Key) obj).id;
    }
}

📝 Explanation:
Even though both keys have the same hash code (collision), the equals() method ensures they’re stored and retrieved correctly in the HashMap. This shows how HashMap internally manages collisions safely.

Example 2: Observing Treeification (Conceptual Demonstration)

import java.util.HashMap;

public class TreeifyExample {
    public static void main(String[] args) {
        HashMap<Integer, String> map = new HashMap<>();

        // Insert multiple elements with the same hash bucket index
        // (This won’t always treeify in small examples but illustrates the concept)
        for (int i = 0; i < 20; i++) {
            map.put(i * 16, "Value" + i);
        }

        System.out.println("Inserted 20 elements with potential hash collisions.");
        System.out.println("HashMap automatically optimizes performance internally!");
    }
}

📝 Explanation:
When multiple entries fall into the same bucket, Java automatically decides whether to maintain a linked list or convert it into a Red-Black Tree based on the threshold — optimizing lookup time.

✅ Best Practices for Working with HashMap Collisions

Always Override hashCode() and equals() Together
If you override one without the other, collisions can cause data inconsistency or retrieval failures.
Use High-Quality Hash Functions
Poorly designed hashCode() methods lead to frequent collisions, slowing down performance.
Avoid Using Mutable Keys
Changing a key’s field after insertion can break retrieval logic because its hash may change.
Monitor Performance in Large Maps
In performance-critical systems, frequent rehashing or collisions can degrade throughput — consider ConcurrentHashMap for multi-threaded scenarios.
Understand Default Thresholds
Knowing when treeification happens (8 entries per bucket) helps you predict memory and performance behavior.

🏁 Conclusion

Understanding how HashMap handles hash collisions internally is key to mastering Java programming at a deeper level.

Hash collisions aren’t bad — they’re expected. What matters is how HashMap handles them efficiently:

First with linked lists (simple chaining)
Then with Red-Black Trees (balanced search structure)

This design ensures that even under heavy data loads, lookups and insertions remain fast and reliable.

So next time you use a HashMap, you’ll know exactly what’s happening behind the scenes — Java is quietly organizing your data like a smart postmaster!

DEV Community