LeetCode #217. Contains Duplicate

Average case: O(n)

Single loop through the array (n iterations)
HashSet contains() and add() operations are O(1) on average
Total: n × O(1) = O(n)

Worst case: O(n²)

If there are many hash collisions, HashSet operations degrade to O(n)
This happens rarely with a good hash function
Total: n × O(n) = O(n²)

class Solution {
    public boolean containsDuplicate(int[] nums) {

        Set<Integer> numSet = new HashSet<>();
        for (int i = 0; i < nums.length; i++) {
            if (numSet.contains(nums[i])) {
               return true; 
            }
            numSet.add(nums[i]);
        }
        return false;
    }
}

hash collisions and why they cause O(n²) worst case:

How HashSet normally works (O(1)):

HashSet uses a hash function to convert your value into an array index
hash("apple") → index 5, store "apple" at bucket[5]
To check if "apple" exists: hash("apple") → index 5, check bucket[5]
Direct access = O(1)

What are hash collisions?
When different values hash to the same index:

hash("apple") → index 5
hash("banana") → index 5 (collision!)
Both values want to go in bucket[5]

How HashSet handles collisions:
HashSet stores colliding elements in a linked list (or tree) at that bucket:

bucket[5] → [apple] → [banana] → [cherry] → ...

Why this causes O(n) operations:

To find "cherry", you must traverse: apple → banana → cherry
If ALL elements hash to the same bucket, you get one long chain
Searching this chain takes O(n) time

Worst case scenario:

All n elements hash to the same bucket (terrible hash function)
Each contains() operation searches the entire chain = O(n)
You do this n times in your loop = O(n × n) = O(n²)

Why it's rare:

Good hash functions distribute values evenly across buckets
Java's HashSet uses sophisticated hashing to minimize collisions
In practice, you almost always get O(1) average performance

DEV Community

LeetCode #217. Contains Duplicate

hash collisions and why they cause O(n²) worst case:

Top comments (0)