DEV Community

Samuel Ochaba
Samuel Ochaba

Posted on

Why Your Python Code Takes Hours Instead of Seconds (A 3-Line Fix)

Your code works. It passes every test. Then you run it on real data and go make coffee. Then lunch. Then you start questioning your career choices.

Here's the pattern that's probably killing your performance:

def has_duplicates(items):
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if items[i] == items[j]:
                return True
    return False
Enter fullscreen mode Exit fullscreen mode

This checks for duplicates by comparing every pair. Perfectly logical. Perfectly correct. Perfectly devastating at scale.

The Math That Ruins Your Day

List Size Comparisons Time
100 4,950 Instant
1,000 499,500 Instant
10,000 49,995,000 Noticeable
100,000 ~5 billion Many seconds
1,000,000 ~500 billion Hours

That nested loop means comparisons grow with the square of your data. Double the input, quadruple the time.

The 3-Line Fix

def has_duplicates(items):
    return len(items) != len(set(items))
Enter fullscreen mode Exit fullscreen mode

This version is essentially instant even for a million items.

Why? Converting to a set is O(n) β€” you touch each element once. Checking length is O(1). No nested loops, no quadratic explosion.

The Pattern to Recognize

Whenever you see nested loops over the same collection, alarm bells should ring:

# 🚨 O(nΒ²) β€” danger zone
for item in collection:
    if item in other_list:  # This is a hidden loop!
        ...

# βœ… O(n) β€” convert first
other_set = set(other_list)
for item in collection:
    if item in other_set:  # O(1) lookup
        ...
Enter fullscreen mode Exit fullscreen mode

That innocent if item in other_list is doing a linear scan each time. With a set, it's a hash lookup β€” constant time regardless of size.

The Real-World Impact

This isn't academic. In AI engineering, you're constantly:

  • Deduplicating embeddings
  • Checking if items exist in large collections
  • Processing batches where nested loops sneak in

The difference between O(nΒ²) and O(n) is the difference between "works on my machine" and "works in production."


This is adapted from my upcoming book, Zero to AI Engineer: Python Foundations.

I share excerpts like this on Substack β†’ https://substack.com/@samuelochaba

Top comments (0)