DEV Community

Tomer Ben David
Tomer Ben David

Posted on

How to Actually use Python's heapq for Kth Largest Problems

If you're using Python for coding interviews, heapq is your best choice for priority queues. But it has a massive quirk that trips up almost everyone. It only supports min heaps.

If you try to use heapq.heapify_max(), your code will crash on most platforms (it's not fully public until Python 3.14).

So, how do you find the Kth largest element if you only have a min heap?

There is a brute force way, and there is the way interviewers actually want to see.

Brute force with negation

Since heapq always puts the smallest element at index 0, you can fake a max heap by making all your numbers negative. The largest positive number becomes the smallest negative number.

import heapq

nums = [3, 2, 1, 5, 6, 4]
max_heap = [-x for x in nums]
heapq.heapify(max_heap)

# The root is now -6
largest = -max_heap[0] 
Enter fullscreen mode Exit fullscreen mode

This works fine for small arrays. But if an interviewer asks you to get the top 100 values from a stream of a billion numbers, storing every single number in memory is extremely inefficient. You need a better strategy.

The efficient Min heap strategy

Instead of putting all the numbers into a max heap, put exactly K numbers into a min heap.

Think of it like keeping a running "Top 10" list. The root of a min heap (heap[0]) is always the smallest element. If your heap is exactly size K, the root is the smallest of your top K numbers.

As you stream through the rest of the data, if you see a new number that is bigger than your root, it belongs in the Top K. You kick the root out, and put the new number in.

First, you start by creating a heap with only the first K elements.

import heapq

def find_kth_largest(nums: list[int], k: int) -> int:
    # Start our list with the first K elements
    heap = nums[:k]
    heapq.heapify(heap) 
Enter fullscreen mode Exit fullscreen mode

Then you iterate through the remaining numbers. If a new number is larger than the root of our heap, it means the root is no longer in the Top K. You replace it.

    # Go through the rest of the numbers
    for i in range(k, len(nums)):
        if nums[i] > heap[0]:
            heapq.heapreplace(heap, nums[i])
Enter fullscreen mode Exit fullscreen mode

Finally, the root of your heap will be the Kth largest element overall.

    return heap[0]
Enter fullscreen mode Exit fullscreen mode

Why Interviewers Care

This exact pattern solves the massive streaming data problem perfectly.

Because you only ever store K elements at a time, your Space Complexity is O(K). It takes virtually zero memory.

Your Time Complexity is O(N log K). You look at every number once (N), and occasionally do a heap replacement operation that takes logarithmic time based on the small size of K.

So next time you are asked for the K largest items, do not reach for a max heap. Use a min heap, cap it at size K, and only let the big numbers in.

Top comments (0)