Introduction
- This post covers the binary heap: a complete binary tree stored in an array, where each parent follows a priority rule (min-heap or max-heap). It explains the heap property, the core operations `push` (`heapifyUp`) and `pop` (`heapifyDown`), and provides an implementation.
- It also covers trade-offs of using heaps as priority queues, real use in greedy algorithms such as Dijkstra's algorithm and Huffman coding, O(n) in-place heap construction, heapsort, and a discussion of introsort.
Binary Heap
- A binary heap is a special case of a heap with the following properties:
  - It is a complete binary tree (every level is full except the last, which is filled from left to right).
  - Being a complete binary tree implies that, if a node is at index `i`:
    - Left child = `2*i + 1`
    - Right child = `2*i + 2`
    - Parent = `floor((i - 1) / 2)` (if not root)
  - Heap property (what differentiates heaps):
    - The parent has higher priority than its children.
    - Example: min-heap (highest priority = smallest value): parent ≤ both children.
- As you can see, the parent always has the highest priority.
- This implies that the root (first node) always has the highest priority.
K-ary Heap
- We could have a k-ary heap; the properties stay the same, the only changes are that, if a node is at index `i`:
  - first child = `k*i + 1`
  - second child = `k*i + 2`
  - ...
  - k-th child = `k*i + k`
  - Parent = `floor((i - 1) / k)` (if not root)
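As a quick sanity check, the index formulas can be sketched as tiny helper functions (the names `child` and `parent` here are illustrative, not from the post's repository):

```javascript
// Index arithmetic for a 0-indexed k-ary heap stored in an array.
// For k = 2 these reduce to the binary-heap formulas above.
const child = (i, k, j) => k * i + j;             // j-th child, 1 <= j <= k
const parent = (i, k) => Math.floor((i - 1) / k); // only valid when i > 0

// binary heap (k = 2): node at index 1
console.log(child(1, 2, 1));  // left child  -> 3
console.log(child(1, 2, 2));  // right child -> 4
console.log(parent(4, 2));    // parent of index 4 -> 1
// 3-ary heap: parent of index 7 -> Math.floor(6 / 3) = 2
console.log(parent(7, 3));
```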
Pseudocode for Binary Heap → MinHeap
- We save heap items into an array: `items = []`
- 1. `PUSH` inserts new nodes:
  - a. Insert at the end of the array: `items.push(newItem)`, `addedItemIndex = i`
  - b. Now if we have more than 1 element we have to check property validity:
    - i. parent ≤ child
    - ii. if true → valid, do nothing
    - iii. if false → invalid, `swap(parent, child)`
    - iv. repeat until reaching a valid heap state.
- We generally call this `b` step of fixing the tree when inserting `heapifyUp`.
  - `heapifyUp` because we start from the bottom (after adding at the end of the heap) and fix the heap bottom-to-top by comparing the node with its parent.
- 2. `POP` gets the root (highest priority element):
  - a. Remove the root node and put the last added item in its place.
  - b. Now if we have more than 1 element we have to check property validity:
    - i. parent ≤ both children
    - ii. if true → valid, do nothing
    - iii. if false → invalid, `swap(parent, smallestChild)`
    - iv. repeat until reaching a valid heap state.
- Here we call it `heapifyDown` since we start at the root and go down, comparing the parent with its children.
Binary Heap Code
- Huffman Code in C - contains a min-heap implementation within it
- JS implementation below: Github repository
```javascript
class BinaryHeap {
  #items = [];
  #hasHigherPriorityFn;

  constructor({ items, hasHigherPriorityFn }) {
    if (!(items && items.length)) { throw Error("'items' field is required") }
    // no types here in JS, so we trust hasHigherPriorityFn function and items array -> the focus is binary heap
    if (!hasHigherPriorityFn || typeof hasHigherPriorityFn !== "function") { throw Error("'hasHigherPriorityFn' function is required") }
    this.#hasHigherPriorityFn = hasHigherPriorityFn;
    this.#items = items;
    this.#buildHeapInPlace(this.#items);
  }

  #swap(addedIndex, parentIndex) {
    const temp = this.#items[addedIndex];
    this.#items[addedIndex] = this.#items[parentIndex];
    this.#items[parentIndex] = temp;
  }

  // time: O(log n), space O(log n) -> recursion, iterative is O(1) space
  #heapifyUp({ insertedItemCurrIndex }) {
    const parentIndex = Math.floor((insertedItemCurrIndex - 1) / 2);
    if (parentIndex < 0 || insertedItemCurrIndex == parentIndex) {
      return;
    }
    const insertedItem = this.#items[insertedItemCurrIndex];
    const shouldSwap = this.#hasHigherPriorityFn(insertedItem, this.#items[parentIndex]);
    if (shouldSwap) {
      this.#swap(insertedItemCurrIndex, parentIndex);
      this.#heapifyUp({ insertedItemCurrIndex: parentIndex });
    }
  }

  // time: O(log n), space O(log n) -> recursion, iterative is O(1) space
  #heapifyDown({ currItemPosition }) {
    let mostPriorityItemIndex = currItemPosition;
    const leftChildIndex = 2 * currItemPosition + 1;
    const rightChildIndex = 2 * currItemPosition + 2;
    const leftChildExists = leftChildIndex < this.#items.length;
    if (leftChildExists && this.#hasHigherPriorityFn(this.#items[leftChildIndex], this.#items[mostPriorityItemIndex])) {
      mostPriorityItemIndex = leftChildIndex;
    }
    const rightChildExists = rightChildIndex < this.#items.length;
    if (rightChildExists && this.#hasHigherPriorityFn(this.#items[rightChildIndex], this.#items[mostPriorityItemIndex])) {
      mostPriorityItemIndex = rightChildIndex;
    }
    // currItem = parent (start at index 0)
    // if parent already has the highest priority, do nothing
    // otherwise swap with highest priority child until heap is valid again
    if (mostPriorityItemIndex != currItemPosition) {
      this.#swap(currItemPosition, mostPriorityItemIndex);
      this.#heapifyDown({ currItemPosition: mostPriorityItemIndex });
    }
  }

  // O(n) because most heapifyDown calls run on small-height nodes.
  #buildHeapInPlace(arr) {
    const arrSize = arr.length;
    for (let i = Math.floor(arrSize / 2) - 1; i >= 0; i--) {
      this.#heapifyDown({ currItemPosition: i });
    }
  }

  // O(log n)
  push({ item }) {
    this.#items.push(item);
    if (this.#items.length >= 2) {
      this.#heapifyUp({ insertedItemCurrIndex: this.#items.length - 1 });
    }
  }

  // O(log n)
  pop() {
    const currLength = this.#items.length;
    if (currLength == 0) { return null; }
    const itemToReturn = this.#items[0];
    this.#items[0] = this.#items[currLength - 1]; // last item is new root
    this.#items.pop(); // remove last item from array
    if (currLength > 1) {
      this.#heapifyDown({ currItemPosition: 0 }); // fix top down
    }
    return itemToReturn;
  }

  // O(1)
  peek() {
    if (!this.#items.length) { return null; }
    return this.#items[0];
  }

  printItems() {
    console.log('this.#items', this.#items);
  }

  consumeAll(callback) {
    console.log(this.constructor.name, 'consumeAll:')
    console.log('this.#hasHigherPriorityFn.toString()', this.#hasHigherPriorityFn.toString())
    while (this.#items.length) {
      callback(this.pop());
    }
  }
}

// smallest item at root index = 0
class MinHeap extends BinaryHeap {
  constructor({ items }) {
    super({ items, hasHigherPriorityFn: (itemA, itemB) => itemA < itemB });
  }
}

// biggest item at root index = 0
class MaxHeap extends BinaryHeap {
  constructor({ items }) {
    super({ items, hasHigherPriorityFn: (itemA, itemB) => itemA > itemB });
  }
}

console.log('--------------------------------------------------------')
const minHeapTest = new MinHeap({ items: [8, 7, 6, 5, 4, 3, 2, 1] });
minHeapTest.printItems();
minHeapTest.consumeAll(console.log)
console.log('--------------------------------------------------------')
const maxHeap = new MaxHeap({ items: [1, 2, 3, 4, 5, 6, 7, 8] });
maxHeap.printItems();
maxHeap.consumeAll(console.log)
console.log('--------------------------------------------------------')
const people = [{ name: "João", age: 22 }, { name: "Turing", age: 41 }, { name: "Ada Lovelace", age: 36 }]
const elderlyPeopleFirstHeap = new BinaryHeap({ items: people, hasHigherPriorityFn: (a, b) => a.age > b.age });
elderlyPeopleFirstHeap.consumeAll(console.log)
console.log('--------------------------------------------------------')
```
Time Complexity
| Operation | Time complexity |
|---|---|
| Push | O(log n) |
| Peek | O(1) |
| Pop | O(log n) |
- `O(log n)` → because a complete binary tree contains `2^height ≈ n` nodes, so `height = log n`.
  - Every fix step moves one level, so in the worst case a node travels from the bottom (height = 0) to the top (height = log n), or the opposite direction, which is `O(log n)`.
Use cases and trade-offs
- Commonly used to implement priority queues. Priority queues can also be implemented using other data structures such as arrays.
- But with a sorted array, for example, you would have `O(1)` pop and `O(n)` insert, since inserting a new item in the correct position may require shifting all items in the worst case.
- To implement a priority queue:
  - Use a sorted array if you will set up all inserts at once and then only perform reads.
  - Use a binary heap or k-ary heap if you require faster `O(log n)` inserts.
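To make the trade-off concrete, here is a minimal sketch (illustrative, not from the post's repository) of a min-priority queue backed by a sorted array; note how `insert` has to shift elements:

```javascript
// Min-priority queue backed by an array kept sorted in descending order,
// so the smallest item sits at the end and pop() is O(1).
class SortedArrayPQ {
  #items = [];
  insert(item) {
    // O(n): linear scan for the insertion point...
    let i = 0;
    while (i < this.#items.length && this.#items[i] > item) i++;
    // ...and splice shifts every element after it
    this.#items.splice(i, 0, item);
  }
  pop() {
    return this.#items.pop(); // O(1): smallest item is last
  }
}

const pq = new SortedArrayPQ();
[5, 1, 3].forEach((x) => pq.insert(x));
console.log(pq.pop()); // 1
console.log(pq.pop()); // 3
console.log(pq.pop()); // 5
```

A heap flips this balance: both `push` and `pop` cost `O(log n)`, which wins when inserts and reads are interleaved.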
- Widely used in greedy algorithms that require extracting the highest priority item at each step from a priority queue:
- Huffman coding (greedy + min-heap) - C code here
- Dijkstra’s algorithm (greedy + min-heap) - C++ code here, developed by Matheus Persch.
- And more…
In-place O(n) max-heap construction instead of O(n * log n)
```javascript
// theory: all items after n/2 are leaves: A[n/2 + 1, ..., n]
// practice: as indexes start at 0, n/2 is already a leaf,
// so we start at (n/2) - 1 and go down to 0
// --> from height 1 until the root
function buildHeapInPlace(arr) {
  const arrSize = arr.length;
  for (let i = Math.floor(arrSize / 2) - 1; i >= 0; i--) {
    heapifyDown({ items: arr, currItemPosition: i });
  }
}
```
- **How do we know that items after n/2 are leaves?**
  - 1. We know that leaves have no children:
    - a. left child = `2*i + 1`; right child = `2*i + 2`
  - 2. If node `i` has no children (using 1-indexed math), its left child `2*i` is outside the array bound `n`: `2*i > n`.
    - a. And then we find `i > n/2`, so all items `A[(n/2)+1, ..., n]` are leaves.
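The same claim can be checked numerically with the 0-indexed formulas (a small illustrative snippet):

```javascript
// In a 0-indexed array of size n, node i is a leaf exactly when its
// left child index 2*i + 1 falls outside the array.
const isLeaf = (i, n) => 2 * i + 1 >= n;

const n = 10;
const firstLeaf = Math.floor(n / 2); // 5: first index with no children
for (let i = 0; i < n; i++) {
  console.log(i, isLeaf(i, n)); // true exactly for i >= firstLeaf
}
```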
- **Why do we start from `height = 1` and not from the leaves at `height = 0`?**
  - index `n/2`, height 1 = first non-leaf node
  - Because leaves (h = 0) are already valid heaps (they don't require a fix), and heapifyDown only works to fix a parent whose children are valid heaps.
  - In a heap, each operation fixes one node at a time.
- **Why do we go until `height = log n`?**
  - `height = 1` → first non-leaf node
  - `height = log n` → root
  - We fix the heap bottom-up.
  - When we reach `height = log n`, we have fixed the root, so the entire tree is a valid heap.
- **Why is it O(n) and not O(n * log n)?**
  - Both heapifyDown and heapifyUp have complexity `O(log n)`:
    - `heapifyDown(node)`: max number of required movements = distance from the node to the bottom of the tree (expensive for nodes near the top)
    - `heapifyUp(node)`: max number of required movements = distance from the node to the top of the tree (expensive for nodes near the bottom)
  - **Given an array, what would you prefer? Start building a heap from the top or from the bottom?**
    - Starting at the top: use `heapifyUp()` on all nodes
    - Starting at the bottom: use `heapifyDown()` on all nodes
- **Building the sum for each approach to analyze it better**
  - At height 0 we have n/2 nodes, at height 1 we have n/4, and so on...
  - moves per level = (nodes at that height) * (max movements each node can do)
  - Starting at the top with `heapifyUp()`, with tree height `h = log n`, the sum of movements/operations would be: `(n/2 * h) + (n/4 * (h-1)) + (n/8 * (h-2)) + ... + (1 * 0)`
  - But starting at the bottom with `heapifyDown()`, the sum of operations is: `(n/2 * 0) + (n/4 * 1) + (n/8 * 2) + ... + (1 * h)`
  - Compare the sums term by term and you will notice, for example, that `n/2 * log n > n/2 * 0`; starting at the top makes many more movements at each level.
- **But we still haven't answered: why does starting at the bottom make it O(n)?**
  - If you have at least basic calculus knowledge it will be easier, but I'll not give a formal demonstration, only an intuitive explanation.
  - With `k` = current level (distance from the bottom), level `k` has `n / 2^(k+1)` nodes, so the bottom-up sum can be written as `n * Σ (k / 2^(k+1))`, and `Σ (k / 2^(k+1))` converges to the constant 1.
  - The sum doesn't depend on `n`; it converges to a constant.
  - So it would be `n * O(1)` → `O(n)`.
- In summary, many nodes do small work and few nodes do large work. The balance gives a linear total `O(n)`:
  - leaf nodes = 0 moves
  - level 1 (above the leaves) = at most 1 move
  - ...
  - root = at most log n moves
- For more: watch starting at 30:00 - MIT OpenCourseWare - Lecture 4: Heaps and Heap Sort (Srini Devadas, 2011)
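A quick numeric sketch (not a proof, and not from the post's repository) makes the difference visible: on a perfect tree of height h, total bottom-up moves stay below n, while total top-down moves grow like n * log n:

```javascript
// Compare total worst-case moves for the two build strategies on a
// perfect binary tree with n = 2^(h+1) - 1 nodes.
// k = distance from the bottom level; that level has 2^(h-k) nodes.
function buildCosts(h) {
  let down = 0; // heapifyDown on every node: each moves at most k levels
  let up = 0;   // heapifyUp on every node: each moves at most h - k levels
  for (let k = 0; k <= h; k++) {
    const nodes = 2 ** (h - k);
    down += nodes * k;
    up += nodes * (h - k);
  }
  return { n: 2 ** (h + 1) - 1, down, up };
}

for (const h of [5, 10, 15]) {
  const { n, down, up } = buildCosts(h);
  console.log({ h, n, downPerNode: down / n, upPerNode: up / n });
}
// down / n stays below 1 (the series sum of k / 2^(k+1) converges),
// while up / n grows roughly like h = log n.
```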
Use case for build heap in-place
- If you already have an array, it is better to build the heap in-place using bottom-up heap construction than inserting items one by one with `heapifyUp` on each push.

```javascript
buildInPlaceMaxHeap(array); // O(n)

// O(n * log n)
for (const item of array) {     // O(n)
  existentMaxHeap.insert(item); // O(log n)
}

// When to use `for (item of arr) existentMaxHeap.insert(item)`?
// 1. you don't receive the entire array at once, just a stream of single items
// 2. you already have a big existing heap:
//    say your heap contains 10k items and you receive 10 new items at once;
//    it isn't a good idea to rebuild the entire heap
```
Heapsort
- I hope you can read that. Good luck.
- Just kidding, I'll write it down here:
- 1. Build a MaxHeap in `O(n)` in place using an array.
  - a. The heap area goes from `0` to `n-1` (the entire array is a heap).
- 2. Swap the ROOT and the last item, then decrease the heap area by one.
  - a. The root was the biggest element (inside the heap area).
  - b. Now the biggest element is at the end (its correct place in the sorted array).
- 3. Heapify down to fix the ROOT.
- 4. `if (heapArea > 1) { return to step 2 } else { HALT }`
  - a. In the handwritten note I put `heapArea > 0`, but if only one element remains, it is already the smallest one. (The paper is quite wrong.)
- I really suggest you to watch the last minutes starting at 48:00 - MIT OpenCourseWare - Lecture 4: Heaps and Heap Sort (2011)
Heapsort code
- Time complexity: `O(n * log n)`
- Space complexity: `O(1)` → iterative version
- Github repository
```javascript
// O(n * log n)
function heapsort(items) {
  buildHeapInPlace(items); // O(n)
  let heapAreaEnd = items.length - 1;
  // while runs n-1 times = O(n)
  // note: the condition must be > 0, not > 1, otherwise the
  // first two elements are left unsorted
  while (heapAreaEnd > 0) {
    swap(items, 0, heapAreaEnd);
    --heapAreaEnd;
    // O(log n)
    // heapifyDown receives lastHeapIndex, to change only the heap area
    iterativeHeapifyDown({ items, currItemPosition: 0, lastHeapIndex: heapAreaEnd });
  }
  // no return needed since heapsort is in place
}

const arr = [1, 2, 9, 4, 5, 6];
console.log('before:', arr); // [ 1, 2, 9, 4, 5, 6 ]
heapsort(arr);
console.log('after:', arr); // [ 1, 2, 4, 5, 6, 9 ]

// simplifying: get the biggest item (root) and put it at the end of the heap area
// now the last item is in its correct position
// --> fix the new root that breaks the heap property with heapifyDown()
// repeat until all items are in the correct position
```
Heapsort is an unstable sort: items with equal keys may end up in a different relative order than in the input.
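A minimal self-contained sketch (illustrative helper functions, not the post's repository code) shows this: sorting records by key reorders two records whose keys are equal:

```javascript
// In-place heapsort on objects, comparing by .key (max-heap, ascending result).
function siftDown(a, i, end) {
  while (true) {
    let max = i;
    const l = 2 * i + 1, r = 2 * i + 2;
    if (l < end && a[l].key > a[max].key) max = l;
    if (r < end && a[r].key > a[max].key) max = r;
    if (max === i) return;
    [a[i], a[max]] = [a[max], a[i]];
    i = max;
  }
}

function heapsortByKey(a) {
  for (let i = Math.floor(a.length / 2) - 1; i >= 0; i--) siftDown(a, i, a.length);
  for (let end = a.length - 1; end > 0; end--) {
    [a[0], a[end]] = [a[end], a[0]];
    siftDown(a, 0, end);
  }
}

const records = [{ key: 1, tag: 'a' }, { key: 1, tag: 'b' }, { key: 0, tag: 'c' }];
heapsortByKey(records);
// keys are sorted (0, 1, 1), but 'b' now comes before 'a':
console.log(records.map((r) => r.tag)); // [ 'c', 'b', 'a' ]
```

The long-distance swaps with the end of the heap area are what destroy the original relative order of equal keys.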
Heapsort and Merge sort comparison
- If heapsort has the same Big-O time complexity as merge sort, why is merge sort generally faster?
- merge sort: time `O(n * log n)`, space `O(n)`
- Memory locality: heapsort jumps around memory (`left child = 2*i + 1`, ...) while merge sort reads memory sequentially, taking advantage of cache locality and the CPU cache.
- Comparisons and data movements: heapsort needs to repeatedly compute left and right children, compare multiple nodes using the max-heap `a > b` property, and swap elements. These are all constant factors that are not counted in Big-O notation, but in practice they matter.
- merge sort: time
EXTRA: Introsort and the use of heapsort
- Quicksort is known for being one of the fastest non-hybrid comparison sorting algorithms.
- But there is no one-size-fits-all in computer science. Some standard libraries, such as C++'s, mix multiple sorting algorithms to build a hybrid sorting algorithm with better performance:

| Algorithm | Time (Average) | Time (Worst) | Space (Average) | Space (Worst) |
|---|---|---|---|---|
| Quicksort | O(n log n) | O(n^2) | O(log n) | O(n) |
| Introsort | O(n log n) | O(n log n) | O(log n) | O(log n) |
- Introsort is a hybrid sorting algorithm that uses:
  - `insertion sort` for small arrays
  - `quicksort` until a maximum recursion depth is reached
  - `heapsort` after reaching the maximum recursion depth
    - Why not merge sort, since it is better than heapsort? Because heapsort is in-place, so we can continue the work that quicksort has already done.
- Pseudocode:

```
Introsort(A):
  depth_limit = 2 * floor(log2(size(A)))
  introsort(A, depth_limit)

introsort(A, depth_limit):
  if size(A) <= 16:
    // small array -> insertion sort
    insertion_sort(A)
    return
  if depth_limit == 0:
    // reached recursion depth limit -> heapsort
    heapsort(A)
    return
  // fallback: quicksort
  pivot = partition(A) // using median-of-three
  introsort(left of pivot, depth_limit - 1)
  introsort(right of pivot, depth_limit - 1)
```
References
- MIT OpenCourseWare - Lecture 4: Heaps and Heap Sort (Srini Devadas, 2011)
- Stack Overflow - How can building a heap be O(n) time complexity?
- GeeksforGeeks - K-ary Heap
- Huffman Coding - Github Repository
- Dijkstra - Github Repository
- Binary Heap Implementation - Github Repository
- Build Heap In Place - Github Repository
- Heapsort Implementation - Github Repository
- https://en.wikipedia.org/wiki/Huffman_coding
- https://en.wikipedia.org/wiki/Dijkstra's_algorithm
- David R. Musser, Introspective Sorting and Selection Algorithms, Computer Science Department, Rensselaer Polytechnic Institute, Troy, NY.
- GeeksforGeeks - Introsort - C++’s Sorting Weapon




