1. Introduction: the hook of today
In Part 1, we saw the world of O(n²) sorts: simple, intuitive, but inefficient for large data. Now we step into a higher league: algorithms that use divide and conquer to reach O(n log n) time.
Today we’ll explore:
- Merge Sort: stable and elegant, built on recursive merging.
- Quick Sort: practical and blazing fast on average.
- Heap Sort: in-place and guaranteed O(n log n).
These three form the foundation of modern sorting.
Like in the previous blog, you can experiment interactively with all sorting algorithms in the Sorting Algorithms Playground.
2. Divide and conquer (from the Latin "divide et impera") at its core
The key idea behind all O(n log n) sorts is simple:
- Divide the array into smaller parts.
- Conquer by sorting each part recursively.
- Combine the results efficiently.
Each split halves the array, giving roughly log n levels of recursion, and each level's merging or partitioning work touches all n elements. Hence, n × log n.
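To make the counting argument concrete, here is a tiny back-of-the-envelope script (my own illustration, not part of any sorting algorithm):

```python
import math

# Each halving level touches all n elements, and there are about
# log2(n) levels, so total work grows like n * log2(n).
def levels_and_work(n):
    levels = math.ceil(math.log2(n)) if n > 1 else 0
    return levels, n * levels

for n in (8, 1024, 1_000_000):
    levels, work = levels_and_work(n)
    print(f"n = {n}: ~{levels} levels, ~{work} element touches")
```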
3. Merge Sort
3.1 Idea
Merge Sort recursively splits an array into halves until single elements remain, then merges them in sorted order.
It’s:
- Stable: equal elements preserve order.
- Not in-place: needs extra memory for merging.
- Deterministic: always O(n log n), from best case to worst.
3.2 Implementation
```python
def merge_sort(arr):
    # Base case: a list of 0 or 1 elements is already sorted.
    if len(arr) <= 1:
        return arr
    mid = len(arr) // 2
    left = merge_sort(arr[:mid])
    right = merge_sort(arr[mid:])
    return merge(left, right)


def merge(left, right):
    result = []
    i = j = 0
    # Repeatedly take the smaller front element; <= keeps the sort stable.
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    # Only one of the two slices can still be non-empty here;
    # `or` picks whichever remains.
    result.extend(left[i:] or right[j:])
    return result
```
3.3 Step-by-step explanation
- Split recursively until subarrays of size 1.
- Merge pairs of sorted lists by comparing the smallest elements from each.
- Repeat merging bottom-up until the whole array is sorted.
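As a quick sanity check (assuming merge and merge_sort from section 3.2 are in scope):

```python
print(merge([1, 4, 7], [2, 3, 9]))      # [1, 2, 3, 4, 7, 9]
print(merge_sort([5, 2, 9, 1, 5, 6]))   # [1, 2, 5, 5, 6, 9]
```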
3.4 Complexity

| Case | Time | Space |
| --- | --- | --- |
| Best | O(n log n) | O(n) auxiliary |
| Average | O(n log n) | O(n) auxiliary |
| Worst | O(n log n) | O(n) auxiliary |
3.5 Why it matters
Merge Sort’s structure inspires Timsort, the algorithm behind Python’s built-in sort, which combines merging with Insertion Sort to perform well on real-world data.
4. Quick Sort
4.1 Idea
Quick Sort partitions the array around a pivot, placing smaller elements on one side and larger on the other, then sorts both sides recursively.
It’s fast in practice, especially with good pivot selection, but unstable.
Performance depends on how balanced the partitions are:
- Balanced splits → O(n log n)
- Worst case (e.g., already sorted input with a fixed last-element pivot) → O(n²)
4.2 The simple recursive version (Pythonic)
```python
def quick_sort(arr):
    if len(arr) <= 1:
        return arr
    pivot = arr[-1]
    # Elements equal to the pivot go left; the pivot itself is excluded
    # from both sides and re-inserted between them.
    left = quick_sort([x for x in arr[:-1] if x <= pivot])
    right = quick_sort([x for x in arr[:-1] if x > pivot])
    return left + [pivot] + right
```
Readable and expressive, but not memory-efficient, since it creates new lists at every level of recursion.
4.3 In-place quicksort (Lomuto partition)
```python
def partition_lomuto(arr, low, high):
    pivot = arr[high]   # last element as pivot
    i = low - 1         # boundary of the "smaller than pivot" region
    for j in range(low, high):
        if arr[j] < pivot:
            i += 1
            arr[i], arr[j] = arr[j], arr[i]
    # Place the pivot just after the boundary: its final sorted position.
    arr[i + 1], arr[high] = arr[high], arr[i + 1]
    return i + 1


def quick_sort_in_place_lomuto(arr, low=0, high=None):
    if high is None:
        high = len(arr) - 1
    if low < high:
        p = partition_lomuto(arr, low, high)
        quick_sort_in_place_lomuto(arr, low, p - 1)
        quick_sort_in_place_lomuto(arr, p + 1, high)
    return arr
```
4.4 Step By Step
The Lomuto scheme always chooses the last element as the pivot. It scans the array once, keeping a boundary i for the last element smaller than the pivot.
Every time we find an element smaller than the pivot, it’s swapped forward. In the end, we place the pivot right after that boundary, which ensures that everything to its left is smaller and everything to its right is larger.
The method is elegant but can suffer if the pivot ends up at the extremes (like already sorted input), producing highly unbalanced partitions.
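A small trace makes this concrete (assuming partition_lomuto from above; the input is just an example I picked):

```python
data = [7, 2, 9, 4, 5]                       # pivot will be 5, the last element
p = partition_lomuto(data, 0, len(data) - 1)
print(p, data)                               # 2 [2, 4, 5, 7, 9]
# The pivot 5 lands at index 2, its final sorted position; here the
# single partition happens to sort the whole array.
```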
4.5 Hoare partition (fewer swaps)
```python
def partition_hoare(arr, low, high):
    pivot = arr[(low + high) // 2]  # middle element as pivot
    i, j = low - 1, high + 1
    while True:
        i += 1
        while arr[i] < pivot:
            i += 1
        j -= 1
        while arr[j] > pivot:
            j -= 1
        if i >= j:
            # j is a valid split point, not necessarily the pivot's index.
            return j
        arr[i], arr[j] = arr[j], arr[i]


def quick_sort_in_place_hoare(arr, low=0, high=None):
    if high is None:
        high = len(arr) - 1
    if low < high:
        p = partition_hoare(arr, low, high)
        quick_sort_in_place_hoare(arr, low, p)  # note: p is included here
        quick_sort_in_place_hoare(arr, p + 1, high)
    return arr
```
4.6 Step By Step
Hoare’s scheme uses two pointers moving inward from both ends of the array. The pivot is typically the middle element, and the loop continues swapping out-of-place values until the pointers cross.
Unlike Lomuto, Hoare’s version performs fewer swaps on average. Note that the returned index is only a split point: the pivot is not guaranteed to land in its final position, which is why the left recursive call includes index p.
The two schemes illustrate how small structural differences affect swap counts, recursion bounds, and overall efficiency.
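A comparable trace (again an example of mine, assuming partition_hoare from above) shows the difference:

```python
data = [7, 2, 9, 4, 5]                      # pivot will be 9, the middle element
p = partition_hoare(data, 0, len(data) - 1)
print(p, data)                              # 3 [7, 2, 5, 4, 9]
# Everything in data[:p+1] is <= everything in data[p+1:], but neither
# side is sorted yet, and p does not point at the pivot.
```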
4.7 Variants worth knowing
- Randomized pivot: avoids worst-case scenarios by choosing a random pivot.
- Median-of-three: pivot = median(first, middle, last).
- Three-way partition: groups <, =, and > pivot values separately, which is useful for arrays with many duplicates; the sketch after this list combines it with a random pivot.
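As an illustration, here is a minimal sketch combining two of these variants: a random pivot with three-way partitioning. It is a non-in-place version written for clarity, not a canonical implementation:

```python
import random

def quick_sort_3way(arr):
    # The "equal" bucket needs no recursion, so arrays full of
    # duplicates collapse quickly instead of degrading toward O(n^2).
    if len(arr) <= 1:
        return arr
    pivot = random.choice(arr)
    less = [x for x in arr if x < pivot]
    equal = [x for x in arr if x == pivot]
    greater = [x for x in arr if x > pivot]
    return quick_sort_3way(less) + equal + quick_sort_3way(greater)
```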
4.8 Complexity summary

| Case | Time | Space (recursion) |
| --- | --- | --- |
| Best | O(n log n) | O(log n) |
| Average | O(n log n) | O(log n) |
| Worst | O(n²) | O(n) |
4.9 Why it matters
Quick Sort is the workhorse of practical sorting: Java’s Arrays.sort for primitives uses a dual-pivot Quick Sort, and C’s standard library qsort is traditionally (though not always) implemented as one. Its cache efficiency and low overhead are hard to beat.
With randomized pivots and tail recursion elimination, it becomes extremely robust; a sketch of the latter follows.
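Here is one way tail recursion elimination can look, reusing partition_lomuto from section 4.3. This is a sketch of the technique, not a canonical implementation:

```python
def quick_sort_tail(arr, low=0, high=None):
    if high is None:
        high = len(arr) - 1
    # Recurse into the smaller side only and loop over the larger one,
    # which caps the recursion depth at O(log n) even on bad inputs.
    while low < high:
        p = partition_lomuto(arr, low, high)
        if p - low < high - p:
            quick_sort_tail(arr, low, p - 1)
            low = p + 1
        else:
            quick_sort_tail(arr, p + 1, high)
            high = p - 1
    return arr
```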
5. Heap Sort
5.1 Idea
Heap Sort uses a binary heap to repeatedly extract the maximum element and place it at the end.
A heap is a specialized binary tree stored in an array, where each parent satisfies a specific order property with respect to its children:
- In a max-heap, every parent is greater than or equal to its children.
- In a min-heap, every parent is less than or equal to its children.
Because the tree is complete (all levels filled except possibly the last), a heap can be represented in-place: for a node at index i, its children live at 2*i+1 and 2*i+2.
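A quick way to see that index arithmetic in action (the example heap is mine):

```python
heap = [9, 7, 8, 3, 5]   # a valid max-heap stored as a flat list
for i in range(len(heap)):
    left, right = 2 * i + 1, 2 * i + 2
    kids = [heap[k] for k in (left, right) if k < len(heap)]
    print(f"index {i} (value {heap[i]}) -> children {kids}")
```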
The first for loop in heap_sort (see the implementation below) starts from n // 2 - 1, the index of the last non-leaf node. Every node after that index is a leaf and already satisfies the heap property, so working backward from there guarantees every subtree is heapified before its parent is adjusted.
It’s fully in-place and guarantees O(n log n) performance.
5.2 Implementation
```python
def heapify(arr, n, i):
    # Sift the value at index i down until the max-heap property holds
    # for the subtree rooted at i (only the first n elements count).
    largest = i
    left, right = 2 * i + 1, 2 * i + 2
    if left < n and arr[left] > arr[largest]:
        largest = left
    if right < n and arr[right] > arr[largest]:
        largest = right
    if largest != i:
        arr[i], arr[largest] = arr[largest], arr[i]
        heapify(arr, n, largest)


def heap_sort(arr):
    n = len(arr)
    # Build a max-heap bottom-up, starting at the last non-leaf node.
    for i in range(n // 2 - 1, -1, -1):
        heapify(arr, n, i)
    # Repeatedly move the root (current maximum) to the sorted tail.
    for end in range(n - 1, 0, -1):
        arr[0], arr[end] = arr[end], arr[0]
        heapify(arr, end, 0)
    return arr
```
5.3 Step-by-step
The algorithm first builds a max-heap, guaranteeing that the largest element is at the root. It then repeatedly swaps that root with the last element, effectively placing it in its correct final position, and reduces the heap size by one. Each call to heapify restores the heap structure for the remaining elements. In the end, the array becomes sorted in ascending order.
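A quick run, assuming heap_sort from section 5.2:

```python
print(heap_sort([5, 3, 8, 1, 9, 2]))   # [1, 2, 3, 5, 8, 9]
```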
5.4 Complexity

| Case | Time | Space |
| --- | --- | --- |
| Best | O(n log n) | O(1) |
| Average | O(n log n) | O(1) |
| Worst | O(n log n) | O(1) |
5.5 Trade-offs
- Always O(n log n), but more comparisons than Quick Sort.
- Useful when guaranteed upper bounds are needed (e.g., real-time systems).
6. Comparing all log-linear sorts

| Algorithm | Best | Average | Worst | Space | Stable | In-place |
| --- | --- | --- | --- | --- | --- | --- |
| Merge Sort | O(n log n) | O(n log n) | O(n log n) | O(n) | Yes | No |
| Quick Sort | O(n log n) | O(n log n) | O(n²) | O(log n) | No | Yes |
| Heap Sort | O(n log n) | O(n log n) | O(n log n) | O(1) | No | Yes |
7. Quick comprehension check
Questions
- Which of these algorithms is stable by default?
- Which is the fastest on average in practice?
- Which is guaranteed O(n log n) and in-place?
Answers
- Merge Sort.
- Quick Sort (randomized).
- Heap Sort.
8. Final summary and review
- Log-linear sorting algorithms are where theory meets practice.
- Merge Sort: perfect stability and predictability, but extra memory.
- Quick Sort: elegant recursion and outstanding average speed.
- Heap Sort: reliable and space-efficient, ideal when guarantees matter.
Together, they form the foundation for hybrid sorts like Timsort (Merge + Insertion) and IntroSort (Quick + Heap).
9. Hook for tomorrow
Next, we’ll enter the realm of linear-time sorting, when comparisons aren’t needed at all.
We’ll explore Counting Sort, Radix Sort, and Bucket Sort, showing how exploiting data properties can break the O(n log n) comparison barrier and reach linear time.