Whetlan

Posted on Apr 28

Compile-Time Sorting in C++ With Templates: Why Heapsort Falls Apart

#cpp #programming #algorithms #computerscience

Tried implementing sorting algorithms as pure template metaprogramming. Not constexpr, not consteval. The old way, where the compiler does the sorting during template instantiation and the "output" is a type.

Quicksort worked. Mergesort worked. Heapsort turned into selection sort.

That last part took me longer to understand than I'd like to admit.

The setup

Everything operates on a type like arr<5, 3, 8, 1>. There's no runtime array. The sorted result is another type, like arr<1, 3, 5, 8>, and you verify it with static_assert.

Basic building blocks first. A typelist and element access:

template<int... Vs>
struct arr {};

template<typename Arr, int N>
struct get {};

template<int A0, int... Args, int N>
struct get<arr<A0, Args...>, N> {
    static constexpr int value = get<arr<Args...>, N-1>::value;
};

template<int A0, int... Args>
struct get<arr<A0, Args...>, 0> {
    static constexpr int value = A0;
};

Already a problem here, but I didn't notice it yet. More on that later.

Quicksort

This one maps to TMP almost too well. Pick a pivot, filter into two sublists, recurse, concat.

The filter needs a predicate. I went with nested templates for partial application, which is ugly but works:

template<int R>
struct le {
    template<int L>
    struct le_partial {
        static constexpr bool value = L <= R;
    };
    template<int L>
    using value = le_partial<L>;
};

Then quicksort itself:

template<int A0, int... Args>
struct quicksort<arr<A0, Args...>> {
    template<int L>
    using lep = typename le<A0>::value<L>;

    template<int L>
    using gtp = typename gt<A0>::value<L>;

    using type = concat_t<
        quicksort_t<filter_t<lep, arr<Args...>>>,
        arr<A0>,
        quicksort_t<filter_t<gtp, arr<Args...>>>
    >;
};

No indexing. No swaps. Just partition by predicate and concat. Stays O(n log n) in template instantiations (assuming decent pivot, same caveat as regular quicksort).

Mergesort

Split the list in half, sort each half, merge. Also maps well to TMP.

I used left and right helpers to split the typelist (basically take and drop). The merge step compares heads:

template<int L, int... Le, int R, int... Ri>
struct merge<arr<L, Le...>, arr<R, Ri...>> {
    using type = prepend_t<
        L < R ? L : R,
        merge_t<
            std::conditional_t<L < R, arr<Le...>, arr<L, Le...>>,
            std::conditional_t<L < R, arr<R, Ri...>, arr<Ri...>>
        >
    >;
};

The split is O(n) per level but there are only log(n) levels, so overall O(n log n). Fine.

Then I tried heapsort

Heapsort needs a heap. A heap needs parent/child relationships. Parent of node i is at i/2. Left child is 2i. Right child is 2i+1.

All index arithmetic. All O(1) in a real array.

But remember that get operation from earlier?

template<int A0, int... Args, int N>
struct get<arr<A0, Args...>, N> {
    static constexpr int value = get<arr<Args...>, N-1>::value;
};

That's O(n). Every single element access peels the head off one at a time. There's no jumping to position i.

So sift_down goes from O(log n) to O(n log n). Building the heap goes from O(n) to O(n² log n). The whole sort becomes a mess.

What I ended up writing was basically: scan the entire list for the minimum, remove it, prepend it to the recursively sorted rest. That's selection sort. O(n²).

template<int V0, int V1, int... Vs>
struct heapsort<arr<V0, V1, Vs...>> {
    using input = arr<V0, V1, Vs...>;
    static constexpr int min_val = min_element_v<input>;
    using remaining = remove_first_t<input, min_val>;
    using sorted_rest = heapsort_t<remaining>;
    using type = prepend_t<min_val, sorted_rest>;
};

Not really heapsort anymore. I kept the name because I started with the intention of writing one.

Why quicksort and mergesort survive

Neither of them need random access.

Quicksort works by filtering. You walk the list once, every element either passes the predicate or doesn't. That's a linear scan, which typelists handle fine because you're just peeling the head and recursing on the tail.

Mergesort works by splitting at a fixed position and merging two sorted lists by comparing heads. Also just head-peeling.

Heapsort is the odd one out because the algorithm is designed around a specific data structure (contiguous array with O(1) indexing). Take that away and the algorithmic complexity changes.

The part I didn't expect

I kind of assumed algorithm complexity was a standalone property. Like, quicksort is O(n log n), period. But it's not that simple. Quicksort is O(n log n) given that partition is O(n), which it is in both arrays and linked lists. Heapsort is O(n log n) given O(1) random access, which arrays have and typelists don't.

Made me realise complexity depends on the container too, not just the algorithm.

Probably obvious to anyone who's thought about this for more than five minutes, but I genuinely didn't clock it until I was staring at the heapsort implementation wondering why my compile times were blowing up.

Full code

The quicksort lives in namespace quicksort. The mergesort lives in namespace www because I was writing these in separate files and never renamed it. The heapsort I'm not posting because it's just selection sort wearing a trench coat.

Full source for quicksort and mergesort: [Godbolt link placeholder]

Part of NexusFix, an open-source FIX protocol engine in C++23.

DEV Community