DEV Community

Haris Abdullah
Haris Abdullah

Posted on

Integer Sorting via Value-Space Partitioning

Generalizing Dijkstra's Dutch National Flag Algorithm

Dijkstra's Dutch National Flag (DNF) algorithm is one of those deceptively simple ideas that becomes more interesting the deeper you look into it.

At first glance, it solves a very narrow problem: sorting an array containing only 0, 1, and 2 in a single linear pass. Elegant, clever, and usually treated as a specialized trick.

But hidden inside DNF is a much more general idea.

The algorithm works because it understands the structure of the value space. It knows exactly which values belong on the left, which belong on the right, and which belong in the middle. Once you stop thinking of DNF as "sorting three values" and start thinking of it as "partitioning a known value space," the natural question becomes:

Can this idea be generalized to arbitrary integer arrays?

This article explores two algorithms built from that idea:

  • SweepSort — a direct generalization of DNF
  • SweepSort2 — a recursive value-space partitioning algorithm with linear worst-case behavior for fixed-width integers

Revisiting the Dutch National Flag Algorithm

The original DNF algorithm maintains three regions:

  • low region (0)
  • middle region (1)
  • high region (2)

and sweeps through the array once, routing elements into their correct regions using swaps.

The critical insight is not the swaps themselves.

The real insight is this:

DNF succeeds because the value space is completely known in advance.

The algorithm always knows:

  • the smallest possible value,
  • the largest possible value,
  • and the bounded middle region.

That structure is what makes single-pass partitioning possible.

Now imagine replacing the hardcoded values 0, 1, and 2 with dynamically discovered boundaries.

That leads directly to the first algorithm.


SweepSort: A Direct Generalization of DNF

SweepSort extends DNF to arbitrary integer arrays.

Instead of partitioning around fixed values, it repeatedly:

  1. finds the current minimum and maximum,
  2. sweeps them to the ends,
  3. and discovers the next minimum and maximum during the same pass.
void SweepSort(int* array, size_t n) {

    int max = INT_MIN, min = INT_MAX;

    for (size_t i = 0; i < n; i++) {

        max = Max(max, array[i]);
        min = Min(min, array[i]);
    }

    size_t low = 0, mid = 0, high = n;

    while (mid < high) {

        int new_min = INT_MAX;
        int new_max = INT_MIN;

        while (mid < high) {

            if (array[mid] == max) {

                --high;
                Swap(array + mid, array + high);
            }
            else if (array[mid] == min) {

                Swap(array + mid, array + low);
                ++low, ++mid;
            }
            else {

                new_min = Min(new_min, array[mid]);
                new_max = Max(new_max, array[mid]);

                ++mid;
            }
        }

        min = new_min;
        max = new_max;

        mid = low;
    }
}
Enter fullscreen mode Exit fullscreen mode

How SweepSort Works

Each sweep partitions the current unsorted region into three parts:

  • elements equal to min
  • elements strictly between min and max
  • elements equal to max

The process mirrors classic DNF almost exactly.

During each pass:

  • min values are swapped left,
  • max values are swapped right,
  • everything else remains in the middle.

At the same time, the algorithm computes the next new_min and new_max from the remaining middle region for free during the partition pass itself.

That detail is crucial.

Without it, each recursion level would require an additional scan, dramatically worsening performance.

Core Invariant

After every sweep:

  • [0, low) contains finalized minimum values,
  • [high, n) contains finalized maximum values,
  • [low, high) contains only values strictly between the next minimum and maximum.

The middle region shrinks until no unsorted values remain.


Complexity Analysis

Each pass removes at least two distinct values from the unsorted region:

  • the current minimum,
  • and the current maximum.

If the array contains k distinct values: O(k * n)

Best Case

If the number of distinct values is small (i.e. the original DNF case): O(n)

Worst Case

If all values are unique: O(n^2)

which places SweepSort in quadratic territory for adversarial inputs.

SweepSort is therefore more interesting conceptually than practically — it exposes the structure hidden inside DNF, but does not fully solve the scalability problem.

That leads to the second algorithm.


SweepSort2: Recursive Value-Space Partitioning

SweepSort2 changes perspective entirely.

Instead of repeatedly removing extrema, it recursively partitions the value space itself.

static void SweepSort2Core(int* array, size_t n, int min, int max) {

    if (min >= max) return;

    size_t low = 0, high = 0;

    int high_min = INT_MAX;
    int high_max = INT_MIN;

    int low_min = INT_MAX;
    int low_max = INT_MIN;

    int pivot = (max >= 0 && min < 0) ? -1 : min + (max - min) / 2;

    while (high < n) {

        if (array[high] > pivot) {

            high_min = Min(high_min, array[high]);
            high_max = Max(high_max, array[high]);
        }
        else {

            low_min = Min(low_min, array[high]);
            low_max = Max(low_max, array[high]);

            Swap(array + low, array + high);

            ++low;
        }

        ++high;
    }

    SweepSort2Core(array + low, n - low, high_min, high_max);
    SweepSort2Core(array, low, low_min, low_max);
}

void SweepSort2(int* array, size_t n) {
    SweepSort2Core(array, n, INT_MIN, INT_MAX);
}
Enter fullscreen mode Exit fullscreen mode

How SweepSort2 Works

Each recursive call partitions the array into two regions:

  • low region: values <= pivot
  • high region: values > pivot

This resembles quicksort superficially, but the recursion is driven by the value range, not the data distribution. The algorithm also computes the minimum and the maximum of the high and the low region during the partition pass itself.

No additional scans are required.


The Overflow Problem

This is the subtlest part of the algorithm.

The natural midpoint formula:

(min + max) / 2
Enter fullscreen mode Exit fullscreen mode

can overflow when min + max exceeds the representable integer range.

The standard fix is usually written as:

min + (max - min) / 2
Enter fullscreen mode Exit fullscreen mode

but that still fails when the range crosses zero.

For example:

min = INT_MIN
max = INT_MAX
Enter fullscreen mode Exit fullscreen mode

Here:

max - min
Enter fullscreen mode Exit fullscreen mode

itself overflows.

The problem is structural rather than arithmetic.

As long as recursion ranges are allowed to straddle zero, subtraction overflow remains possible.

The solution is to eliminate cross-zero ranges entirely:

int pivot = (max >= 0 && min < 0) ? -1 : min + (max - min) / 2;
Enter fullscreen mode Exit fullscreen mode

When the range straddles zero, the pivot becomes -1, partitioning the value space into:

  • negative integers,
  • and non-negative integers.

After this split:

  • negative-only ranges cannot overflow subtraction,
  • non-negative-only ranges cannot overflow subtraction.

The overflow hazard disappears permanently after the first cross-zero partition.

This is important because the fix is not merely arithmetic — it restructures the recursion tree itself.


Complexity Analysis

The recursion depth is bounded by how many times the value range can be halved.

For a value range of width R: O(n * log2(R))

For fixed-width 32-bit integers: log2(R) <= 32

giving: O(32 * n)

which simplifies to linear time under the fixed-word-size model.

Unlike comparison sorts, the recursion depth depends on the value-space width, not the number of elements.

This is the central idea behind SweepSort2.


Relationship to Radix Sorting

SweepSort2 is closely related to MSD binary radix sort.

Both algorithms:

  • recursively partition integers,
  • achieve bounded recursion depth,
  • and avoid comparison-sort lower bounds.

The key difference is that radix sort partitions explicit bits, while SweepSort2 partitions adaptive numeric intervals.

This makes SweepSort2 conceptually closer to recursive geometric partitioning of the integer space.


Comparison Summary

Property SweepSort SweepSort2
Core idea Repeated extrema sweeping Recursive value-space partitioning
Best case O(n) O(n)
Worst case O(n^2) O(n * log2(R))
32-bit integer worst case O(n^2) O(32 * n)
Recursion depth O(k) distinct values O(log2(R))
Stable No No
Extra memory O(1) recursion stack
Closest analog Generalized DNF MSD radix / American Flag Sort

Final Thoughts

SweepSort is primarily interesting because it exposes the hidden abstraction inside the Dutch National Flag algorithm.

SweepSort2 is where that abstraction becomes genuinely powerful.

The key realization is this:

Once recursion depth is bounded by value-space width rather than element count, linear worst-case behavior becomes possible for fixed-width integer data.

DNF turns out not to be a special-case trick for three values.

It is a concrete instance of a broader idea:

sorting by recursively partitioning value space itself.

That perspective was the real discovery behind these algorithms.

Top comments (0)