Generalizing Dijkstra's Dutch National Flag Algorithm
Dijkstra's Dutch National Flag (DNF) algorithm is one of those deceptively simple ideas that becomes more interesting the deeper you look into it.
At first glance, it solves a very narrow problem: sorting an array containing only 0, 1, and 2 in a single linear pass. Elegant, clever, and usually treated as a specialized trick.
But hidden inside DNF is a much more general idea.
The algorithm works because it understands the structure of the value space. It knows exactly which values belong on the left, which belong on the right, and which belong in the middle. Once you stop thinking of DNF as "sorting three values" and start thinking of it as "partitioning a known value space," the natural question becomes:
Can this idea be generalized to arbitrary integer arrays?
This article explores two algorithms built from that idea:
- SweepSort — a direct generalization of DNF
- SweepSort2 — a recursive value-space partitioning algorithm with linear worst-case behavior for fixed-width integers
Revisiting the Dutch National Flag Algorithm
The original DNF algorithm maintains three regions:
- low region (
0) - middle region (
1) - high region (
2)
and sweeps through the array once, routing elements into their correct regions using swaps.
The critical insight is not the swaps themselves.
The real insight is this:
DNF succeeds because the value space is completely known in advance.
The algorithm always knows:
- the smallest possible value,
- the largest possible value,
- and the bounded middle region.
That structure is what makes single-pass partitioning possible.
Now imagine replacing the hardcoded values 0, 1, and 2 with dynamically discovered boundaries.
That leads directly to the first algorithm.
SweepSort: A Direct Generalization of DNF
SweepSort extends DNF to arbitrary integer arrays.
Instead of partitioning around fixed values, it repeatedly:
- finds the current minimum and maximum,
- sweeps them to the ends,
- and discovers the next minimum and maximum during the same pass.
void SweepSort(int* array, size_t n) {
int max = INT_MIN, min = INT_MAX;
for (size_t i = 0; i < n; i++) {
max = Max(max, array[i]);
min = Min(min, array[i]);
}
size_t low = 0, mid = 0, high = n;
while (mid < high) {
int new_min = INT_MAX;
int new_max = INT_MIN;
while (mid < high) {
if (array[mid] == max) {
--high;
Swap(array + mid, array + high);
}
else if (array[mid] == min) {
Swap(array + mid, array + low);
++low, ++mid;
}
else {
new_min = Min(new_min, array[mid]);
new_max = Max(new_max, array[mid]);
++mid;
}
}
min = new_min;
max = new_max;
mid = low;
}
}
How SweepSort Works
Each sweep partitions the current unsorted region into three parts:
- elements equal to
min - elements strictly between
minandmax - elements equal to
max
The process mirrors classic DNF almost exactly.
During each pass:
-
minvalues are swapped left, -
maxvalues are swapped right, - everything else remains in the middle.
At the same time, the algorithm computes the next new_min and new_max from the remaining middle region for free during the partition pass itself.
That detail is crucial.
Without it, each recursion level would require an additional scan, dramatically worsening performance.
Core Invariant
After every sweep:
-
[0, low)contains finalized minimum values, -
[high, n)contains finalized maximum values, -
[low, high)contains only values strictly between the next minimum and maximum.
The middle region shrinks until no unsorted values remain.
Complexity Analysis
Each pass removes at least two distinct values from the unsorted region:
- the current minimum,
- and the current maximum.
If the array contains k distinct values: O(k * n)
Best Case
If the number of distinct values is small (i.e. the original DNF case): O(n)
Worst Case
If all values are unique: O(n^2)
which places SweepSort in quadratic territory for adversarial inputs.
SweepSort is therefore more interesting conceptually than practically — it exposes the structure hidden inside DNF, but does not fully solve the scalability problem.
That leads to the second algorithm.
SweepSort2: Recursive Value-Space Partitioning
SweepSort2 changes perspective entirely.
Instead of repeatedly removing extrema, it recursively partitions the value space itself.
static void SweepSort2Core(int* array, size_t n, int min, int max) {
if (min >= max) return;
size_t low = 0, high = 0;
int high_min = INT_MAX;
int high_max = INT_MIN;
int low_min = INT_MAX;
int low_max = INT_MIN;
int pivot = (max >= 0 && min < 0) ? -1 : min + (max - min) / 2;
while (high < n) {
if (array[high] > pivot) {
high_min = Min(high_min, array[high]);
high_max = Max(high_max, array[high]);
}
else {
low_min = Min(low_min, array[high]);
low_max = Max(low_max, array[high]);
Swap(array + low, array + high);
++low;
}
++high;
}
SweepSort2Core(array + low, n - low, high_min, high_max);
SweepSort2Core(array, low, low_min, low_max);
}
void SweepSort2(int* array, size_t n) {
SweepSort2Core(array, n, INT_MIN, INT_MAX);
}
How SweepSort2 Works
Each recursive call partitions the array into two regions:
-
low region: values
<= pivot -
high region: values
> pivot
This resembles quicksort superficially, but the recursion is driven by the value range, not the data distribution. The algorithm also computes the minimum and the maximum of the high and the low region during the partition pass itself.
No additional scans are required.
The Overflow Problem
This is the subtlest part of the algorithm.
The natural midpoint formula:
(min + max) / 2
can overflow when min + max exceeds the representable integer range.
The standard fix is usually written as:
min + (max - min) / 2
but that still fails when the range crosses zero.
For example:
min = INT_MIN
max = INT_MAX
Here:
max - min
itself overflows.
The problem is structural rather than arithmetic.
As long as recursion ranges are allowed to straddle zero, subtraction overflow remains possible.
The solution is to eliminate cross-zero ranges entirely:
int pivot = (max >= 0 && min < 0) ? -1 : min + (max - min) / 2;
When the range straddles zero, the pivot becomes -1, partitioning the value space into:
- negative integers,
- and non-negative integers.
After this split:
- negative-only ranges cannot overflow subtraction,
- non-negative-only ranges cannot overflow subtraction.
The overflow hazard disappears permanently after the first cross-zero partition.
This is important because the fix is not merely arithmetic — it restructures the recursion tree itself.
Complexity Analysis
The recursion depth is bounded by how many times the value range can be halved.
For a value range of width R: O(n * log2(R))
For fixed-width 32-bit integers: log2(R) <= 32
giving: O(32 * n)
which simplifies to linear time under the fixed-word-size model.
Unlike comparison sorts, the recursion depth depends on the value-space width, not the number of elements.
This is the central idea behind SweepSort2.
Relationship to Radix Sorting
SweepSort2 is closely related to MSD binary radix sort.
Both algorithms:
- recursively partition integers,
- achieve bounded recursion depth,
- and avoid comparison-sort lower bounds.
The key difference is that radix sort partitions explicit bits, while SweepSort2 partitions adaptive numeric intervals.
This makes SweepSort2 conceptually closer to recursive geometric partitioning of the integer space.
Comparison Summary
| Property | SweepSort | SweepSort2 |
|---|---|---|
| Core idea | Repeated extrema sweeping | Recursive value-space partitioning |
| Best case | O(n) |
O(n) |
| Worst case | O(n^2) |
O(n * log2(R)) |
| 32-bit integer worst case | O(n^2) |
O(32 * n) |
| Recursion depth |
O(k) distinct values |
O(log2(R)) |
| Stable | No | No |
| Extra memory | O(1) |
recursion stack |
| Closest analog | Generalized DNF | MSD radix / American Flag Sort |
Final Thoughts
SweepSort is primarily interesting because it exposes the hidden abstraction inside the Dutch National Flag algorithm.
SweepSort2 is where that abstraction becomes genuinely powerful.
The key realization is this:
Once recursion depth is bounded by value-space width rather than element count, linear worst-case behavior becomes possible for fixed-width integer data.
DNF turns out not to be a special-case trick for three values.
It is a concrete instance of a broader idea:
sorting by recursively partitioning value space itself.
That perspective was the real discovery behind these algorithms.
Top comments (0)