Sleep Sort: Where Theory meets Sobering Reality

Sishaar Rao on October 28, 2017

Circa 6 years ago, an anonymous user posted on 4Chan: He outlined this amazing undiscovered sort written in Shell which, sure enough, worked. ... [Read Full]
markdown guide
 

This article started as a joke or funny idea, but it ended up as a serious analysis of why we should pay attention into taking the right path when developing.
I liked it a lot, bravo!

 

Forgive me if I'm wrong, but doesn't the sleep() command effectively take log(n) time? The OS needs a way to determine which process to run next, and presumably that's stored in some sort of priority queue, otherwise each context switch is an O(n) operation. For the sake of simplicity, we can assume the priority queue is just a sorted list where the first conversation is the one with the shortest sleep time. Inserting a new item into a sorted list is a binary search that takes O(log(n)) time. Given that assumption, sleep sort is just another O(n log n) sorting algorithm.

 

As a thought experiment, pretend the sleep command's argument isn't measured in milliseconds but rather in trillions of years. To sort a list of numbers from 1 to 100, you'd need 100 trillion years. But, after the initial list is processed, the sorting is effectively done -- no need to wait even 1 trillion years. Also, to simplify things, let's pretend the system is single-threaded (after all, the actual number of threads is a constant and so gets removed from big-O notation anyway).

There are two possibilities:
1) The sleeping processes are stored in a sorted list. After each trillion years, the CPU pulls the next process from the front of the list. This means that printing every item is only an O(n) operation. Magical! But wait, the list we started with was already sorted (even after 0 trillion years), so actually we're not using sleep sort at all, but rather insertion sort into a queue of processes. This sorting algorithm is O(n log n)
2) The sleeping processes are stored in memory in the order they were added. After each trillion years, the CPU loops through every process to find the next one to execute. Each one is an O(n) operation and it will happen n times in 100 trillion years. This sorting algorithm is O(n ^ 2)

(Sorry I got pretty excited thinking about this)

 

Hey Keaton! This sort of discussion is the same I was having with my peers, where we were debating how exactly you would go about determining Big O for such a sort, and we concluded that we'd need to know a lot more about the system that such a sort is running on. This essentially leads to the crux of my argument in this post, which was that we could theoretically state how functionality is affected with scale, but that reality is much different from theory.

The way I approached a Big O argument for such a sort was that the sleep() call (in UNIX at least) places a Non-Runnable flag on the thread, meaning that the task scheduler would more or less gloss over it when allocating time slices. My assumption was that when you called the sort() function, it would create one consolidated thread where that function is executed, first with the sleep() command followed by the echo command. Therefore, for n elements, there are n threads.

These approach is flawed because we don't know enough about the system's task scheduler, namely how does it store and traverse its threaded elements, and how does it sort these. My impression was that threads are not sorted by the contents of the code to be executed, but rather by the timeslices or some other inherent value that is determined without running the code. Chances are this is wrong.

Therefore, my answer really boils down to: your argument of Big O is as good as mine.

There's a fundamentally deeper level of understanding that I feel I need in order to properly assess what the runtime for such a sort is, but even after discussing with some of my teachers, they answer they gave was that Sleep-Sort is unique because it's an entirely different flavor of sort when compared to traditional sorts (Quicksort, Mergesort, etc.) As a result, using Big O analysis is not very helpful, because we've now widened the range of factors we have to consider. These range of factors are all physical characteristics (how are tasks sorted, how does the scheduler determine which one to allocate time to next, etc.), so it really doesn't help to analyze a theoretical system in this case.

He pointed me in the direction of Radix Sort and said that it's a good exploration of how Big O can get interesting when we use a theoretical concept to analyze physical models, and how we can find its shortcomings.

I hope my insight helps, and I'd definitely check out this thread, as there was some interesting discussion about Sleep-Sort's runtime!

Absolutely, this is a really interesting thought experiment. It's similar to the experiment of sorting unique numbers from 1 to N by using an array of N length and simply setting Array[number] = true for each value and reading off the true ones in order. This is, technically, O(n).

Of course, neither of these sorting algorithms is actually practical, but it definitely shows that O(n log n) is not the end-all be-all for sorting, at least not in all cases.

This is actually a really interesting point which, I find, many junior engineers gloss over when reciting lower bounds for sorting problems. It fully depends on the problem in question!

Your example of Array[number] = true (or the more simplified version of Array[number] = number) is can be kind of practical, even when you extend it to be sorting k unique numbers from 1 N. This "counting-sort" can be thought of as a special case of radix-sort, which itself can be thought of a special case for the class of probabilistic sorting algorithms (an example of which is bucket sort) where you know some information about the set being sorted.

Basically, the theorem which places the lower-bound on sorting algorithms to be O(nlogn) is based on the assumption that the set being sorted is any totally ordered set. The "any" part of that assumption means you cannot assume prior knowledge about the set. But, if that assumption is invalid (which it is most of the time - how often in practice do you know nothing about the data you are working with) you can produce a sorting algorithm which works better than O(nlogn).

 

Correct me but wouldn't this algorithm have o(x) time where x is the largest value in the list? If one of the values is 300,000, that item would be printed 83 hours later.

 

You are correct.

The algorithm is at best O(N) where N is the maximum sized number. However, generally algorithmic analysis is concerned with input size so its EXPONENTIAL with respect to the number of bits in N.

If you are able to modify the algorithm to just take the next highest number, then essentially what you have here is HeapSort developed in a very inefficient way, using OS threads to implement your heap.

So I'd say the algorithm is at least O(N log N) and at worst O(2b N logN) where b is the number of bits in the max number B that may show up in the array.

That is much worse than bubblesort for large B's.

 

I was thinking the same thing... O(max(n)), assuming that context switching is significantly smaller than min(n).

 

It feels like this would not always sort correctly if you account for processing time.

Example. If the list has 1 million items and the first is 0.2 and the last is 0.1. The processing time may be larger than the sleep time of the first item meaning the last item would be out of order. This is slightly different to the failure cases written above.

 

Once a dev had a problem "I'll solve it with threads", he thougth. problem Now that 2 has dev

code of conduct - report abuse