loading...
Cover image for Sleep Sort: Where Theory meets Sobering Reality

Sleep Sort: Where Theory meets Sobering Reality

sishaarrao profile image Sishaar Rao ・5 min read

Circa 6 years ago, an anonymous user posted on 4Chan:

A screenshot of the original post on 4Chan

He outlined this amazing undiscovered sort written in Shell which, sure enough, worked. Check out my repo to download the code and try it yourself!

A screenshot of me running the code

I'll admit, I was bewildered at first. I could envision how this algorithm could be refined and implemented, whether it be cutting the time needed to output the value or scaling datasets accordingly. But then I thought to myself if this sort (which off the top of my heads looks like an O(n) runtime) really works, then why hasn't some genius at IBM/Sun/Oracle implemented this already? I couldn't be the first one wondering how to implement this. Why did people before me not go ahead with it? Why don't we use this in real life?

Because at the end of the day, computers are still machines. At the end of the day, Theory is Theory, and Reality will always still be Reality

How/Why does Sleep-Sort Work?

The code for Sleep-Sort is as follows:

#!/usr/bin/env bash

sort() {
    sleep "$1"
    echo "$1"
}
while [ -n "$1" ]
do
    sort "$1" &
    shift
done
wait

Let's dissect this part by part. (In case you were wondering, #!/usr/bin/env bash is called the Shebang.)

The function

sort() {
    sleep "$1"
    echo "$1"
}

This function is going to be called repeatedly by the code. It takes its first argument (denoted by "$1") and sleeps for that amount of time. Sleep is a command that pauses its current process (or more importantly its thread) for the given amount of seconds. For example, sleep 100 would mean that the process "sleeps" or pauses for 100 seconds.
It then calls echo on the first argument, which prints out the argument. Thus, the code sort 3 would first sleep for 3 seconds and then print out 3.

The while-loop

while [ -n "$1" ]
do
    sort "$1" &
    shift
done
wait

The while-loop's condition is [ -n "$1" ] As you can observe, we use the -n flag with the first argument. This flag returns true if the argument is nonzero. Thus, this loop will iterate until the argument at $1 is null.

We then call sort on the first argument sort "$1" The most important part of this, however, is the ampersand (&) at the end What this ampersand at the end does is begin the process described in the line, namely the sort function call, in a different thread. Check this out for more info.

On the next line, we call the built-in function shift. Since there are no parameters for the function, the default value to shift is 1. What shift does is it shifts the arguments we have over. The argument in $2 goes to $1, the argument in $3 goes to $2 and so on. The original first argument is subsequently lost.

Running the code

A sample run of the Sleep Sort

As you now from the code above, we take the first argument and instantiate a separate thread that's sleeping for that argument length. Once it awakens, it prints out it's value. The 1 sleeps for 1 second, the 2 sleeps for 2 seconds, the 5 sleeps for 5, etc. Since the lower values sleep less, they print out their values earlier, and thus the list is now sorted. Isn't it amazing?

Why Sleep-Sort Fails

Intuition told me that there's some fundamental flaw in this sort, and that for large datasets it breaks down. The success of sleep-sort relies on 3 fundamental assumptions, which often don't hold true as we go beyond a set of 5 or 10 numbers.

1. The Processor can instantly determine which sub-process is the "least"

To generalize, the processor inherently possesses some data structure in which it stores sub-processes. It could be a Tree, a Heap, or some other data structure that allows the Processor to efficiently traverse its elements and find which one has the "least" value. By finding the least value, we mean finding which process should be executed next.

On the scale of milliseconds, sleep-sort can break down here. If we're given values 3 and 3.00001, the processor may determine the wrong process as the "least" value, simply because it takes the processor time, no matter how small, to find the correct process.

2. How Many Threads does the Computer Have?

Sleep-Sort depends on being able to separate each value into a different thread, and therefore it assumes that for n elements, you subsequently have n available threads. When you go into element sets of higher orders of magnitude, this assumption simply won't hold. Your computer has a finite amount of threads, and if you exceed that, then Sleep-Sort will fail.

3. The Processor can rapidly switch between processes

Think of your processor as a miniscule ticker. When it executes multiple tasks in parallel, in reality our metaphorical ticker is rapidly switching back and forth between our tasks. It executes Task A for a millisecond, then Task B for a millisecond, and then back to Task A. It does this switching on such a small scale that, to the human eye, it seems that the two are running in parallel.

When you have a high number of tasks, this ticker begins to break down. As pointed in Issue #1, it struggles to determine which task to approach next, and it also may get bogged down by the task it is currently at. If it loses its efficiency to switch back and forth between tasks, Sleep-sort begins to break down as the tasks won't be perfectly executed in order.

What does Sleep-Sort Teach Us

Scalability is an issue every software engineer has to keep in the back of their mind when they build a system. Often, the best products are the ones who consider scalability from the very start.

Scalability leads to success

What Sleep-Sort offered me, and what I hope it gave you, is a sobering reminder that what we believe works in a theoretical world may not work in a physical one. This statement seems obvious, but many create products that assume ideal situations and fail to account for scenarios where certain assumptions don't hold true.

We need to take a step back and understand to the core how our system works. Furthermore, when we scale our solutions up for larger or different data sets, we need to understand how functionality is affected when we do so. The only way to gain this insight is to acknowledge the limitations of your environment, and the limitations of your machine.

At the end of the day, your machine is still comprised of moving mechanical parts. Design your systems accordingly.

I hope this article was helpful in understanding Sleep-Sort. Check out my Sleep-Sort repo if you'd like to try/explore the sort for yourself! Above all, if you feel like I made any factual errors or if there's anything I can clarify/expand/modify to improve this post, please comment or reach out to me!

Posted on Oct 28 '17 by:

sishaarrao profile

Sishaar Rao

@sishaarrao

Hello! My name is Sishaar Rao, and I'm currently at UC Berkeley studying EECS! I love watching movies, eating Taco Bell, and completing 20% of an idea for a product before moving on to something new!

Discussion

markdown guide
 

This article started as a joke or funny idea, but it ended up as a serious analysis of why we should pay attention into taking the right path when developing.
I liked it a lot, bravo!

 

Forgive me if I'm wrong, but doesn't the sleep() command effectively take log(n) time? The OS needs a way to determine which process to run next, and presumably that's stored in some sort of priority queue, otherwise each context switch is an O(n) operation. For the sake of simplicity, we can assume the priority queue is just a sorted list where the first conversation is the one with the shortest sleep time. Inserting a new item into a sorted list is a binary search that takes O(log(n)) time. Given that assumption, sleep sort is just another O(n log n) sorting algorithm.

 

As a thought experiment, pretend the sleep command's argument isn't measured in milliseconds but rather in trillions of years. To sort a list of numbers from 1 to 100, you'd need 100 trillion years. But, after the initial list is processed, the sorting is effectively done -- no need to wait even 1 trillion years. Also, to simplify things, let's pretend the system is single-threaded (after all, the actual number of threads is a constant and so gets removed from big-O notation anyway).

There are two possibilities:
1) The sleeping processes are stored in a sorted list. After each trillion years, the CPU pulls the next process from the front of the list. This means that printing every item is only an O(n) operation. Magical! But wait, the list we started with was already sorted (even after 0 trillion years), so actually we're not using sleep sort at all, but rather insertion sort into a queue of processes. This sorting algorithm is O(n log n)
2) The sleeping processes are stored in memory in the order they were added. After each trillion years, the CPU loops through every process to find the next one to execute. Each one is an O(n) operation and it will happen n times in 100 trillion years. This sorting algorithm is O(n ^ 2)

(Sorry I got pretty excited thinking about this)

 

Hey Keaton! This sort of discussion is the same I was having with my peers, where we were debating how exactly you would go about determining Big O for such a sort, and we concluded that we'd need to know a lot more about the system that such a sort is running on. This essentially leads to the crux of my argument in this post, which was that we could theoretically state how functionality is affected with scale, but that reality is much different from theory.

The way I approached a Big O argument for such a sort was that the sleep() call (in UNIX at least) places a Non-Runnable flag on the thread, meaning that the task scheduler would more or less gloss over it when allocating time slices. My assumption was that when you called the sort() function, it would create one consolidated thread where that function is executed, first with the sleep() command followed by the echo command. Therefore, for n elements, there are n threads.

These approach is flawed because we don't know enough about the system's task scheduler, namely how does it store and traverse its threaded elements, and how does it sort these. My impression was that threads are not sorted by the contents of the code to be executed, but rather by the timeslices or some other inherent value that is determined without running the code. Chances are this is wrong.

Therefore, my answer really boils down to: your argument of Big O is as good as mine.

There's a fundamentally deeper level of understanding that I feel I need in order to properly assess what the runtime for such a sort is, but even after discussing with some of my teachers, they answer they gave was that Sleep-Sort is unique because it's an entirely different flavor of sort when compared to traditional sorts (Quicksort, Mergesort, etc.) As a result, using Big O analysis is not very helpful, because we've now widened the range of factors we have to consider. These range of factors are all physical characteristics (how are tasks sorted, how does the scheduler determine which one to allocate time to next, etc.), so it really doesn't help to analyze a theoretical system in this case.

He pointed me in the direction of Radix Sort and said that it's a good exploration of how Big O can get interesting when we use a theoretical concept to analyze physical models, and how we can find its shortcomings.

I hope my insight helps, and I'd definitely check out this thread, as there was some interesting discussion about Sleep-Sort's runtime!

Absolutely, this is a really interesting thought experiment. It's similar to the experiment of sorting unique numbers from 1 to N by using an array of N length and simply setting Array[number] = true for each value and reading off the true ones in order. This is, technically, O(n).

Of course, neither of these sorting algorithms is actually practical, but it definitely shows that O(n log n) is not the end-all be-all for sorting, at least not in all cases.

This is actually a really interesting point which, I find, many junior engineers gloss over when reciting lower bounds for sorting problems. It fully depends on the problem in question!

Your example of Array[number] = true (or the more simplified version of Array[number] = number) is can be kind of practical, even when you extend it to be sorting k unique numbers from 1 N. This "counting-sort" can be thought of as a special case of radix-sort, which itself can be thought of a special case for the class of probabilistic sorting algorithms (an example of which is bucket sort) where you know some information about the set being sorted.

Basically, the theorem which places the lower-bound on sorting algorithms to be O(nlogn) is based on the assumption that the set being sorted is any totally ordered set. The "any" part of that assumption means you cannot assume prior knowledge about the set. But, if that assumption is invalid (which it is most of the time - how often in practice do you know nothing about the data you are working with) you can produce a sorting algorithm which works better than O(nlogn).

 

Correct me but wouldn't this algorithm have o(x) time where x is the largest value in the list? If one of the values is 300,000, that item would be printed 83 hours later.

 

You are correct.

The algorithm is at best O(N) where N is the maximum sized number. However, generally algorithmic analysis is concerned with input size so its EXPONENTIAL with respect to the number of bits in N.

If you are able to modify the algorithm to just take the next highest number, then essentially what you have here is HeapSort developed in a very inefficient way, using OS threads to implement your heap.

So I'd say the algorithm is at least O(N log N) and at worst O(2b N logN) where b is the number of bits in the max number B that may show up in the array.

That is much worse than bubblesort for large B's.

 

I was thinking the same thing... O(max(n)), assuming that context switching is significantly smaller than min(n).

 

It feels like this would not always sort correctly if you account for processing time.

Example. If the list has 1 million items and the first is 0.2 and the last is 0.1. The processing time may be larger than the sleep time of the first item meaning the last item would be out of order. This is slightly different to the failure cases written above.

 

Once a dev had a problem "I'll solve it with threads", he thougth. problem Now that 2 has dev