As a thought experiment, pretend the sleep command's argument isn't measured in milliseconds but rather in trillions of years. To sort a list of numbers from 1 to 100, you'd need 100 trillion years. But, after the initial list is processed, the sorting is effectively done -- no need to wait even 1 trillion years. Also, to simplify things, let's pretend the system is single-threaded (after all, the actual number of threads is a constant and so gets removed from big-O notation anyway).

There are two possibilities:
1) The sleeping processes are stored in a sorted list. After each trillion years, the CPU pulls the next process from the front of the list. This means that printing every item is only an O(n) operation. Magical! But wait, the list we started with was already sorted (even after 0 trillion years), so actually we're not using sleep sort at all, but rather insertion sort into a queue of processes. This sorting algorithm is O(n log n)
2) The sleeping processes are stored in memory in the order they were added. After each trillion years, the CPU loops through every process to find the next one to execute. Each one is an O(n) operation and it will happen n times in 100 trillion years. This sorting algorithm is O(n ^ 2)

Hey Keaton! This sort of discussion is the same I was having with my peers, where we were debating how exactly you would go about determining Big O for such a sort, and we concluded that we'd need to know a lot more about the system that such a sort is running on. This essentially leads to the crux of my argument in this post, which was that we could theoretically state how functionality is affected with scale, but that reality is much different from theory.

The way I approached a Big O argument for such a sort was that the sleep() call (in UNIX at least) places a Non-Runnable flag on the thread, meaning that the task scheduler would more or less gloss over it when allocating time slices. My assumption was that when you called the sort() function, it would create one consolidated thread where that function is executed, first with the sleep() command followed by the echo command. Therefore, for n elements, there are n threads.

These approach is flawed because we don't know enough about the system's task scheduler, namely how does it store and traverse its threaded elements, and how does it sort these. My impression was that threads are not sorted by the contents of the code to be executed, but rather by the timeslices or some other inherent value that is determined without running the code. Chances are this is wrong.

Therefore, my answer really boils down to: your argument of Big O is as good as mine.

There's a fundamentally deeper level of understanding that I feel I need in order to properly assess what the runtime for such a sort is, but even after discussing with some of my teachers, they answer they gave was that Sleep-Sort is unique because it's an entirely different flavor of sort when compared to traditional sorts (Quicksort, Mergesort, etc.) As a result, using Big O analysis is not very helpful, because we've now widened the range of factors we have to consider. These range of factors are all physical characteristics (how are tasks sorted, how does the scheduler determine which one to allocate time to next, etc.), so it really doesn't help to analyze a theoretical system in this case.

He pointed me in the direction of Radix Sort and said that it's a good exploration of how Big O can get interesting when we use a theoretical concept to analyze physical models, and how we can find its shortcomings.

I hope my insight helps, and I'd definitely check out this thread, as there was some interesting discussion about Sleep-Sort's runtime!

Absolutely, this is a really interesting thought experiment. It's similar to the experiment of sorting unique numbers from 1 to N by using an array of N length and simply setting Array[number] = true for each value and reading off the true ones in order. This is, technically, O(n).

Of course, neither of these sorting algorithms is actually practical, but it definitely shows that O(n log n) is not the end-all be-all for sorting, at least not in all cases.

This is actually a really interesting point which, I find, many junior engineers gloss over when reciting lower bounds for sorting problems. It fully depends on the problem in question!

Your example of Array[number] = true (or the more simplified version of Array[number] = number) is can be kind of practical, even when you extend it to be sorting k unique numbers from 1 N. This "counting-sort" can be thought of as a special case of radix-sort, which itself can be thought of a special case for the class of probabilistic sorting algorithms (an example of which is bucket sort) where you know some information about the set being sorted.

Basically, the theorem which places the lower-bound on sorting algorithms to be O(nlogn) is based on the assumption that the set being sorted is any totally ordered set. The "any" part of that assumption means you cannot assume prior knowledge about the set. But, if that assumption is invalid (which it is most of the time - how often in practice do you know nothing about the data you are working with) you can produce a sorting algorithm which works better than O(nlogn).

## re: Sleep Sort: Where Theory meets Sobering Reality VIEW POST

TOP OF THREAD FULL DISCUSSIONAs a thought experiment, pretend the sleep command's argument isn't measured in milliseconds but rather in trillions of years. To sort a list of numbers from 1 to 100, you'd need 100 trillion years. But, after the initial list is processed, the sorting is effectively done -- no need to wait even 1 trillion years. Also, to simplify things, let's pretend the system is single-threaded (after all, the actual number of threads is a constant and so gets removed from big-O notation anyway).

There are two possibilities:

1) The sleeping processes are stored in a sorted list. After each trillion years, the CPU pulls the next process from the front of the list. This means that printing every item is only an O(n) operation. Magical! But wait, the list we started with was already sorted (even after 0 trillion years), so actually we're not using sleep sort at all, but rather insertion sort into a queue of processes.

This sorting algorithm is O(n log n)2) The sleeping processes are stored in memory in the order they were added. After each trillion years, the CPU loops through every process to find the next one to execute. Each one is an O(n) operation and it will happen n times in 100 trillion years.

This sorting algorithm is O(n ^ 2)(Sorry I got pretty excited thinking about this)

Hey Keaton! This sort of discussion is the same I was having with my peers, where we were debating how exactly you would go about determining Big O for such a sort, and we concluded that we'd need to know a lot more about the system that such a sort is running on. This essentially leads to the crux of my argument in this post, which was that we could theoretically state how functionality is affected with scale, but that reality is much different from theory.

The way I approached a Big O argument for such a sort was that the sleep() call (in UNIX at least) places a Non-Runnable flag on the thread, meaning that the task scheduler would more or less gloss over it when allocating time slices. My assumption was that when you called the sort() function, it would create one consolidated thread where that function is executed, first with the sleep() command followed by the echo command. Therefore, for n elements, there are n threads.

These approach is flawed because we don't know enough about the system's task scheduler, namely how does it store and traverse its threaded elements, and how does it sort these. My impression was that threads are not sorted by the contents of the code to be executed, but rather by the timeslices or some other inherent value that is determined without running the code. Chances are this is wrong.

Therefore, my answer really boils down to:

your argument of Big O is as good as mine.There's a fundamentally deeper level of understanding that I feel I need in order to properly assess what the runtime for such a sort is, but even after discussing with some of my teachers, they answer they gave was that Sleep-Sort is unique because it's an entirely different flavor of sort when compared to traditional sorts (Quicksort, Mergesort, etc.) As a result, using Big O analysis is not very helpful, because we've now widened the range of factors we have to consider. These range of factors are all physical characteristics (how are tasks sorted, how does the scheduler determine which one to allocate time to next, etc.), so it really doesn't help to analyze a theoretical system in this case.

He pointed me in the direction of Radix Sort and said that it's a good exploration of how Big O can get interesting when we use a theoretical concept to analyze physical models, and how we can find its shortcomings.

I hope my insight helps, and I'd definitely check out this thread, as there was some interesting discussion about Sleep-Sort's runtime!

Absolutely, this is a really interesting thought experiment. It's similar to the experiment of sorting unique numbers from 1 to N by using an array of N length and simply setting Array[number] = true for each value and reading off the true ones in order. This is, technically, O(n).

Of course, neither of these sorting algorithms is actually practical, but it definitely shows that O(n log n) is not the end-all be-all for sorting, at least not in all cases.

This is actually a really interesting point which, I find, many junior engineers gloss over when reciting lower bounds for sorting problems. It fully depends on the problem in question!

Your example of Array[number] = true (or the more simplified version of Array[number] = number) is can be kind of practical, even when you extend it to be sorting k unique numbers from 1 N. This "counting-sort" can be thought of as a special case of radix-sort, which itself can be thought of a special case for the class of probabilistic sorting algorithms (an example of which is bucket sort) where you know some information about the set being sorted.

Basically, the theorem which places the lower-bound on sorting algorithms to be O(nlogn) is based on the assumption that the set being sorted is any totally ordered set. The "any" part of that assumption means you cannot assume prior knowledge about the set. But, if that assumption is invalid (which it is most of the time - how often in practice do you know nothing about the data you are working with) you can produce a sorting algorithm which works better than O(nlogn).