I've implemented a custom function to do Quicksort and after about 30000 items it becomes faster than the default one. For a million items the default sort was almost three times slower than the Quicksort implementation. To be fair, I only tested this on Chrome. I have suspicions that the merge sort implementation might be better.
Second, the .sort function is by default alphanumeric. Try it: [1,2,10].sort() will return [1,10,2]. In order to do it numerically you need to hack away at it with [1,2,10].sort((i1,i2)=>i1-i2). In order to sort the array based on the type of item, you need to do something like: [1,2,10].sort((i1,i2)=>i1>i2?1:(i1<i2?-1:0)).
And coming back to the partial sorting, you can't do that with the default implementation, but you can with Quicksort. Simply do not sort any partition that is above and below the indexes you need to get the items from. The increases in time are amazing!
There is a difference between the browser implemented sort algorithms and QuickSort: they are stable. QuickSort does not guarantee the order of items with equal sort keys. Here is an example: [1,3,2,4,5,0].sort(i=>i%2) results in [2,4,0,1,3,5] (first even numbers, then odd, but in the same order as the original array). The QuickSort order depends on the choice of the pivot in the partition function, but assuming a clean down the middle index, the result is [4,2,0,5,1,3]. Note that in both cases the requirement has been met and the even numbers come first.
Top comments (4)
Nice. If you have only integers for keys you may consider Radixsort however it's not stable.
Or you could check the size of the array and if it's less that 42 you could leave the default sort but it it's more that 42 you could use quicksort.
Or you could just use MergeSort if you can live with the O(n) space complexity.
Radix sort has a lot of issues and is less than flexible. I've briefly considered Heap sort, but as in the case of Merge sort, there is the question of a memory complexity that increases with the count. Since I was looking for optimizations on the million item level, I've decided to go with the one that does in-place sorting.
I thought of the idea of using Quick and default based on some measurement, but in my case I didn't know what the size of the iterable was and even if I did, I am losing performance with small counts and gaining performance on large counts. Seems to me I can safely ignore small count performance anyway.
What I liked about Quick sort is that I can do .skip, .take, .skipLast and .takeLast operations on an ordered enumerable only when I am actually iterating the result. In that case, I need to perform a Quick sort only to take items from a start to an end index. This allows me to ignore sorting partitions outside that range. I don't know if it would have worked just as well with Merge sort and I didn't see the point of using Merge sort and compete with the Merge sort implementation in Firefox and Safari.
It looks like there are not a lot of algorithms for partial sorting. But you could see the C++ implementation that uses Heapsort