The Javascript sort is slow and stupid

#programming #javascript #picture

Original post at https://siderite.dev/blog/javascript-sort-slow-and-stupid

I was looking into the concept of partial sorting, something that would help in a scenario where you want the k smaller or bigger items from an array of n items and k is significantly smaller than n. Since I am tinkering with LInQer, my implementation for LINQ methods in Javascript, I wanted to tackle the OrderBy(...).Take(k) situation. Anyway, doing that I found out some interesting things.

First, the default Javascript array .sort function has different implementations depending on browser and by that I mean different sort algorithms. Chrome uses insertion sort and Firefox uses merge sort. None of them is Quicksort, the one that would function best when the number of items is large.

I've implemented a custom function to do Quicksort and after about 30000 items it becomes faster than the default one. For a million items the default sort was almost three times slower than the Quicksort implementation. To be fair, I only tested this on Chrome. I have suspicions that the merge sort implementation might be better.

I was reporting in a previous version of this post that QuickSort, implemented in Javascript, was faster than the default .sort function, but it seems this was only an artifact of the unit tests I was using. Afterwards, I've found an optimized version of quicksort and it performed three times faster on ten million integers. I therefore conclude that the default sort implementation leaves a lot to be desired.

Second, the .sort function is by default alphanumeric. Try it: [1,2,10].sort() will return [1,10,2]. In order to do it numerically you need to hack away at it with [1,2,10].sort((i1,i2)=>i1-i2). In order to sort the array based on the type of item, you need to do something like: [1,2,10].sort((i1,i2)=>i1>i2?1:(i1<i2?-1:0)).

And coming back to the partial sorting, you can't do that with the default implementation, but you can with Quicksort. Simply do not sort any partition that is above and below the indexes you need to get the items from. The increases in time are amazing!

There is a difference between the browser implemented sort algorithms and QuickSort: they are stable. QuickSort does not guarantee the order of items with equal sort keys. Here is an example: [1,3,2,4,5,0].sort(i=>i%2) results in [2,4,0,1,3,5] (first even numbers, then odd, but in the same order as the original array). The QuickSort order depends on the choice of the pivot in the partition function, but assuming a clean down the middle index, the result is [4,2,0,5,1,3]. Note that in both cases the requirement has been met and the even numbers come first.

Top comments (4)

Theofanis Despoudis • Jan 20 '20

Nice. If you have only integers for keys you may consider Radixsort however it's not stable.

Or you could check the size of the array and if it's less that 42 you could leave the default sort but it it's more that 42 you could use quicksort.

Or you could just use MergeSort if you can live with the O(n) space complexity.

Costin Manda • Jan 20 '20

Radix sort has a lot of issues and is less than flexible. I've briefly considered Heap sort, but as in the case of Merge sort, there is the question of a memory complexity that increases with the count. Since I was looking for optimizations on the million item level, I've decided to go with the one that does in-place sorting.

I thought of the idea of using Quick and default based on some measurement, but in my case I didn't know what the size of the iterable was and even if I did, I am losing performance with small counts and gaining performance on large counts. Seems to me I can safely ignore small count performance anyway.

What I liked about Quick sort is that I can do .skip, .take, .skipLast and .takeLast operations on an ordered enumerable only when I am actually iterating the result. In that case, I need to perform a Quick sort only to take items from a start to an end index. This allows me to ignore sorting partitions outside that range. I don't know if it would have worked just as well with Merge sort and I didn't see the point of using Merge sort and compete with the Merge sort implementation in Firefox and Safari.

Theofanis Despoudis • Jan 20 '20 • Edited

It looks like there are not a lot of algorithms for partial sorting. But you could see the C++ implementation that uses Heapsort

github.com/gcc-mirror/gcc/blob/d93...

Costin Manda • Jan 20 '20

Sorry, guys! It seems I've spoken too soon. The speed of QuickSort was an artifact of my test data. Partial sorting still needs a custom implementation of Quicksort and that is what I am using in LInQer for cases when orderBy is followed by .take or .skip, but the internal implementation seems to be fast indeed. I still think it could be using a better algorithm, but, as expected really, the C++ implementation of the engine should not be slower than a Javascript one.