Cover art: Brett Zeck on Unsplash
This blog post is about sorting things in JavaScript: simple things like arrays or objects. This isn't about Big-O, complex algorithmic sorting, or anything more than we can test out in the terminal with node.
Why write about sorting
Sorting is one of those fundamental functions of front end design that is so ubiquitous that it's easy to overlook. I realized I wasn't sure how best to sort a simple set of data while the user waits, so I decided I'd make some practical notes on tackling simple sorting tasks in JavaScript. Then I took it a step further and went down a rabbit hole, and here we both are. Welcome!
What else is out there?
Did you get here by way of search results? Nicely done! I doubt this will be on the first page because there are already excellent articles out there on the aspects of .sort(). A good place to start, as always, is MDN.
Some other worthy reading if this is your first stop:
Why Localecompare can't deal with decimal numbers
Localecompare and speed issues - collator method
Did you leave to do some research and come back? I bet you did. Do you have a better idea of how the standard .sort() works now?
localeCompare and the Intl.Collator
What is .localeCompare()?
String.prototype.localeCompare() is a method which returns a number indicating whether a reference string comes before, after, or is the same as the given string in sort order. (MDN)
The basic syntax is:
localeCompare(compareString)
localeCompare(compareString, locales)
localeCompare(compareString, locales, options)
What is the Intl.Collator?
The Intl.Collator object enables language sensitive string comparison. MDN
For the purposes of this article, suffice it to say that .localeCompare() can be your entry point to the world of the Intl.Collator, and there is cool stuff in there.
The collator allows for specific language and character set variations (locales). [see note 1 below]
What's that mean for sorting? Well, it lets us sort strings and take into account language and character set variations. Let's look at a couple examples.
Default Sorting
First, remember that the standard string sorting functions compare strings by their Unicode values (strictly speaking, UTF-16 code units) and sort based on those. So let's look at those too:
Char | Unicode (hex) |
---|---|
a | 0061 |
A | 0041 |
ä | 00E4 |
n | 006E |
N | 004E |
ñ | 00F1 |
Don't forget! Capitals and lower case letters also have different Unicode values, which means an uppercase A comes before a, which comes before ä.
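If you'd like to check the table yourself, here is a small sketch you can paste into node. The helper name toHex is just something I made up for illustration. Note that ä is 228 in decimal, which is 00E4 in hexadecimal.

```javascript
// Print each character's code point as four hexadecimal digits.
// toHex is a hypothetical helper name, not a built-in.
const chars = ['a', 'A', 'ä', 'n', 'N', 'ñ'];
const toHex = (ch) => ch.codePointAt(0).toString(16).toUpperCase().padStart(4, '0');

const codePoints = chars.map((ch) => `${ch} ${toHex(ch)}`);
console.log(codePoints);
//=> [ 'a 0061', 'A 0041', 'ä 00E4', 'n 006E', 'N 004E', 'ñ 00F1' ]
```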
What happens if we use Array.prototype.sort() on these and sort in place?
arryA = [ 'a', 'A', 'ä' ]
//=> [ 'a', 'A', 'ä' ]
arryA.sort()
//=> [ 'A', 'a', 'ä' ]
arryN = [ 'n', 'N', 'ñ' ]
//=> [ 'n', 'N', 'ñ' ]
arryN.sort()
//=> [ 'N', 'n', 'ñ' ]
We can see it is simply ordering our characters by Unicode value. What about making our sort a bit more...well travelled? A bit more...sensitive to different locales...
Basic alpha sort with .localeCompare()
The construction of .localeCompare() is different from .sort() because it compares one string against another, whereas .sort() sorts an array in place.
'a'.localeCompare('ä')
//=> -1
'a'.localeCompare('a')
//=> 0
'a'.localeCompare('A')
//=> -1
Without any options, .localeCompare() already differs from the basic sort: it places lowercase a before uppercase A, whereas .sort() put A first. Let's add in some sensitivity options:
'a'.localeCompare('ä', undefined, {sensitivity: 'base'})
//=> 0
'a'.localeCompare('a', undefined, {sensitivity: 'base'})
//=> 0
'a'.localeCompare('A', undefined, {sensitivity: 'base'})
//=> 0
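Since .sort() accepts a compare function, these string-to-string comparisons can drive an actual array sort. Here is a minimal sketch, assuming an English ('en') locale and a node build with the usual ICU data; the variable names are mine.

```javascript
const letters = ['ä', 'A', 'a', 'N', 'ñ', 'n'];

// Default sort compares UTF-16 code units, so capitals come first
// and accented letters land at the end.
const byCodeUnit = [...letters].sort();

// Locale-aware sort: hand localeCompare to sort() as the comparator.
// The 'en' locale is an assumption made for this example.
const byLocale = [...letters].sort((x, y) => x.localeCompare(y, 'en'));

console.log(byCodeUnit);
//=> [ 'A', 'N', 'a', 'n', 'ä', 'ñ' ]
console.log(byLocale);
//=> [ 'a', 'A', 'ä', 'n', 'N', 'ñ' ]
```

Copying with [...letters] keeps the original array untouched, since .sort() works in place.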
Let's look at each piece of the .localeCompare() call and talk about what is going on.
'string1'.localeCompare('string2', 'en', {sensitivity: 'base'} )
- string1 and string2 are our strings to compare
- 'en' is English, the language set to use for our comparison
- {sensitivity: 'base'} is the level of sensitivity that JavaScript will apply to the comparison. 'base' allows for letters of the same base to be evaluated equivalently, disregarding things like umlauts or capitalization: an A is an a is an ä (in this specific case at least). There are a few other sensitivity options, see all the options here.
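To get a feel for the other sensitivity options, here is a rough sketch that runs the same two comparisons at each of the four levels ('base', 'accent', 'case', 'variant'). It assumes the 'en' locale; the results object and its key names are just for illustration.

```javascript
const levels = ['base', 'accent', 'case', 'variant'];
const results = {};

for (const sensitivity of levels) {
  results[sensitivity] = {
    // Differ only by accent (umlaut)
    aVsUmlaut: 'a'.localeCompare('ä', 'en', { sensitivity }),
    // Differ only by case
    aVsUpper: 'a'.localeCompare('A', 'en', { sensitivity }),
  };
}

console.log(results);
// base:    both comparisons are 0
// accent:  a vs ä is negative, a vs A is 0
// case:    a vs ä is 0, a vs A is negative
// variant: both comparisons are negative
```

In other words, 'accent' ignores case only, 'case' ignores accents only, and 'variant' ignores neither.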
Ok, so we're seeing that you can use .localeCompare() to smooth out alphabetical sorting, but ... what about numbers?
Numbers are totally international!
Weirdly enough, trying to use .localeCompare() for numeric sorting is what sent me down this road in the first place. My initial research said it wasn't possible, but what I learned is: it works, and it's pretty cool! So, why the hubbub? Well, remember this is String.prototype.localeCompare(), meaning that it really only wants to work on strings, not numbers. But with the right settings you need worry no more about having numbers in your strings (I'm thinking street addresses).
// top examples establish how the comparison works
"a".localeCompare("b")
//=> -1 // "a" comes before "b"
"1".localeCompare("2")
//=> -1 // "1" comes before "2"
"1".localeCompare("1")
//=> 0 // "1" is equal to "1"
"1".localeCompare("0")
//=> 1 // "0" comes before "1"
"1".localeCompare("01")
//=> 1 // "01" comes before "1" // huh, that's weird
// depending on your situation this might be ok, or problematic.
//
// Add in the numeric option
"1".localeCompare("01", undefined, {numeric: true})
//=> 0
"11".localeCompare("11", undefined, {numeric: true})
//=> 0
"11".localeCompare("011", undefined, {numeric: true})
//=> 0
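Here is how that might look for the street address case. This is only a sketch: the addresses are made up, and the 'en' locale is an assumption.

```javascript
// Made-up sample data for illustration.
const addresses = ['10 Main St', '2 Main St', '1 Main St', '21 Main St'];

// Without the numeric option, digits are compared one character at a time,
// so "10" lands before "2".
const plain = [...addresses].sort((x, y) => x.localeCompare(y, 'en'));

// With numeric: true, runs of digits are compared as whole numbers.
const numeric = [...addresses].sort((x, y) =>
  x.localeCompare(y, 'en', { numeric: true })
);

console.log(plain);
//=> [ '1 Main St', '10 Main St', '2 Main St', '21 Main St' ]
console.log(numeric);
//=> [ '1 Main St', '2 Main St', '10 Main St', '21 Main St' ]
```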
Conclusion
Using .localeCompare() for standard string comparison works nicely, and it even works if you're mixing numbers into your strings. I know that I'll be keeping these sorting options available to me if I'm working with anything with the possibility of international addresses!
The Intl.Collator is outside of the scope of this article, but if you're working with data that needs to account for language variations I'd recommend checking it out!
my code coda
1) Thanks for reading, if I got something wrong let me know!
2) There are always things to improve - what could we do better here?
notes
1 The Intl.Collator can perform noticeably better than calling localeCompare() on its own when working with large datasets, because the collator is set up once and reused for every comparison. I'd urge you to take a deep dive into it if you're working with large datasets.
2 In German, a and ä have the same base letter, which means they evaluate as equal with sensitivity: 'base'. In languages like Swedish, which have differing base letters for ä and a, they are evaluated separately. (Spanish collation likewise treats ñ as a letter of its own, distinct from n.)
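Both notes can be tried out with a short sketch. It assumes your node build ships German ('de') and Swedish ('sv') locale data, which recent releases do by default, and the variable names are mine. Each collator is created once and its compare function is then reused.

```javascript
// One collator per locale, built once and reused for every comparison.
const german = new Intl.Collator('de', { sensitivity: 'base' });
const swedish = new Intl.Collator('sv', { sensitivity: 'base' });

const deResult = german.compare('a', 'ä');  // 0: same base letter in German
const svResult = swedish.compare('a', 'ä'); // negative: ä is a separate letter in Swedish

// The compare function can be passed straight to sort().
const words = ['Z', 'a', 'z', 'ä'];
const sortedDe = [...words].sort(new Intl.Collator('de').compare);
const sortedSv = [...words].sort(new Intl.Collator('sv').compare);

console.log(deResult, svResult);
console.log(sortedDe);
//=> [ 'a', 'ä', 'z', 'Z' ]
console.log(sortedSv);
//=> [ 'a', 'z', 'Z', 'ä' ]
```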
Top comments (1)
Hi, I got here because I was trying to understand why .localeCompare seems to work different than .sort when it comes to uppercase vs lowercase.
Your example of 'a'.localeCompare('A') was supposed to give a result of 1, but actually it gives a -1.
Just wanted to say that.