Vaidehi Joshi

Posted on Sep 23, 2017 • Originally published at Medium on Jul 24, 2017

Getting To The Root Of Sorting With Radix Sort

#algorithms #sortingalgorithms #computerscience #theoryinpractice

Today marks the very last sorting algorithm that we’re going to cover in this series. You are free to (finally) breathe a sigh of relief!

Sorting algorithms are the bread and butter (okay, maybe more like the kale) of computer science. They’re good for us to know, and we should chew on them every once in awhile, but they’re not always the most fun things to learn. Hopefully, this series has made them more enjoyable.

But as is the case with most things that are good for you, there’s a reason why sorting algorithms have left such an indelible mark on the history of computing. As we finish of this section on sorting, we’ll learn a little bit about the very first sorting algorithm: radix sort! We’re well-equipped to learn about this particular algorithm, and it seems like the perfect one to close with. What better way to understand the power, importance, and rich history of sorting algorithms than by learning about the one that started it all?

So, let’s get sortingâ€Š–â€Šone last time!

A rooting interest in the radix

Before we get into the inner workings of radix sort and how it works, let’s first understand what the word radix actually means. As it turns out, if you’ve been reading this series from the beginning, you already know what it means.

Months ago, when we were learning about bits, bytes, and building with binary, we learned about the concept of a “base”, which represents how many digits are possible in a single place value. We also discovered that different number systems have different bases.

Well, that’s exactly what a radix is! Radix is just another mathematical term for the base of a number. Based upon this logic, we can say that a hexadecimal number has a radix of 16â€Š–â€Šsimilar to how we’d say that the hexadecimal number system is a base-16 number system.

The term radix has an interesting backstory. It’s etymology comes directly from Latin; in Latin, radix translates directly to root, which makes sense when you think about the root of a number and a number system being derived directly from its radix, and the total number of possible digits for a number’s place values.

The word radix comes from the Latin word forÂ “root”.

Now, how does the term radix tie back into radix sort? Well, if you’re guessing that radix sort has something to do with the base or digits of a numberâ€Š–â€Šyou’re totally right.

The radix sorting algorithm is an integer sorting algorithm, that sorts by grouping numbers by their individual digits (or by their radix). It uses each radix/digit as a key, and implements counting sort or bucket sort under the hood in order to do the work of sorting.

If this sounds slightly complicated, don’t be dismayed! The super cool thing about this particular algorithm is that you’ve actually probably used it in some capacity in your everyday life already.

Let me show you how.

Let’s say that we have a bunch of words that we want to put in order. Because we’re dealing with words, putting them in order basically just means alphabetizing them! We can handle that, right? We’ve all had to alphabetize something or the other before.

In this example, we want to alphabetize a list of houseplants. I want to buy one for my apartment, and these are the houseplants that are on my shortlist.

How can we alphabetize this list of houseplants?

Okay, so we have six names that we want to alphabetize: Ficus, Agave, Fig, Palm, Aloe, and Fern. Let’s not overthink this too much, and just keep it simple. How would we start putting these names in order?

Well, we’d probably begin by grouping together the names with similar letters, and while we’re doing this, we can start the initial process of ordering.

In other words, we’ll look for all the names that start with A, then any that start with B, C, D, and so on. In our case, we don’t have that many names to sort, so we end up with three buckets: names that start with A, F, and P.

Cool, so we’ve partially sorted these names, but only by their first letters. We’ll notice that we’re not done sorting yetâ€Š–â€Šthe two names in the A bucket, Aloe and Agaveâ€Š–â€Šaren’t in order yet.

So, we’ll need to alphabetize again; this time, we’ll look at the second letter in each word, for any bucket that has multiple elements in it. If we consider Aloe and Agave, we know that Agave will need to be sorted as the first of these two elements, since g comes before l in the alphabet.

Now, we can imagine that we’d need to do the same thing again for the names in the F bucket, since there are three elements there that need to be put in order. We’ll sort by the second letter of Fig, Fern, and Ficus. Since e comes before i, we’ll stick Fern as the first element. But both Fig and Ficus have the same second letter! No worriesâ€Š–â€Šwe’ll just sort by the third letter in order to put them in their correct place.

Notice how we seem to be doing the exact same work, on smaller sections of our input data. You might be starting to see a pattern repeated here; if this were encapsulated into a method, it would be a recursive method call, since we’re doing the exact same sequence of work on a smaller subset of our data.

But more on that later. For now, let’s get back to our alphabetization! Both our A and F buckets are in sorted order. Last, we have our P bucket, which contains just one element: Palm. Hopefully, you remember this from our previous adventures in sorting, but just in case you need a refresher: a single element in a list is always considered sorted, since there is nothing else to compare it to! So, our P bucket is also sorted.

Now that all the names have been sorted by their first, second, and third letters, we can combine them back together again. The key here is joining the names back together in iterative order, starting with the first name in the first bucket.

Once we do this, we end up with a sorted list of houseplant names: Agave, Aloe, Fern, Ficus, Fig, Palm. Now, I still don’t know which of these plants I’ll buy for my house, but at least they’re finally in sorted order. Hooray!

Radix sort handles sorting one digit at aÂ time.

Okay, to be totally honest with you, this was less about alphabetizing plant names and more an example of how radix sort functions under the hood.

The steps that we took in order to sort a list of names is exactly the same set of steps that radix sort takes in order to sort a set of integers! Radix sort handles sorting by implementing counting sort (or bucket sort) on one digit at a time. And, it does this all in a very specific order, which we’ll learn more about in a moment. However, you might remember that radix sort is strictly an integer sorting algorithm. So, how can it do the same thing to strings? Well, if we assign an integer value to each letter in a string, we bend strings to fit the rule of integer-only radix sort. For example, we could assign each letter in the word Fig to an integer: F could be 0, i could be 1, and g could be 2. In this way, we can convert strings to work as integers, and then implement radix sort on them.

But, how does radix sort work on integers, exactly? Time to find out!

Putting down (radix) roots

In order to understand how radix sort functions on integers, we need to first decide which form of radix sort we want to look at. That’s rightâ€Š–â€Šthere are two different basic formats of the radix sort algorithm.

In our houseplant alphabetization example, we sorted our names by looking at the first letter of each word to start and grouped words with the same first letter together. Then, if two words shared the same first letter, we repeated this for the second letter of each word in that bucket.

This form of radix sort is known as most significant digit (MSD) radix sort. It functions by processing an integerâ€Š–â€Šor integer representation, if we’re dealing with stringsâ€Š–â€Šfrom the greatest digit, and moving towards the least significant digit as it sorts. In other words, it starts with the largest number first, and then moves on to continue sorting until it reaches the smallest digit. This method uses either counting sort or bucket sort under the hood, and is usually solved by recursion , which is exactly what we were doing when we did the exact same sequence of work on a smaller subset of our houseplant data.

The second form of radix sort is called least significant digit (LSD) radix sort, which works by processing the least significant, or the smallest digit first, moving towards the greater, more significant digit as it continues to sort. This method is similar to MSD in that it also uses counting or bucket sort internally; however, it is usually solved iteratively , and not recursively.

Since we’ve already seen how MSD radix sort works with our houseplant sorting example, let’s look at how LSD radix sort functions with an actual set of integers. In the example below, we’re sorting an array of integers: [10, 52, 5, 209, 19, 44]. Since we know that we’ll be dealing with the radix (or the digits) of each of these numbers, we can rewrite them so that they all have the same number of digits.

Rewriting our array elements to have the same number ofÂ digits

This isn’t part of radix sort, but rather just a way to make it easier for us to see what’s going on in each step of the algorithm.

Since the largest number has three digits, we’ll rewrite our array and add zeros where necessary so that every single element had three digits in it.

Rewritten, our array now looks like this: [010, 052, 005, 209, 019, 044].

Since we’re dealing with least significant digit radix sort, we know that our first step is going to be sorting byâ€Š–â€Šyou guessed it! –the least significant digit.

The least significant digit is the smallest digit in each number; if we’re dealing with base ten integers, the least significant digit will be number in the tens place.

Great, that should be pretty easy to identify. In the example shown here, the least significant digit, the tens place in each integer element, is highlighted in blue.

We’ll use counting sort in order to sort all of these integers; since we learned about counting sort last week, I won’t go into detail about how that’s actually working. If you need a refresher, you can always brush up on counting sort.

After we’ve run our dataset through counting sort using the tens place as our keys, our array is slightly more sorted. It now looks like this: [010, 052, 044, 005, 209, 019] . We’ll notice that it is sorted according to the tens place; in other words, the integer with a 0 in the tens place, 10, is first in the array, whereas the integers with a 9 in the tens place, 209 and 19, both are at the end of the array. We’ll also notice that 19 and 209 appear in the same order as they did in the original array.

However, we want to sort these integers as entire numbers, not just by the tens place! So, we’ll need to sort by the next place value: the hundreds.

In the illustration shown here, the number located at the hundreds place in each of these integers is highlighted in blue. We’ll repeat the same process as before, sorting these integers by running them through counting sort, using the value at the hundreds place in each number as the key for counting sort. After our second pass through our array, it now like this: [005, 209, 010, 019, 044, 052]. Notice how the numbers are now sorted by the second digit. The number 52, which has the largest value for the hundreds place with 5, is now at the end of the array.

Okay, time for the last stepâ€Š–â€Šfinally!

We’ll do the same thing for the last digit, which is the most significant, or the largest base digit: the thousands. In the illustration shown here, the value at each number’s thousands place is, again, highlighted in blue.

After we run counting sort this third time, all of our numbers are completely sorted, and we’re finished sorting for each base of all of our input numbers. Our array now looks like this: [005, 010, 019, 044, 052, 209].

If we clean it up by removing all the extra zeros that we added initially for ease of reading, here’s what our array looks like: [5, 10, 19, 44, 52, 209].

Oh hey! Would you look that? Our array is sorted. Pretty cool, right?

So, how did our input data change as we passed through it each time? Let’s take a look. In each pass through of radix sort, we can see how our array is transformed based upon which specific base digit we were sorting in that pass:

Notice also how we had to make three passes in order to completely sort our input data. The number of passes corresponds directly to the largest number we had in our input data: 209. Sure, we added extra zeros for padding, but ultimately, the reason that we had to make three passes was because at least one number had three digits in it.

This ends up being extremely important to how radix sort functions. It generally works best for a range of smallish numbers, since even just one number with an extra digit results in an entire extra pass through the array.

Radix sort’s time complexity atÂ runtime

In our case, we were dealing with only numbers that had a radix of three. Thus, we had to make three passes in order to sort our dataset. However, we can imagine that if we had to deal with numbers (or even just one number!) that had a radix of five, or ten, or fifteen…well, we’d have to make five, ten, or fifteen passes through our array in order to sort it.

This characteristic of radix sort explains its runtime, which is O(kn), where n is the total number of elements in the array to be sorted, and k is the number of digits, or the radix, of the largest number in the array.

A deep-rooted sorting history

Compared to other sorting algorithms, radix sort is the most like counting sort; if we think about it, it makes sense given the fact that radix sort often implements counting sort under the hood.

We already know that radix sort’s time complexity is O(kn). Generally speaking, it’s rare to find radix sort used unless k is a constant or close to the value of n. If k is a constant amountâ€Š–â€Šthat is to say, if we know the largest possible integer and therefore know it’s length before we run radix sortâ€Š–â€Šthen the actual runtime of the radix sort algorithm is close to linear.

Radix sort is generally implemented as an out-of-place algorithm, since it needs to create a second, copied array in order to handle the work of sorting by hashingâ€Š–â€Šsimilar to counting sort. It also doesn’t require additional external memory, which means that we can classify it as an internal sorting algorithm.

However, it is possible to implement an in-place radix sort algorithm, but such an implementation very quickly becomes similar to quicksort, and it also loses another important property: it’s stability. The radix sort algorithm handles the work of sorting by sorting one digit at a time; this ensures that numbers that appear before other numbers in the input array will maintain that same order in the final, sorted array; this makes radix sort a stable algorithm.

Finally, we know that radix sort is a non-comparison integer sorting algorithm, and that it can be implemented both recursively (MSD radix sort) and non-recursively (LSD radix sort). For the most part, radix sort is usually implemented in the least significant digit format, since this is slightly easier to code than the MSD format.

Usually, I would include a code example of how to implement a sorting algorithm, but given the fact that radix sort is so similar to counting sort, I decided to do something a little different this week!

I mentioned earlier that radix sort was the very first sorting algorithm to be created. Well, it dates back further than any of us probably realize!

The history of sorting algorithms comes from Herman Hollerith, an American inventor living in the late 1800’s. He was tasked with determining the official population count in 1890, and had to tally the census results within six months. He created a machine, termed the Hollerith machine, to help him solve the problem of tallying up and counting all this data.

Hollerith machines, The Art of Computer Programming

Around 1901, Hollerith came up with the idea to leverage his sorting machine and sort his data one column at a time; he would items in the first column, and then re-insert the punch tabulator into the machine and sort by the next column.

According to research in Milestones in Computer Science and Information Technology:

For use with one of his Hollerith machines of the late 1800s, Herman Hollerith developed an algorithm called Radix sort because it depends on multiple sort passes, one for each digit (radix) position in the maximum value number to be sorted.

Hollerith founded his own company, named the Tabulating Machine Company, which eventually became the Computer Tabulating Recording Company (CTR). He worked as an engineer there for many years. And guess what? His very company, CTR, changed its name in 1924 to the International Business Machines Corporation, or IBM.

The algorithm that we learned about today has been around for over a hundred years, making our lives so much easier! And now, we know how it works, and when to use it.

Sorting algorithms have a long, rich history. The story of computing wouldn’t be the same without them! So, next time you find yourself reading about a sorting algorithm, think of Herman Hollerith and the other people who figured out how to sort quickly and efficiently; take a moment to appreciate their hard work and bask in the joy of the fact that, because of them, we don’t have to worry about solving these tough problems anymore.

Resources

Radix sort is one of the most fun algorithms to learn about, if you can find good resources on it. I did some Googling for you, so that you wouldn’t have to lose yourself in the depths of search engine optimization. Here are some good places to continue learning about radix sort, the root of all sorting algorithms. Happy sorting!