Brian Neville-O'Neill

Posted on Nov 4, 2019 • Originally published at blog.logrocket.com on Nov 4, 2019

Radix sort: No comparisons required

#javascript #webdev #programming

Written by Glad China✏️

Sorting (arranging data in a particular sequence or order) is a very important operation in computer science, and as such, it is very rare to talk about computer algorithms without mentioning sorting algorithms. Practically speaking, there are so many ways in which data can be sorted, which is why so many sorting algorithms exist — merge sort, quicksort, insertion sort, heap sort, etc.

The efficiency of a sorting algorithm when compared with another may vary based on the initial condition of the data set — nearly sorted, sorted in reverse order, contains duplicates, etc. Likewise, some sorting algorithms are more efficient than others for larger data sets.

In this tutorial, however, we will consider a special kind of sorting algorithm called radix sort. We will take a look at how it works and how we can implement it with JavaScript.

Is comparison required?

Most of the popular sorting algorithms perform their sort by comparing items (what item is larger than the other) in the data set, which is likely the most logical approach when it comes to arranging items in sequence. Consider this list of numbers:

75, 48, 137, 61, 206, 43, 8, 239, 124

If we were to sort this list using the insertion sort algorithm, for example, we will iterate through the items starting with the second item (48) and then try to place each item in its correct sorted position by looking backwards at the elements before it, which usually requires some comparison.

Below are the results after each iteration of the insertion sort (the results for nested iterations are not shown).

75, 48, 137, 61, 206, 43, 8, 239, 124
48, 75, 137, 61, 206, 43, 8, 239, 124
48, 75, 137, 61, 206, 43, 8, 239, 124
48, 61, 75, 137, 206, 43, 8, 239, 124
48, 61, 75, 137, 206, 43, 8, 239, 124
43, 48, 61, 75, 137, 206, 8, 239, 124
8, 43, 48, 61, 75, 137, 206, 239, 124
8, 43, 48, 61, 75, 137, 206, 239, 124
8, 43, 48, 61, 75, 124, 137, 206, 239

Since most of the efficient sorting algorithms require some form of comparison between items, does it mean that comparison is always required for sorting? Well, the answer is no. When the data set contains only integers, especially, it is possible to sort the items without comparing them — using radix sort.

Radix sort

Radix sort sorts items by grouping them into buckets according to their radix. This makes radix sort ideal for sorting items that can be ordered based on their component digits or letters, such as integers, words, etc. The grouping into buckets does not involve any comparisons.

The radix sort algorithm starts the grouping into buckets with either the least or most significant digit of each item of the data set, and then collapses the items in the buckets into a new data set containing items that are sorted based on the digit at the start position — this is the first iteration. The process is repeated for the other digits in each item until the data set is completely sorted.

Radix sort example

Using our previous data set, below are the step-by-step results after each iteration of the radix sort until the data set is fully sorted.

// Initial data set
[75, 48, 137, 61, 206, 43, 8, 239, 124]

/* START ITERATION(#1) */
// 1. Group into buckets based on unit digit
// 2. Collapse items in buckets to form new data set
[[], [61], [], [43], [124], [75], [206], [137], [48, 8], [239]]
[61, 43, 124, 75, 206, 137, 48, 8, 239]
/* END ITERATION(#1) */

/* START ITERATION(#2) */
// 1. Group into buckets based on tens digit
// 2. Collapse items in buckets to form new data set
[[206, 8], [], [124], [137, 239], [43, 48], [], [61], [75], [], []]
[206, 8, 124, 137, 239, 43, 48, 61, 75]
/* END ITERATION(#2) */

/* START ITERATION(#3) */
// 1. Group into buckets based on hundreds digit
// 2. Collapse items in buckets to form new data set
[[8, 43, 48, 61, 75], [124, 137], [206, 239], [], [], [], [], [], [], []]
[8, 43, 48, 61, 75, 124, 137, 206, 239]
/* END ITERATION(#3) */

// Final sorted data set
[8, 43, 48, 61, 75, 124, 137, 206, 239]

You can see from the step-by-step process above that radix sort does not compare items at any point — no comparisons required. However, here are a few things to note from the above example:

Only positive integers

All items in the data set are positive integers. It is important to note that radix sort cannot be used to sort a data set containing non-integers (numbers with decimals). However, radix sort can be implemented to sort a data set consisting of both positive and negative integers.

Starts with the least significant digit

The first iteration groups the items into buckets based on their least significant digit, and then the iteration continues towards the most significant digit of each item. However, radix sort can be implemented to start the first iteration with the most significant digits instead.

Uses 10 buckets

On each iteration, 10 buckets are used because we are dealing with decimal (base 10) numbers. The buckets map to their corresponding digits in sequential order (0–9). Therefore, the number of buckets to be used depends on the radix (base) of the number system used for the items.

It is also important to notice that some buckets are empty for some iterations, which means that memory was allocated but never used to store anything — good optimization starting point.

Radix sort algorithm

Now that we have seen a simple example that demonstrates sorting a data set using radix sort, we can go ahead and describe the complete algorithm for radix sort as follows:

Get the maximum digits count of the largest number
Loop from k = 0 up to the maximum digits count. For each iteration:
- Create buckets for each digit (10 buckets for 0–9)
- Loop through the items, grouping them into buckets based on their _k_th digits.
- Collapse the items in the buckets (in order) to a flat array and update the current array reference with the new array
Return the sorted array

The algorithm above requires some helper functions to make the implementation seamless. So before we move on to implement radix sort, let’s define a couple of helper functions in the next section.

Radix sort helper functions

`asInteger()`

The first helper function is asInteger(), which is a simple utility function we will be using in subsequent helper functions. It takes a number as its argument, removes the decimal portion of the number using Math.trunc(), and returns the absolute (positive) representation of the result using Math.abs(). For example, asInteger(3.226) should return 3, while asInteger(-12.035) should return 12.

function asInteger(num) {
  return Math.abs(Math.trunc(num));
}

`digitAtPosition()`

The second helper function is digitAtPosition(), which takes a number (integer) and a zero-based position (integer) as its first and second arguments, and returns the digit at that position. The unit digit is at position 0, the tens digit at position 1, the hundreds digit at position 2, etc. For example, digitAtPosition(3705, 2) should return 7, since 7 is the hundreds digit of 3705.

function digitAtPosition(num, pos) {
  return Math.floor(asInteger(num) / Math.pow(10, asInteger(pos))) % 10;
}

This function uses the asInteger() function defined earlier to normalize the number input and the position input. It uses the truncated position integer to get a power of 10 with which to divide the number. Finally, it floors the result and returns the remainder when divided by 10.

`digitsCount()`

The third helper function is digitsCount(), which takes a number (integer) as its argument and returns the number of significant digits the integer has. For example, digitsCount(3705) should return 4, because 3705 has 4 significant digits: 3, 7, 0, and 5.

function digitsCount(num) {
  return ((num = asInteger(num)) === 0) ? 1 : Math.floor(Math.log10(num)) + 1;
}

Notice, again, that this function uses the asInteger() function defined earlier to ensure the number is properly truncated to a positive integer. It also uses Math.log10() to get the approximate power of 10 that equals the truncated number. To get the number of digits, it floors the logarithm using Math.floor() and adds 1 to the result.

Using Math.log10() introduces an edge case. When the input number is 0, it returns -Infinity. To handle this, the digitsCount() function returns 1 if the truncated number is 0, otherwise, it does the computations described above and returns the result.

`maxDigitsCount()`

The last helper function is maxDigitsCount(), which takes an array of numbers (integers) and returns the digitsCount() for the integer(s) in the array that have highest number of significant digits. For example, maxDigitsCount([12, 5, 3048, 620]) should return 4, since 3048 is the number in the array that has the highest number of significant digits (4).

function maxDigitsCount(nums) {
  return nums.reduce((max, num) => Math.max(max, digitsCount(num)), 0);
}

This function simply reduces the array of numbers passed to it and returns the final max value returned by the reducer function. It uses the digitsCount() function inside the reducer function to get the number of digits and update the max digits count as required.

Radix sort implementation

With our helper functions in place, we can now implement the radixSort() function. But just before we do that, it is important to note that our version of radix sort can only correctly sort a data set containing positive integers.

That said, the following code snippet shows our implementation of the radix sort algorithm:

function radixSort(arr) {
  const len = arr.length; // the length of the array
  const max = maxDigitsCount(arr); // the maximum digits count

  for (let k = 0; k < max; k++) {
    // initialize the buckets again for grouping
    // create an array of 10 buckets (one for each digit)
    const buckets = Array(10).fill([]);

    for (let i = 0; i < len; i++) {
      // get the digit at the kth position of the number
      // and push the number into the corresponding bucket
      // based on that digit
      buckets[digitAtPosition(arr[i], k)].push(arr[i]);
    }

    // collapse the items in the buckets to a flat array
    // updating the old array reference with the flat array
    // and continue to the next iteration
    arr = [].concat(...buckets);
  }

  // return the final sorted array
  return arr;
}

The implementation in itself is very simple and straightforward. However, there are a few portions of the code worth highlighting.

Creating buckets

The buckets are recreated (reset) at the beginning of each iteration. The buckets array, when recreated, consists of 10 empty arrays (one for each base-10 digit, 0–9). Here, we are using Array.prototype.fill() to fill the slots with empty arrays. However, here are some other ways you could do that:

// using spread operator and Array.prototype.map()
const buckets = [...Array(10)].map(() => []);

// using Array.from() and Array constructor, with map function
const buckets = Array.from(Array(10), () => []);

// using Array.from() and array-like object, with map function
const buckets = Array.from({ length: 10 }, () => []);

Pushing items to buckets

Inside the nested for loop, we are getting the digit at the _k_th position of the current number and also pushing into the correct bucket based on that digit. Given that the current number is 137 (arr[i] = 137) and the current digit position is 1 (k = 1), then this is what it looks like:

buckets[digitAtPosition(arr[i], k)].push(arr[i]);
// => buckets[digitAtPosition(137, 1)].push(137);
// => buckets[3].push(137);

Collapsing items in buckets

The items in the buckets are collapsed to a flat array at the end of each iteration and used to update arr. Here we are using Array.prototype.concat() to flatten the buckets array. It is important to pay attention to how the spread operator was used here:

const buckets = [[], [61], [], [43], [124], [75], [206], [137], [48, 8], [239]];

/* without spread operator */
[].concat(buckets); // [[], [61], [], [43], [124], [75], [206], [137], [48, 8], [239]]

/* with spread operator(...) */
[].concat(...buckets); // [61, 43, 124, 75, 206, 137, 48, 8, 239]

Sorting in alphabetical order

Let’s take our radix sort one step further. Let’s say we have a list of words that we want to arrange in alphabetical order. We can achieve this using radix sort. Here is a modified version of our radix sort function from before that sorts a list of words in alphabetical order.

const radixSortAlphabetical = (() => {
  const PADDING_CHAR = '_';
  const REPLACE_REGEX = /[^a-z]/ig;

  const CHARS = [PADDING_CHAR].concat([
    'a','b','c','d','e','f','g','h','i','j','k','l','m',
    'n','o','p','q','r','s','t','u','v','w','x','y','z'
  ]);

  function _maxStringLength(arr) {
    return arr.reduce((max, str) => Math.max(max || 0, str.replace(REPLACE_REGEX, '').length));
  }

  function _charAtPosition(str, pos, maxlength = pos) {
    str = str.replace(REPLACE_REGEX, '').toLowerCase();
    str += PADDING_CHAR.repeat(maxlength - str.length);
    return str.slice(-(pos + 1))[0];
  }

  return function _radixSort(arr) {
    const len = arr.length;
    const maxlength = _maxStringLength(arr);

    for (let k = 0; k < maxlength; k++) {
      const buckets = {};

      for (let i = 0; i < len; i++) {
        const char = _charAtPosition(arr[i], k, maxlength);
        buckets[char] = (buckets[char] || []).concat(arr[i]);
      }

      arr = CHARS.reduce((arr, char) => arr.concat(buckets[char] || []), []);
    }

    return arr;
  }
})();

Here, we used an immediately invoked function expression to encapsulate the sorting logic and return the sort function. The logic is quite similar to what we had before for integers, but with some minor differences to handle alphabets. Here are some of the modifications made:

Padded strings

During each iteration, each string is padded at the end with a padding character (underscore in this case) until the length of the string reaches the length of the longest string in the data set. This is to ensure that all the strings are of equal length before the grouping is done.

Characters sequence

The characters sequence contains only alphabetical characters in order (from a–z). However, the padding character (underscore in this case) comes before the letters in the characters sequence. This effectively means that all strings in the data set must contain only alphabetical characters for the sort to be predictable.

Buckets object

An object was used here to group the items into buckets. The characters are used as keys and the array of items as values. If there are no items in the group for a character, it is taken to be an empty array.

Grouping from last character

After the strings have been padded, the grouping starts with the last character in the string up to the first character. Note that because shorter strings are padded at the end, their last character will initially be the padding character.

Our radixSortAlphabetical() function works best when all the strings contain only alphabetical characters. Its behavior is highly unpredictable when other characters like numbers and symbols are present. However, the function can be improved to scale beyond some of these limitations.

Conclusion

Radix sort is a non-comparative sorting algorithm unlike the popular comparison sorts. At worst, the time complexity for the radix sort is O(k•n) where k is the number of iterations and n is the number of items, which is linear and preferable to sorts with logarithmic complexity.

However, the performance of the radix sort is heavily influenced by variations in the digits count or component size of the items. Radix sort uses a lot of space in creating new arrays or objects for grouping items.

Also, it does not sort the array in place, but returns a sorted copy of the array. Hence, for very large data sets, where space optimization is a requirement, you should consider other sorting algorithms. Though we were able to come up with basic implementations of radix sort in this tutorial, it is possible to improve the implementations to scale beyond most of the inherent limitations.

Thanks for making time to go through this tutorial. I am really glad that you made it to the end, and do hope it was worth your time.

Editor's note: Seeing something wrong with this post? You can find the correct version here.

Plug: LogRocket, a DVR for web apps

LogRocket is a frontend logging tool that lets you replay problems as if they happened in your own browser. Instead of guessing why errors happen, or asking users for screenshots and log dumps, LogRocket lets you replay the session to quickly understand what went wrong. It works perfectly with any app, regardless of framework, and has plugins to log additional context from Redux, Vuex, and @ngrx/store.

In addition to logging Redux actions and state, LogRocket records console logs, JavaScript errors, stacktraces, network requests/responses with headers + bodies, browser metadata, and custom logs. It also instruments the DOM to record the HTML and CSS on the page, recreating pixel-perfect videos of even the most complex single-page apps.

Try it for free.

The post Radix sort: No comparisons required appeared first on LogRocket Blog.

DEV Community