Note: This will be a series of articles on sorting Algorithms
Sorting can be defined as the arranging of the elements in a collection or list in increasing or decreasing order of some defined property. There are reasons why we may want to sort data or arrange things in a certain order, one reason could be for better readability of the data and another reason could be just to search or extract information from the data or collection. For example, if we visit the Today’s Deals page on Amazon, there is an option in the form of a dropdown to sort the products on that page by different criteria. The sorting criteria includes; featured, Price-Low to High, Price-High to Low, Discount - Low to High, DIscount High to Low. Selecting any of these options will re-arrange the products by the selected option. In this case, we have used sorting to re-arrange how we want to view the products on the page. Another example of where we might like to keep our data sorted is a Language Dictionary, where we want to keep the words sorted so that searching for a word in the dictionary is easy. Complex computer programs such as searching files on a computer, compressing data and finding the shortest route to a destination are founded upon some form of sorting algorithm.
Now that we understand what sorting is and ways that it can be implemented, we can now dive into the algorithms that perform the sorting. Sorting a list or collection require that the list should be homologous, that is all the elements in the list should all be of the same type. Most times we use a list of integers and typically would sort the list of integers in increasing or decreasing order of value. For example given the following list of integers: 2, 3, 9, 4, 6, we could sort this list based on some property:
Input: 2, 3, 9, 4, 6
A. Output: 2, 3, 4, 6, 9 => sorted based on increasing order of value
B. Output: 9, 6, 4, 3, 2 => sorted based on decreasing order of value
C. Output: 2, 3, 9, 4, 6 => sorted based on increasing order of number of factors
In the first case above, we sorted the list based on the increasing order of the items in the list.
In the second case, we sorted the list based on the decreasing order of the items in the list.
In the third case, we sorted the list based on the increasing order of the number of factors that each integer has. We can also sort a list of strings or words in lexicographical order and may also need to sort very complex data types.
A sorting algorithm is an algorithm made-up of a series of instructions that takes an array as input, performs specified operations on the array, sometimes called a list, and outputs a sorted array. - brilliant.org
Why do we need to worry about sorting data and why has years of research and studies been invested on finding the fastest and efficient sorting algorithms? The teaching and understanding of sorting algorithms help introduce other key computer science concepts like Big-O notation, divide-and-conquer methods, and data structures such as binary trees, and heaps. Sorted data is very helpful for computational power of machines.
If we store a list in computer memory as unsorted and we want to search for an item in that list, we would have to do a linear search, starting from the first item in the list and keep scanning until we find the item we are looking for. In the case we didn’t find the item in the list (our worst case scenario) we would have searched through the whole list comparing each item to the item we are looking for. So, if there are n items in the list, in the worst case scenario, we will make n comparisons. Imagine the amount of data handled by the modern day computer, so if we take n = 2⁶⁴, and assuming it takes 1 millisecond to compute one comparison, the computational time for the comparison in a worse case scenario will equal 2⁶⁴ milliseconds. I’m not going to try and convert that to hours, days or even years, but we can all agree that it will take some years for that computation to be complete. Nobody wants to sit around for that long waiting for some computation to be over, as we all know, time is valuable. However, if we store our list as a sorted list in computer memory, then we can use a binary search to find our item. In this case, for our n size list it will take log₂n milliseconds for the computation to complete.
If we take n = 2⁶⁴;
log₂n => log₂2⁶⁴
=> 64 * 1 => 64
(Note: log₂2 is equal to 1).
Therefore, it will take 64 milliseconds to complete the comparison computation. Wow! What a difference, right?! With a sorted list we have been able to save huge amount of time and can move on to other important things.
The importance of sorting as a problem has led to the design of different sorting algorithms we have today. Some of these sorting algorithms include;
- Bubble sort
- Selection sort
- Insertion sort
- Merge sort
- Heap sort
- Counting sort
- Radix sort.
Sorting algorithms are often classified based on different parameters, some of which are listed below;
- Time Complexity: This is the rate of growth of time taken by an algorithm with respect to input size. Some algorithms are relatively faster than others. The running time of an algorithm describes the number of operations an algorithm must perform before it completes.
- Space Complexity or Memory Usage: This parameter further classifies sorting algorithm into In-Place sorting and Not-In-Place sorting. -- In-Place Sorting Algorithm: This sorting algorithm uses constant amount of extra memory to rearrange items in a list. It sorts the array by modifying the element order directly within the array. No extra data structure is used. Example of sorting algorithms with In-Place sorting are Bubble Sort, Selection Sort, Insertion Sort, Heap Sort. -- Not-In-Place Sorting Algorithm: This sorting algorithm uses extra data structure to sort the array therefore uses extra space or memory. Examples include Merge Sort which require an additional O(n) space to perform the merge operation, and Quick sort.
- Stability: A stable sorting algorithm in the case of equality of key, or property used to define the sorting, preserves the relative order of elements. Therefore, If there are identical items in the list to be sorted, this sorting algorithm ensures that these identical items are sorted in the same order they appear in the input. This is guaranteed by a stable sorting algorithm. Examples are Bubble Sort, Insertion Sort, Merge Sort, Counting Sort. In an unstable sort, the order of identical elements can not be guaranteed to stay the same in the output as they appear in the input. Examples include Quick Sort, Heap Sort, Selection Sort. Generally, an unstable sorting algorithm is faster than a stable sorting algorithm.
- Internal and External: When all the data to be sorted are in the main memory or RAM, then such a sort is internal sort. Mostly use for small dataset. An external sort is when the dataset are on an auxiliary storage like hard disk often because it’s not possible to have them all in memory at once. Usually used to handle huge amount of data at a time. Recursive or Non-Recursive: Some sorting algorithms are recursive, e.g. Quick Sort and Merge Sort (They both use the divide-and-conquer approach to sort an array) while some are non recursive like Insertion Sort and Selection Sort.