loading...
Cover image for Part 2: Sorting Algorithm

Part 2: Sorting Algorithm

yogeswaran79 profile image Yogeswaran ・8 min read

Merge Sort

Merge sort is divided and conquer algorithm. It divides input array into two halves, calls itself for the two halves and then merges the two sorted halves. The merge() function is used for merging two halves. The merge(arr,l,m,r) is a key process that assumes that arr[l..m] and arr[m+l..r] are sorted and merges the two sorted sub-arrays into one.

def mergeSort(arr):
    if len(arr) > 1:
        mid = len(arr)
        L = arr[:mid]
        R = arr[mid:]
        mergeSort(L)
        mergeSort(R)
        i = j = k = 0
        while i < len(L) and j < len(R):
            if L[i] < R[j]:
                arr[k] = L[i]
                i += 1
            else:
                arr[k] = R[j]
                j += 1
            k += 1
        while i < len(L):
            arr[k] = L[i]
            i += 1
            k += 1
        while i < len(R):
            arr[k] = R[j]
            j += 1
            k += 1

def printList(arr):
    for i in range(len(arr)):
        print(arr[i], end=" ")
    print()

if __name__ == '__main__':
    arr = [12, 11, 13, 5, 6, 7]
    print("Given array is", end="\n")
    printList(arr)
    mergeSort(arr)
    print("Sorted array is", end="\n")
    printList(arr)


# Output
Given array is
12 11 13 5 6 7
Sorted array is
5 6 7 11 12 13

The below function is recursive, hence it uses function call stack to store intermediate values of l and h. The function call stack stores other bookkeeping information together with parameters. Also, function calls involve overheads like staring activation record of the caller function and then resuming execution.

def merge(left, right):
    if not len(left) or not len(right):
        return left or right
    result = []
    i,j = 0,0
    while (len(result) < len(left) + len(right)):
        if left[i] < right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
        if i == len(left) or j == len(right):
            result.extend(left[i:] or right[j:])
            break
    return result

def mergesort(list):
    if len(list) < 2:
        return list
    middle = len(list)/2
    left = mergesort(list[:middle])
    right = mergesort(list[middle:])
    return merge(left, right)
seq = (12, 11, 13, 5, 6, 7)
print("Given array is")
print(seq)
print("\n")
print("Sorted array is")
print(mergesort(seq))


# Output
Given array is
12 11 13 5 6 7
Sorted array is
5 6 7 11 12 13

Unlike Iterative Quick Sort, the interactive Merge Sort doesn't require an explicit auxiliary stack.

def mergeSort(a):
    current_size = 1
    while current_size < len(a) - 1:
        left = 0
        while left < len(a) - 1:
            mid = left + current_size - 1
            right = ((2 * current_size + left - 1,
                    len(a) - 1) [2 * current_size + left - 1 > len(a) - 1])
        merge(a, left, mid, right)
        left = left + current_size * 2
    current_size = 2 * current_size

def merge(a, l, m, r):
    n1 = m - l + 1
    n2 = r - m
    L = [0] * n1
    R = [0] * n2
    for i in range(0, n1):
        L[i] = a[l + i]
    for i in range(0, n2):
        R[i] = a[m + i + 1]
    i, j, k = 0, 0, l
    while i < n1 and j < n2:
        if L[i] > R[j]:
            a[k] = R[j]
            j += 1
        else:
            a[k] = L[i]
            i += 1
        k += 1
    while i < n1:
        a[k] = L[i]
        i += 1
        k += 1
    while j < n2:
        a[k] = R[j]
        j += 1
        k += 1

a = [12, 11, 13, 5, 6, 7]
print("Given array is")
print(a)
mergeSort(a)
print("Sorted array is")
print(a)


# Output
Given array is 
12 11 13 5 6 7
Sorted array is
5 6 7 11 12 13

HeapSort

Heapsort is a comparison based sorting technique based on a Binary Heap data structure. It's similar to selection sort where we first find the maximum element and place the maximum element at the end.

What is Binary Heap?
A binary heap is a heap data structure that takes the form of a binary tree. Binary heaps are a common way of implementing priority queues.

A binary heap is defined as a binary tree with two additional constraints:

  1. Shape property: a binary heap is a complete binary tree; that is, all levels of the tree, except possibly the last one (deepest) are fully filled, and, if the last level of the tree is not complete, the nodes of that level are filled from left to right.
  2. Heap property: the key stored in each node is either greater than or equal to (≥) or less than or equal to (≤) the keys in the node's children, according to some total order.

Heaps, where the parent key is greater than or equal to (≥) the child keys, are called max-heaps; those where it is less than or equal to (≤) are called min-heaps. (Read more on Wikipedia)

Why array-based representation for Binary Heap?
Since a Binary Heap is a complete binary tree, it can be easily represented as an array and array-based representation is space efficient. If the parent node is stored at index 1, the left child can be calculated by 2*I+1 and right child by 2*I*2.

Heap Sort algorithm for sorting in increasing order:

  1. Build a max heap from the input data.
  2. The largest item is stored at the root of the heap. Replace it with the last item of the heap followed by reducing the size of a heap by 1. After that, heapify the root of the tree.
  3. Repeat the above steps while the size of a heap is greater than 1.
def heapify(arr,n,i):
    largest = i
    l = 2*i+1
    r = 2*i+2
    if l < n and arr[i] < arr[l]:
        largest = l
    if r < n and arr[largest] < arr[r]:
        largest = r
    if largest != i:
        arr[i],arr[largest] = arr[largest],arr[i]
        heapify(arr,n,largest)

def heapSort(arr):
    n = len(arr)
    for i in range(n,-1,-1):
        heapify(arr,n,i)
    for i in range(n-1,0,-1):
        arr[i], arr[0] = arr[0], arr[i]
        heapify(arr,i,0)

arr = [12, 11, 13, 5, 6, 7]
heapSort(arr)
n = len(arr)
print("Sorted array is")
for i in range(n):
    print("%d" %arr[i])


# Output
Sorted array is
5 6 7 11 12 13

Counting Sort

Counting sort is a sorting technique based on keys between a specific range. It works by counting the number of objects having distinct key values. Then doing some arthmetic to calculate the position of each object in the output sequence.

def countSort(arr):
    output = [0 for i in range(256)]
    count = [0 for i in range(256)]
    ans = ["" for _ in arr]
    for i in arr:
        count[ord(i)] += 1
    for i in range(256):
        count[i] += count[i-1]
    for i in range(len(arr)):
        output[count[ord(arr[i])] -1] = arr[i]
        count[ord(arr[i])] -= 1
    for i in range(len(arr)):
        ans[i] = output[i]
    return ans

arr = "Yogeswaran"
ans = countSort(arr)
print("Sorted character array is %s" %("".join(ans)))


# Output
Sorted character array is Yaaegnorsw

Radix Sort

Radix sort is a non-comparative integer sorting algorithm that sorts data with integer keys by grouping keys by the individual digits which share the same significant position and value. A positional notation is required, but because integers can represent strings of characters (e.g., names or dates) and specially formatted floating point numbers, radix sort is not limited to integers.(Source Wikipedia)

You can check the example of Radix Sort Algorithm here

Bucket Sort

Bucket sort is mainly useful when input is uniformly distributed over a range.
Example of Bucket Sort

Picture from SlideShare.net
def insertionSort(b):
    for i in range(1, len(b)):
        up = b[i]
        j = i - 1
        while j >= 0 and b[j] > up:
            b[j+1] = b[j]
            j -= 1
        b[j+1] = up
    return b

def bucketSort(x):
    arr = []
    slot_num = 10
    for i in range(slot_num):
        arr.append([])
    for j in x:
        index_b = int(slot_num * j)
        arr[i] = insertionSort(arr[i])
    k = 0
    for i in range(slot_num):
        for j in range(len(arr[i])):
            x[k] = arr[i][j]
            k += 1
    return x

x = [0.897, 0.565, 0.656,
    0.1234, 0.665, 0.3434]
print("Sorted array is")
print(bucketSort(x))


# Output
Sorted array is
0.1234 0.3434 0.565 0.656 0.665 0.897

ShellSort

Shellsort, also known as Shell sort or Shell's method, is an in-place comparison sort. It can be seen as either a generalization of sorting by exchange (bubble sort) or sorting by insertion (insertion sort). The method starts by sorting pairs of elements far apart from each other, then progressively reducing the gap between elements to be compared. Starting with far apart elements, it can move some out-of-place elements into position faster than a simple nearest neighbor exchange. (Source: Wikipedia)

def shellSort(arr):
    n = len(arr)
    gap = n
    while gap > 0:
        for i in range(gap,n):
            temp = arr[i]
            j = i
            while j >= gap and arr[j-gap] > temp:
                arr[j] = arr[j-gap]
                j -= gap
            arr[j] = temp
        gap

arr = [12, 34, 54, 2, 3]
n = len(arr)
print("Array before sorting")
for i in range(n):
    print(arr[i])
shellSort(arr)
print("\nArray after sorting")
for i in range(n):
    print(arr[i])


# Output
Array before sorting
12 34 54 2 3
Array after sorting
2 3 12 34 54

TimSort

Timsort is a hybrid stable sorting algorithm, derived from merge sort and insertion sort, designed to perform well on many kinds of real-world data.

  1. A stable sorting algorithm works in O(nLogn) time.
  2. USed in Java's Arrays.sort() as well as Python's sorted() and sort().
  3. First sort small pieces using insertion sort, then merges the pieces using merge of merge sort.
RUN = 32

def insertionSort(arr, left, right): 
    for i in range(left + 1, right+1): 
        temp = arr[i] 
    j = i - 1
    while arr[j] > temp and j >= left: 
        arr[j+1] = arr[j] 
        j -= 1
    arr[j+1] = temp 

def merge(arr, l, m, r): 
    len1, len2 = m - l + 1, r - m 
    left, right = [], [] 
    for i in range(0, len1): 
        left.append(arr[l + i]) 
    for i in range(0, len2): 
        right.append(arr[m + 1 + i]) 
    i, j, k = 0, 0, l 
    while i < len1 and j < len2: 
        if left[i] <= right[j]: 
            arr[k] = left[i] 
        i += 1
    else: 
        arr[k] = right[j] 
        j += 1
        k += 1 
    while i < len1: 
        arr[k] = left[i] 
        k += 1
        i += 1
    while j < len2: 
        arr[k] = right[j] 
        k += 1
        j += 1
def timSort(arr, n): 
    for i in range(0, n, RUN): 
        insertionSort(arr, i, min((i+31), (n-1))) 
    size = RUN 
    while size < n: 
        for left in range(0, n, 2*size): 
        mid = left + size - 1
        right = min((left + 2*size - 1), (n-1)) 
        merge(arr, left, mid, right) 
        size = 2*size 

def printArray(arr, n): 
    for i in range(0, n): 
        print(arr[i], end = " ") 
    print() 

if __name__ == "__main__": 
    arr = [5, 21, 7, 23, 19] 
    n = len(arr) 
    print("Given array is") 
    printArray(arr, n) 
    timSort(arr, n) 
    print("After sorting array is") 
    printArray(arr, n)


# Output
Given array is
5 21 7 23 19
After sorting array is
5 7 19 21 23

Join my Community

Posted on by:

Discussion

pic
Editor guide
 

Tim sort is quick if it's already sorted

 

Faster than "TimSort" and challenged "IntroSort".
Please read if you are interested.