Data Structure and Algorithms 102: Deep Dive into Data Structure and Algorithms

Algorithmic knowledge combined with a good understanding of how data structures are implemented is a perfect blend for anyone who aspires to work in the IT sector. In this article I take a deeper dive into data structures and algorithms. There are many algorithm design paradigms; the most common ones are listed below, with a quick example of the greedy approach after the list:

  1. Greedy Algorithm
  2. Dynamic Programming
  3. Divide & Conquer Algorithm
  4. Branch and Bound Algorithm
  5. Backtracking
  6. Brute-Force Algorithm
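
As a taste of the first paradigm, here is a minimal greedy sketch (the function name and coin denominations are my own illustration): at every step it takes the largest coin that still fits, which happens to produce an optimal answer for these particular denominations.

```python
def greedy_coin_change(amount, denominations=(25, 10, 5, 1)):
    """Greedy coin change: always take the largest coin that fits."""
    coins = []
    for coin in denominations:   # denominations in descending order
        while amount >= coin:
            amount -= coin
            coins.append(coin)
    return coins

print(greedy_coin_change(63))  # [25, 25, 10, 1, 1, 1]
```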


There are also other classes of algorithms, such as searching algorithms, sorting algorithms and domain-specific ML algorithms.

**Searching Algorithms:**
Linear Search — O(n)
Binary Search — O(log n)
**Sorting Algorithms:**
Selection Sort
Insertion Sort
Bubble Sort
Merge Sort
Quick Sort
Heap Sort
Radix Sort
Bucket Sort

Asymptotic Notations
Asymptotic notations are the mathematical notations used to describe the running time of an algorithm when the input tends towards a particular value or a limiting value.

For example, in bubble sort, when the input array is already sorted, the time taken by the algorithm is linear, i.e. the best case.

But when the input array is in reverse order, the algorithm takes the maximum (quadratic) time to sort the elements, i.e. the worst case.

When the input array is neither sorted nor in reverse order, it takes average time. These durations are denoted using asymptotic notations.
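
To make the bubble sort example concrete, here is a minimal sketch with an early-exit flag (my own illustration): on an already-sorted array the first pass makes no swaps and the function stops after O(n) work, while a reverse-sorted array forces all O(n²) comparisons.

```python
def bubble_sort(data):
    """Bubble sort with an early-exit flag."""
    n = len(data)
    for i in range(n - 1):
        swapped = False
        for j in range(n - 1 - i):
            if data[j] > data[j + 1]:
                data[j], data[j + 1] = data[j + 1], data[j]
                swapped = True
        if not swapped:   # no swaps in a full pass: already sorted
            break
    return data

print(bubble_sort([5, 1, 4, 2, 8]))  # [1, 2, 4, 5, 8]
```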

There are mainly **three** asymptotic notations:

  1. Big Theta (Θ)
  2. Big Oh (O)
  3. Big Omega (Ω)

Big-O Notation (O-notation)

Big-O notation represents the upper bound of the running time of an algorithm. Thus, it gives the worst-case complexity of an algorithm.

It tells us that a certain function will never exceed a specified time for any value of input.

Since it gives the worst-case running time of an algorithm, it is widely used to analyze an algorithm as we are always interested in the worst-case scenario.
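
Formally (this is the standard definition):

```latex
O(g(n)) = \{\, f(n) : \exists\, c > 0,\ n_0 > 0 \text{ such that } 0 \le f(n) \le c \cdot g(n) \text{ for all } n \ge n_0 \,\}
```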

Omega Notation (Ω-notation)
Omega notation represents the lower bound of the running time of an algorithm. Thus, it provides the best-case complexity of an algorithm.

It indicates the minimum time required by an algorithm for all input values: when we express the time complexity of an algorithm in big-Ω form, we mean that the algorithm will take at least this much time to complete its execution.
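
Formally, mirroring the Big-O definition above:

```latex
\Omega(g(n)) = \{\, f(n) : \exists\, c > 0,\ n_0 > 0 \text{ such that } 0 \le c \cdot g(n) \le f(n) \text{ for all } n \ge n_0 \,\}
```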


Theta Notation (Θ-notation)
Theta notation encloses the function from above and below. Since it represents the upper and the lower bound of the running time of an algorithm, it is used for analyzing the average-case complexity of an algorithm.
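
Formally:

```latex
\Theta(g(n)) = \{\, f(n) : \exists\, c_1, c_2 > 0,\ n_0 > 0 \text{ such that } 0 \le c_1 \cdot g(n) \le f(n) \le c_2 \cdot g(n) \text{ for all } n \ge n_0 \,\}
```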


**Space complexity** is the total amount of memory used by an algorithm (including the input values) to execute and produce the result. Auxiliary space is sometimes confused with space complexity, but auxiliary space is only the extra or temporary space used by the algorithm during its execution.

Space Complexity = Auxiliary Space + Input space
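
A quick way to see the difference (both functions below are my own minimal examples): reversing a list in place needs only O(1) auxiliary space, while building a reversed copy needs O(n) auxiliary space, even though both have O(n) space complexity once the input itself is counted.

```python
def reverse_in_place(data):
    # O(1) auxiliary space: only two index variables besides the input
    left, right = 0, len(data) - 1
    while left < right:
        data[left], data[right] = data[right], data[left]
        left += 1
        right -= 1
    return data

def reversed_copy(data):
    # O(n) auxiliary space: allocates a second list the size of the input
    return data[::-1]
```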

Memory Usage while Execution
While executing, an algorithm uses memory space for three reasons:

  1. Instruction Space: the amount of memory used to store the compiled version of the instructions.

  2. Environment Stack: sometimes an algorithm (function) may be called inside another algorithm (function). In such a situation, the current variables are pushed onto the system stack, where they wait for further execution, and then the call to the inner algorithm (function) is made.

  3. Data Space: the amount of space used by the variables and constants.

When calculating the space complexity of an algorithm, we usually consider only the data space and neglect the instruction space and the environment stack.

Calculating the Space Complexity
For calculating the space complexity, we need to know the value of memory used by different type of datatype variables, which generally varies for different operating systems, but the method for calculating the space complexity remains the same.

| Data types | Size |
| --- | --- |
| bool, char, unsigned char, signed char, __int8 | 1 byte |
| __int16, short, unsigned short, wchar_t, __wchar_t | 2 bytes |
| float, __int32, int, unsigned int, long, unsigned long | 4 bytes |
| double, __int64, long double, long long | 8 bytes |

For example, consider the following function (the signature is my own completion, since the original snippet showed only the body):

```c
int add_three(int a, int b, int c)
{
    int z = a + b + c;   // a, b, c and z are all ints
    return z;
}
```

a, b, c and z are all integer types, so they take up 4 bytes each; the total memory requirement is (4(4) + 4) = 20 bytes, where the additional 4 bytes are for the return value. Because this space requirement is fixed regardless of the input, it is called Constant Space Complexity.

```c
int sum(int a[], int n)
{
    int x = 0;                   // 4 bytes for x
    for (int i = 0; i < n; i++)  // 4 bytes for i
    {
        x = x + a[i];
    }
    return x;
}
```

In the above code, 4*n bytes of space are required for the elements of the array a[], plus 4 bytes each for x, n and i.
Hence the total memory requirement is (4n + 12), which grows linearly with the input value n, so this is called Linear Space Complexity.

Time Complexity of Algorithms

Time complexity of an algorithm represents the amount of time required by the algorithm to run to completion. Time requirements can be defined as a numerical function T(n), where T(n) can be measured as the number of steps, provided each step consumes constant time.
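
For instance, one reasonable step count for the sum function above (charging one step per statement or loop test; the exact accounting is my own, but any constant-cost accounting gives the same order) is:

```latex
T(n) = \underbrace{1}_{x = 0} + \underbrace{(n+1)}_{\text{loop tests}} + \underbrace{n}_{\text{additions}} + \underbrace{1}_{\text{return}} = 2n + 3 = O(n)
```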

Examples of Calculating Time Complexities

Constant Time — O(1)
An algorithm is said to have constant time when it does not depend on the input data (n). No matter the size of the input data, the running time will always be the same. For example (wrapped in a function here so it runs):

```python
def greater(a, b):
    if a > b:
        return True
    else:
        return False
```

Now, let’s take a look at the function get_first, which returns the first element of a list:

```python
def get_first(data):
    return data[0]

if __name__ == '__main__':
    data = [1, 2, 9, 8, 3, 4, 7, 6, 5]
    print(get_first(data))
```

Independently of the input data size, it will always have the same running time since it only gets the first value from the list.

An algorithm with constant time complexity is excellent since we don’t need to worry about the input size.

Logarithmic Time — O(log n)
An algorithm is said to have logarithmic time complexity when it reduces the size of the input data in each step (it doesn't need to look at all values of the input data). For example, a loop that halves the remaining input on every iteration performs O(log n) steps:

```python
def halve_until_one(data):
    n = len(data)
    steps = 0
    while n > 1:
        n = n // 2   # the remaining input is halved on each step
        steps += 1
    return steps     # roughly log2(len(data)) iterations
```

Algorithms with logarithmic time complexity are commonly found in operations on binary trees or when using binary search. Let’s take a look at an example of binary search, where we need to find the position of an element in a sorted list:

```python
def binary_search(data, value):
    n = len(data)
    left = 0
    right = n - 1
    while left <= right:
        middle = (left + right) // 2
        if value < data[middle]:
            right = middle - 1
        elif value > data[middle]:
            left = middle + 1
        else:
            return middle
    raise ValueError('Value is not in the list')

if __name__ == '__main__':
    data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
    print(binary_search(data, 8))
```

Steps of the binary search:

  • Calculate the middle of the list.
  • If the searched value is lower than the value in the middle of the list, set a new right bound.
  • If the searched value is higher than the value in the middle of the list, set a new left bound.
  • If the searched value is equal to the value in the middle of the list, return the middle (the index).
  • Repeat the steps above until the value is found or the left bound grows higher than the right bound.

It is important to understand that an algorithm that must access all elements of its input data cannot take logarithmic time, as just reading an input of size n already takes time on the order of n.
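
For instance, tracing binary_search(data, 8) on the list above: the first middle is index 4 (value 5), so the search moves right; the next middle is index 6 (value 7), so it moves right again; the final middle is index 7 (value 8), which is returned. That is three iterations for nine elements, in line with log₂ 9 ≈ 3.2.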

Other time complexities include:

  • Constant Time — O(1)
  • Logarithmic Time — O(log n)
  • Linear Time — O(n)
  • Quasilinear Time — O(n log n)
  • Quadratic Time — O(n²)
  • Exponential Time — O(2^n)
  • Factorial Time — O(n!)
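
As a final illustration, nested loops over the same input are the classic quadratic-time pattern (a minimal sketch of my own; the function name is hypothetical):

```python
def contains_duplicate(data):
    # Quadratic time O(n^2): compares every pair of elements
    for i in range(len(data)):
        for j in range(i + 1, len(data)):
            if data[i] == data[j]:
                return True
    return False

print(contains_duplicate([1, 2, 9, 8, 3, 4, 7, 6, 5]))  # False
```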
