Tamerlan Gudabayev

Posted on Jan 31, 2021 • Edited on Oct 20, 2021

Data Structures and Algorithms: Introduction

#beginners #computerscience #algorithms #theory

I know you work hard.

You got 100 different side-projects.

Your GitHub is all green from those commits.

Then your day comes and finally, get that interview.

You watch a couple of videos on technical interviews.

Reality hits...

You don't know shit about data structures and algorithms.

Don't worry I was in your shoes a couple of days ago, and I failed...

But I don't want you to fail.

So in this new series, I'm gonna be your new data structures and algorithms teacher.

PS. Unfortunately, I don't work at google **yet, but I still know a thing or two about the subject.

Why Learn Data Structures
What are data structures?
- Abstract Data Types vs Data Structures
Computational Complexity Analysis
- Big-O Notation
- Big-O Notation Properties
Time Complexity
Conclusion
Additional References

Why learn data structures?

This is a very valid question because we live in an age where your gonna be using library functions for 90% of your code. It mainly comes down to four reasons:

To understand the logic behind the behavior of these abstractions. If you don't you will eventually apply them in a way that will hurt you.
Understanding algorithms will help you understand the problems those algorithms were built to overcome. That understanding will help you immensely in your day-to-day work because those problems are basic logic and computer science problems that pop up everywhere, no matter if you're working on business software, hardware drivers, games, or mobile apps.
You need to understand algorithms and data structures because I've seen code written by people who didn't; and trust me, you don't want to be that guy.
Most top product based companies ask questions relating to algorithms and data structures. You may think why are interviews so focused on DSA instead of language/framework/tools? Because these companies don't have your average problems, they deal with a lot of harder problems and at a much larger scale. So it's not about solving the problem, it's more about solving it right.

What are data structures?

Remember when you were a kid and your parent's scolded you because you couldn't find your clothes in your messy room. Turns out your parents were right in saying to keep your room tidy, so that next time you need to find anything it will be much easier. We as humans already know why is it important to organize(structure) our stuff(data) in real life.

Another great example is a library. Imagine that you need to find a book on Russian history, you will first go to the history section, then the Russian section. If these books weren't organized you wouldn't be able to find your book easily, it would've been a very frustrating process.

So in short data structures refers to the way we organize our data.

Let us consider some examples:

Facebook (Your favorite app). All your friends, family, mutual friends, and friends of friends can be represented by a graph. Think about it for a second, it makes total sense to represent friends on Facebook like that.
Imagine you have a deck of cards and you need to arrange it properly. You can either throw cards randomly or arrange the cards one over the over and form a proper deck. Here you can use a stack to make a proper arrangement of cards over the other.
If you want to find a word in the dictionary, how would you do it? Do you go page by page or open the middle and if the word is not found then you go prior/later page. We use the binary search here to efficiently find the word.

In the first two examples, we chose the right data structure to represent real-world data, while the third example chose the right algorithm to save us time and space.

Abstract Data Types vs Data Structures

An abstract data type is an abstraction of a data structure that provides only the interface to which a data structure must adhere to.

The interface does not give any specific details about how something should be implemented or in what programming language.

Simply put, an ADT (Abstract Data Type) is more of a logical description, while a Data Structure is concrete.

Think of an ADT as a picture of the data and the operations to manipulate and change it.

A Data Structure is the real, concrete thing. It can be implemented and used within an algorithm.

Computational Complexity Analysis

As programmers we keep asking these two questions:

How much time does this algorithm need to finish?
How much space does this algorithm need for its computation?

Basically, it would be bad if:

If your algorithm is small but takes the time of the universe to compute.
Your algorithm is fast but takes the power of all machines on earth to compute.

Big-O Notation

Big-O Notation gives the upper bound of the complexity in the worst-case scenario, helping to quantify performance as input size becomes arbitrarily larger.

Basically, it shows you the worst-case scenario of your algorithm. For example, you may have a sorting algorithm. Then the big-o notation will get the complexity based on a huge list.

Another example may be, you got an algorithm to find the number 7. Big-O will get the computation but it will assume that the number 7 is at the end of the list.

Big-O doesn't care about small input, it only cares about large inputs.

Big-O Notation Table (from best to worst) (n = input size)

Big-O Notation Properties

The first two properties explain why we can simply remove constant values (c), because if you're adding a constant to infinity that just equals infinity. Same thing with multiplication.

But this is just theoretical, in the real world if your constant is 2 billion, you can't ignore that.

The second half basically introduces us a mathematical function and show's us how to get the Big-O Notation. The basic premise is to simply take the biggest n, in our case it is n^3.

Time Complexity

Now we will be breaking down the basics of time complexity (constant, linear, quadratic, exponential, and logarithmic). There are others but once you get the main ones, the others make sense.

O(1) — Constant

Constant time complexity describes an algorithm that will always take the same time (or space) regardless of the input size. For example in Javascript, this can be as simple as accessing a specific index within an array:

var array = [1, 2, 3, 4, 5];
array[0] // This is a constant time look-up

var numberOfElementsToRemove = 2;while (numberOfElementsToRemove > 0) {
    array.pop();
   //This will also have a constant time look-up since the function 
   //is only looking at a specific reference point within the array.
  }
};

It doesn't matter if the input is 5 or 5 million it will still take the same amount of time to compute.

O(n) — Linear Time

Linear time describes an algorithm that it's computation power is proportional to the input size. For, example if you want to print out all elements in an array, logically speaking the larger the array the longer it will take to compute.

var numberRange = function(array) {
  for (var i = 1; i < array.length; i++) {
    console.log(array[i]);
  }
};

O(log n) — Logarithmic Time

Logarithmic time describes an algorithm that runs proportionally to the logarithm of its input. I know it's a bit confusing at first, but let me give you an example. You want to find some name in a phone book, you obviously wouldn't need to check all the names one by one. You would simply divide-and-conquer by looking based on where their name is alphabetically. (This is a binary search and that's why its time complexity is O(log n)

This time complexity has two properties:

The choice of the next element on which to perform some action is one of several possibilities.
Only one will need to be chosen.

for (var i = 1; i < n; i = i * 2){
    console.log("Hey - I'm busy looking at: " + i);
}

If n is 8, the output will be the following:

Hey - I'm busy looking at: 1
Hey - I'm busy looking at: 2
Hey - I'm busy looking at: 4

Our simple algorithm ran log(8) = 3 times.

To find more examples simply look at algorithms that use divide-and-conquer, like binary search, merge sort, quick sort, etc...

In general, this algorithm is better than O(n), it makes sense because log(n) is always gonna be smaller than n, much smaller.

O(n^2) — Quadratic Time

Quadratic time describes an algorithm that runs proportionally to the squared of the input size. Usually, this is absolutely horrible, but to be honest, if you don't have a huge input it doesn't really matter.

function quadraticSum(num){
    for (i = 0; i <= num; i++){
               for (j = 0; j < num; j++){
                   sum += j;
               }
           }
}

In this example, we are looking up the array twice for every n basically n^2. Examples I think can be found in algorithms where you look up twice, like bubble sort.

O(2^n) — Exponential Time

Exponential Time complexity denotes an algorithm whose growth doubles with each addition to the input data set. If you know of other exponential growth patterns, this works in much the same way. The time complexity starts off very shallow, rising at an ever-increasing rate until the end. Algorithms with exponential time are often recursive algorithms that solve a problem of size n by recursively solve a sub-problem of size n-1.

var fibonacci = function(number) {
  if (number <= 1) return number;
  return Fibonacci(number - 2) + Fibonacci(number - 1);
}; //This will have an exponential time look-up since the function 
   //is looking at a every index an exponential number of times.

Fibonacci numbers are a great way to practice your understanding of recursion. Although there is a way to make the Fibonacci function have a shorter time complexity, for this case we will be using the double recursive method to show how it has an Exponential Time Complexity. Here we are solving the main problem of size n by recursively calling the same function with the size of n-1 and n-2.

Conclusion

I know we haven't talked about any specific data structure or some algorithm, but I believe it's necessary to talk about the foundations first. Anyway, I hope you learned something today, in the near future we will start breaking down specific data structures in their own post.

PS. Feel free to leave any questions down below, I would love to answer them 😃

Additional References

Timescale – the developer's data platform for modern apps, built on PostgreSQL

Timescale Cloud is PostgreSQL optimized for speed, scale, and performance. Over 3 million IoT, AI, crypto, and dev tool apps are powered by Timescale. Try it free today! No credit card required.

Try free