
Samriddha Chattopadhyay

Where art thou data ?

Today, data is one of the most important things in our lives, second only to life itself.

And this is true even in programming.

The way you store your data makes all the difference in how your program performs and how simple your code stays.

Data structures are abstractions that let you store data in different formats, each suited to a different way of using it.

Sometimes, you want to make a shopping list where things come in order, whereas sometimes you just want to put things into a bag and get them all at once without going through each of them. Sometimes, you want your things to be sorted in a specific way without having to do it yourself. All of these can be achieved with the right data structure, applied in the right way.

However, there is one thing that every newbie programmer wonders: what is the one data structure that works everywhere? Or, to put it in simpler terms, which is the best data structure?

You might’ve seen or heard about something called a Hash-Map and how it can be used to solve any problem. So is it the best data structure, the one we can use everywhere?

Let’s see if it is that way or if there is something that is even better.

Arrays

Arrays are the very first data structure that anyone learns when they start to code. Put simply, arrays are contiguous chunks of indexed data in memory. Think of an array as a shopping list where you have numbered everything and you go through the items one at a time.

Arrays are conceptually similar; the only difference is that their indexing starts from 0 (well, that is how the programming community works lol).

So consider the following example:

[1, 2, 3, 4, 5, 6, 7, 8, 9, 0]

Here, the numbers are indexed as follows:

0 -> 1
1 -> 2
2 -> 3
3 -> 4
4 -> 5
5 -> 6
6 -> 7
7 -> 8
8 -> 9
9 -> 0

Here the indexing starts from 0 and goes up to 9 as there are 10 elements in this array.

Also, it is stored in the memory in a similar fashion.

If the memory address of the first element is 2000 and an integer takes 4 bytes (as in typical C++), then the memory addresses of the 3rd, 5th and last elements are 2008, 2016 and 2036 respectively, since the element at index i sits at address 2000 + 4i.
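This address arithmetic can be sketched in a few lines of Python; the base address 2000 and the 4-byte integer size are just the example values from above, not real addresses:

```python
# Elements are contiguous, so: address = base + index * size.
def element_address(base_address, index, element_size=4):
    return base_address + index * element_size

print(element_address(2000, 2))  # 3rd element -> 2008
print(element_address(2000, 4))  # 5th element -> 2016
print(element_address(2000, 9))  # last element -> 2036
```

This is exactly why indexed access into an array is O(1): the address of any element is one multiplication and one addition away.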

So arrays can stand in for that shopping list of yours pretty easily. However, there is an inherent problem. Let’s say I ask you whether 10 is present in this array. You need to go through every element before telling me that it is not. Hence, we say that the searching time for an array is O(n), where n is the length of the array. Following are some time complexities related to arrays:

  1. Adding element (at the end): O(1)
  2. Deleting element (from the end): O(1); deleting from an arbitrary position is O(n), since later elements must shift over
  3. Searching element: O(n) (can be brought down to O(log n) with binary search, if the array is sorted)
  4. Finding maximum: O(n)
  5. Finding minimum: O(n)
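To see the difference between the two kinds of search, here is a small Python sketch using the array from above (`bisect_left` is Python's standard binary search helper, which requires a sorted list):

```python
from bisect import bisect_left

arr = [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]

# Linear search: O(n) -- in the worst case every element is examined.
print(10 in arr)  # False

# Binary search: O(log n), but only on a *sorted* array.
sorted_arr = sorted(arr)
pos = bisect_left(sorted_arr, 10)
found = pos < len(sorted_arr) and sorted_arr[pos] == 10
print(found)  # False
```

Note the trade-off: sorting first costs O(n log n), so binary search only pays off when you search the same array many times.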

Stacks

Have you ever seen plates piling up as they are washed? That is exactly what a stack is.

In a stack, elements are pushed at the top and popped out from the top as well.

That means, you can only see the top most element of the stack and nothing else. Think of this like a pile of coins placed symmetrically. You can see the top coin and only if you remove it, can you get to see the coin below it. A stack is exactly similar to that.

Following are some time complexities related to stacks:

  1. Adding element: O(1) (only at the top)
  2. Deleting element: O(1) (only from the top)
  3. Searching element: Not possible (without popping elements off)
  4. Finding maximum: Not possible generally
  5. Finding minimum: Not possible generally
  6. Peek top most element: O(1)

Maximum and minimum elements can be found in some cases, if the stack is kept properly arranged, i.e., if elements always stay sorted in one direction. This can be done with monotonic stacks; however, the add operation in these can take up to O(n) time.

Queues

Queues are basically similar to the ones we see in day-to-day life (pun intended).

Really, a queue works exactly the same as a ticket counter line (also called a queue). Elements are pushed in at the back and popped from the front. Hence, you cannot really search in a queue; however, adding and deleting can be done pretty fast.

A queue is exactly similar to that.

Following are some time complexities related to queues:

  1. Adding element: O(1) (only at the back)
  2. Deleting element: O(1) (only from the front)
  3. Searching element: Not possible
  4. Finding maximum: Not possible
  5. Finding minimum: Not possible
  6. Peek front element: O(1)
  7. Peek back element: O(1)

There are also double-ended queues, called deques, which can add and remove elements from either end.
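The ticket-counter behaviour can be sketched with Python's `collections.deque`, which gives O(1) operations at both ends (the names in the line are made-up example values):

```python
from collections import deque

queue = deque()
queue.append("you")          # enqueue at the back: O(1)
queue.append("next person")  # enqueue at the back: O(1)

front = queue[0]             # peek front: O(1) -> "you"
served = queue.popleft()     # dequeue from the front: O(1) -> "you"
print(front, served)

# Being double-ended, a deque also supports appendleft() and pop()
# for working at the opposite ends.
```

This is the first-in, first-out (FIFO) behaviour: whoever joined the line first is served first.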

Priority Queues or Heaps

A priority queue is similar to a queue, in that elements can be removed only from the front. However, they work based on a certain priority. Think of it this way:

You are standing at the back of a ticket counter line. Now, an old man comes up and stands behind you. You feel bad, so you let him go ahead of you (and so does everyone else in the line). So, he moves towards the front until another elderly person is already ahead of him. You are prioritising him over your time. This is exactly how a priority queue works.

While pushing to a priority queue, the elements are rearranged so as to maintain the priority order. So, essentially, elements are pushed at the back, but then things are rearranged; hence, pushing an element takes O(log n) time, where n is the number of elements in the queue.

There are generally two types of priority queues:

  1. Min Heaps - Minimum element at the top
  2. Max Heaps - Maximum element at the top

Following are some time complexities related to priority queues:

  1. Adding element: O(log n)
  2. Deleting element: O(log n) (removing the top element)
  3. Searching element: Not possible
  4. Finding maximum: O(1) (for a max heap)
  5. Finding minimum: O(1) (for a min heap)
  6. Peek top element: O(1)
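A minimal Python sketch using the standard `heapq` module, which implements a min heap (the ages are made-up example values):

```python
import heapq

# heapq keeps the *smallest* element at index 0 of the list.
heap = []
for age in [30, 72, 45]:
    heapq.heappush(heap, age)   # push: O(log n)

print(heap[0])                  # peek minimum: O(1) -> 30
smallest = heapq.heappop(heap)  # pop minimum: O(log n)
print(smallest)                 # 30

# A common trick for a max heap is to push negated values
# and negate them again when popping.
```

Note that after popping, the heap rearranges itself so the next smallest element (45) moves to the top.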

Hashmaps and Hashsets

Here comes the most popular data structure of all: Hashmap.

You might’ve heard this phrase from your seniors: “Whenever you are stuck, throw a hashmap at the problem.”

Well, that is not totally wrong. Hashmaps can greatly optimise the time complexity of a problem by using some extra space. Essentially, you are storing the data in an indexed format, where the index is not an integer but can be almost any data type (unlike arrays). These are named differently in different languages (std::unordered_map in C++, HashMap in Java, dict in Python, etc.), but essentially, they all do the same thing: hash data to something so that it can be accessed easily afterwards.

There are a lot of ways to hash data, and we won’t be discussing those as the topic is huge in itself. If you are interested to know more about how data is hashed in a hashmap, do put it in the comments, and I’d be more than happy to write up about the same.

Hashsets work in the same way, except that they store keys only instead of key-value pairs.

Following are some time complexities related to hashmaps and hashsets:

  1. Adding element: O(1) (on average)
  2. Deleting element: O(1) (on average)
  3. Searching element: O(1) (on average)
  4. Finding maximum: Not possible generally
  5. Finding minimum: Not possible generally

Finding maximum and minimum is not possible in constant time, as the elements of hashmaps and hashsets are not stored in any sorted order; you would have to iterate over all the entries, which takes O(n).

However, some languages (like C++) also offer ordered variants (std::map and std::set, typically implemented as balanced binary search trees) where the minimum and maximum can be found in O(log n).
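A minimal Python sketch using `dict` and `set`, Python's built-in hash map and hash set (the shopping items are made-up example values):

```python
# dict: a hash map of key -> value; operations are O(1) on average.
stock = {"apples": 3, "pears": 5}
stock["mangoes"] = 2           # add
has_pears = "pears" in stock   # search by key -> True
del stock["apples"]            # delete
print(has_pears, stock)

# set: a hash set, keys only with no values.
seen = {1, 2, 3}
print(2 in seen)               # True
```

Notice that the lookup key is a string, not an integer index; this is exactly what makes hashmaps so flexible compared to arrays.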

These are some of the basic data structures that programmers use in day-to-day work. Thanks for reading till the end, and stay tuned for more amazing content like this one!
