Eric Magesho

Posted on Feb 21, 2022

INTRODUCTION TO DATA STRUCTURES AND ALGORITHMS IN PYTHON

#webdev #python #100daysofcode #beginners

Data structures and algorithms are one of the most common and tricky interview questions in most software development roles. Most beginner developers, especially those who haven't been in a computer science class, always fail terribly. In this tutorial, we will discuss this very important topic at a high level and try as much as possible to make it beginner-friendly. Roll your sleeves up and let's dive right in.

DATA STRUCTURES

Let's assume you are building a house. You will need to have the raw materials like wood, cement, blocks, metal etc and apply the right building procedures for you to end up with a complete house. The same happens when building software, You will need the building blocks and specific instructions(or algorithms) to come up with a better end product. Data structures are the building blocks of software. They are used to organize, store and manage data for efficient access and manipulation. To be a better programmer you have to have a sound knowledge of data structures and algorithms so that you use the right tool for a specific problem. This will enable you to save time, write better code more efficiently.

There are two major categories of data structures in Python:

Built-in data structures eg

Lists
Dictionaries
Tuples
Sets.

2.User-defined data structures eg
Linked-list

Queues
Stack
Graphs
Heaps.
We shall discuss the most common data structures and how they are implemented in python.
But before we discuss each of these, let's talk briefly about a very important but related subject:

Big O notation

In its simplest term, The Big O notation is used to measure how running time or space requirements for your program grow as the input size grows.
In your projects when your solution requires a particular algorithm, it is important to understand how fast or slow it is compared to other options.
consider below function that takes a list of integers and returns the sum:

def find_total(items):
    sum = 0
    for item in items:
        sum += item
    return sum

When you call the function with a list containing 4 items, the run time is far much less than when you call in with a list containing a billion items. Irrespective of whether you are running the function on a faster machine or not, the runtime can be represented as a linear function like so:
time = an + b
When calculating the Big O, the rule of the thumb is to keep the fastest growing term and drop any constants. Hence
time = O(n)
It's important to note that Big O establishes the worst-case run-time.
So when choosing what data structure and algorithm to use in your implementations you need to consider the runtime implications.Use the right tool for a specific problem.
You can read more about Big O here:[(https://www.freecodecamp.org/news/big-o-notation-explained-with-examples/)]

Lists

In python, the list data structure is implemented as a dynamic array. It's a general-purpose and mostly used data structure. You can create a list using either of the three methods below:

list() - Using the constructor.
my_list = [] - literal.
[item for item in range(10)] - using list comprehension.

You can access items of a list by using the [] bracket notation eg my_list[0].
Looking up an item by index takes O(1), and looking by value takes O(n).
check out this tutorial for a comprehensive discussion of Python lists [(https://www.geeksforgeeks.org/python-list/)]

Dictionaries

A Dictionary is a collection of unordered data values consisting of Key: Value pairs. The keys in a dictionary are immutable.
You can create a dictionary in python by placing a sequence of keys and values separated by commas in curly braces {}.You can also create a dictionary by using the built-in dict() method.

# Creating an empty Dictionary
My_dict = {}
output
{}

# Creating a Dictionary with dict() method
my_dict = dict({1: 'apples', 2: 'oranges', 3:'Mangoes'})
output
{1:'apples',2:'oranges',3:'Mangoes'}

You can use the [] notation to access an element in a dictionary using a key eg my_dict[1].There is also a get() method you can use to access an element in a dictionary.
You can traverse the dictionary using a for-loop like below:

#using the keys
for key in my_dict.keys():
    print(key)
#using the values
for value in my_dict.values():
    print(value)
#Using both key,value pairs
for key,value in my_dict.items():
    print(key.value)

For an extensive discussion of python dictionary please check out this link [(https://www.geeksforgeeks.org/python-dictionary/)]

Tuples

A tuple is similar to a list as both are of sequence type, but it's immutable. It's useful for fixed data and is faster than a list.
You can create an empty tuple as below:
empty_tuple = ().
All other operations of the sequence type in python apply to tuples.

Sets

A set is an unordered collection of items, used to store non-duplicate items. The items themselves are immutable but the set can be mutable. Sets provide fast access to elements than lists.
Sets can also be used to perform mathematical set operations like union, intersection, symmetric difference, etc.
A set is created by placing all the items (elements) inside curly braces {}, separated by a comma, or by using the built-in set() function.ie my_set = set() or my_set = {2,4,6}.
A set can have any number of items, of different types(integers, floats, strings etc) but cannot have mutable types eg lists, dictionaries.
For a complete tutorial on python sets, you can check out [(https://www.programiz.com/python-programming/set)].

Linked-lists

As we have mentioned above, lists are implemented as dynamic arrays in python. This proves to be inefficient when doing operations on lists eg adding items. Because the list has to be copied to different memory locations and this adds an overhead when the list is big enough. This is what the linked list is trying to solve. While lists use a contiguous memory block to store references to their data, linked lists store references as part of their elements.
Every linked list is composed of nodes-every node has two parts: data and a pointer to the next node.
Linked lists can be used to implement queues and stacks as well as graphs.
Check out this tutorial for a complete discussion of linked lists including examples of implementation in python:
[(https://www.youtube.com/watch?v=Bd1L64clh34)]

Queues and Stacks

Queues and stacks differ only in the way elements are retrieved. For a queue, you use a First-In/First-Out (FIFO) approach. That means that the first element inserted in the list is the first one to be retrieved.
For a stack, you use a Last-In/Fist-Out (LIFO) approach, meaning that the last element inserted in the list is the first to be retrieved:
This Tutorial[(https://realpython.com/linked-lists-python/#understanding-linked-lists)] explains these data structures in depth.

ALGORITHMS

An algorithm is a finite sequence of well-defined instructions, typically used to solve a class of specific problems or to perform a computation.
There are different classes of algorithms including:

Divide and Conquer algorithms
Dynamic programming algorithms
Greedy algorithms
Randomized algorithms
And many others.

Below are elements of a good algorithm:

Steps should be finite, clear and understandable.
There should be a clear description of input and output.
Each step must have a defined output that depends only on inputs in that step or the preceding steps.
It must be flexible.

check out this comprehensive course on freecodecamp for a well-illustrated discussion of popular algorithms and their implementations in python:
[(https://www.youtube.com/results?search_query=algorithms+python)]

Conclusion

Data structures and algorithms is a huge topic and require a lot of time to study(usually a whole semester of work). The links in this tutorial provide comprehensive discussions on each topic. Feel free to check them out. Also please suggest more resources in the comments section below so that we can learn together.

Happy coding.

DEV Community