<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Andrei Visoiu</title>
    <description>The latest articles on DEV Community by Andrei Visoiu (@kruzzy).</description>
    <link>https://dev.to/kruzzy</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F265486%2F0f8109c0-8afd-49b6-9584-a3165804115d.jpeg</url>
      <title>DEV Community: Andrei Visoiu</title>
      <link>https://dev.to/kruzzy</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/kruzzy"/>
    <language>en</language>
    <item>
      <title>LeetCode Explained: 50. Pow(x, n) - Logarithmic Exponentiation (medium)</title>
      <dc:creator>Andrei Visoiu</dc:creator>
      <pubDate>Mon, 26 Jul 2021 13:26:26 +0000</pubDate>
      <link>https://dev.to/kruzzy/leetcode-explained-50-pow-x-n-logarithmic-exponentiation-medium-3p1o</link>
      <guid>https://dev.to/kruzzy/leetcode-explained-50-pow-x-n-logarithmic-exponentiation-medium-3p1o</guid>
      <description>&lt;h2&gt;
  
  
  Problem Description
&lt;/h2&gt;

&lt;p&gt;The task at hand is quite clear: implement an exponentiation function, &lt;strong&gt;pow(x, n)&lt;/strong&gt;, that raises &lt;strong&gt;x&lt;/strong&gt; to the power of &lt;strong&gt;n&lt;/strong&gt;, where &lt;strong&gt;n&lt;/strong&gt; is an integer. &lt;a href="https://leetcode.com/problems/powx-n/submissions/"&gt;Original link here.&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Constraints:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;-100.0 &amp;lt; x &amp;lt; 100.0&lt;/li&gt;
&lt;li&gt;-2^31 &amp;lt;= n &amp;lt;= 2^31-1&lt;/li&gt;
&lt;li&gt;-10^4 &amp;lt;= x^n &amp;lt;= 10^4&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Example 1:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;input: x = 2.10000, n = 3&lt;/li&gt;
&lt;li&gt;output: 9.26100&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Example 2:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;input: x = 2.00000, n = -2&lt;/li&gt;
&lt;li&gt;output: 0.25000&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Let's talk about solutions
&lt;/h2&gt;

&lt;p&gt;The naive approach is to multiply the number &lt;strong&gt;x&lt;/strong&gt; by itself &lt;strong&gt;n&lt;/strong&gt; times (and, when &lt;strong&gt;n&lt;/strong&gt; is negative, to multiply &lt;strong&gt;1/x&lt;/strong&gt; by itself the corresponding number of times). This takes O(n) time and is far from the fastest option.&lt;/p&gt;

&lt;p&gt;The optimal approach is obtained by making a simple statement: &lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--rwGKvZFh--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/5fb18xhbo4m3j9a5gj0y.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--rwGKvZFh--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/5fb18xhbo4m3j9a5gj0y.png" alt="alt text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;And then:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--SRc-JECn--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/4wgs12el4aesvrn7shwy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--SRc-JECn--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/4wgs12el4aesvrn7shwy.png" alt="alt text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We can then calculate &lt;strong&gt;x&lt;/strong&gt; to the power of &lt;strong&gt;n&lt;/strong&gt; by calculating &lt;strong&gt;x^2&lt;/strong&gt;, then &lt;strong&gt;x^4&lt;/strong&gt; (by multiplying &lt;strong&gt;x^2&lt;/strong&gt; by itself) and so forth, until we get to &lt;strong&gt;x^n&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The algorithm looks like this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;if &lt;strong&gt;n&lt;/strong&gt; is negative, we replace &lt;strong&gt;x&lt;/strong&gt; with &lt;strong&gt;1/x&lt;/strong&gt; and &lt;strong&gt;n&lt;/strong&gt; with &lt;strong&gt;-n&lt;/strong&gt;; we also initialise a helper variable, &lt;strong&gt;a&lt;/strong&gt;, to 1, which will accumulate the result.&lt;/li&gt;
&lt;li&gt;as long as &lt;strong&gt;n&lt;/strong&gt; is greater than 0:

&lt;ol&gt;
&lt;li&gt;if we find an odd &lt;strong&gt;n&lt;/strong&gt;, we multiply &lt;strong&gt;a&lt;/strong&gt; by &lt;strong&gt;x&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;we multiply &lt;strong&gt;x&lt;/strong&gt; by itself and store the result in &lt;strong&gt;x&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;we divide &lt;strong&gt;n&lt;/strong&gt; by 2 and store the result in &lt;strong&gt;n&lt;/strong&gt;.&lt;/li&gt;
&lt;/ol&gt;
&lt;/li&gt;
&lt;li&gt;when &lt;strong&gt;n&lt;/strong&gt; becomes 0, the result is in &lt;strong&gt;a&lt;/strong&gt;, because &lt;strong&gt;n&lt;/strong&gt; is guaranteed to be odd at some point: repeatedly halving &lt;strong&gt;n&lt;/strong&gt; strips the factors of 2 from its prime factorisation (2 is the only even prime factor), and once none are left - at the latest when &lt;strong&gt;n&lt;/strong&gt; reaches 1 - the current power of &lt;strong&gt;x&lt;/strong&gt; is folded into &lt;strong&gt;a&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;With this approach, we perform only &lt;strong&gt;log n&lt;/strong&gt; operations. If &lt;strong&gt;n&lt;/strong&gt; doubles, the number of operations performed grows by just 1, whereas in the first approach it would double as well.&lt;/p&gt;

&lt;p&gt;This is a very basic implementation of the algorithm:&lt;/p&gt;


&lt;div class="ltag_gist-liquid-tag"&gt;
  
&lt;/div&gt;
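&lt;p&gt;The embedded gist does not render in this feed, so here is a minimal Python sketch of the loop described above (an illustrative reconstruction rather than the original gist; the function name &lt;strong&gt;my_pow&lt;/strong&gt; is my own):&lt;/p&gt;

```python
def my_pow(x, n):
    # Negative exponent: invert the base and work with -n instead.
    if 0 > n:
        x, n = 1 / x, -n
    a = 1.0  # accumulator for the result
    while n > 0:
        if n % 2 == 1:   # odd n: fold the current power of x into the result
            a *= x
        x *= x           # square the base
        n //= 2          # halve the exponent (integer division)
    return a
```

&lt;p&gt;For instance, my_pow(2.0, -2) returns 0.25, matching Example 2 above.&lt;/p&gt;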


&lt;p&gt;This approach is called logarithmic exponentiation, and one example where it proves useful is computing recurrence relations. I provided a more detailed description of the matter in &lt;a href="https://dev.to/kruzzy/the-magic-of-the-fibonacci-numbers-why-we-love-computing-them-part-3-4ce7"&gt;this article I wrote some time ago&lt;/a&gt;, which explains how to compute Fibonacci numbers using fast matrix exponentiation.&lt;/p&gt;

&lt;p&gt;That concludes today's solution. I'll be back with other LeetCode solutions (including harder ones!) for the &lt;strong&gt;LeetCode Explained&lt;/strong&gt; series, so don't forget to subscribe!&lt;/p&gt;

</description>
      <category>computerscience</category>
      <category>algorithms</category>
      <category>python</category>
      <category>mathematics</category>
    </item>
    <item>
      <title>LeetCode explained: July Challenge 2021, week 4 - Partition Array into Disjoint Sets (medium)</title>
      <dc:creator>Andrei Visoiu</dc:creator>
      <pubDate>Thu, 22 Jul 2021 12:37:11 +0000</pubDate>
      <link>https://dev.to/kruzzy/leetcode-explained-july-challenge-2021-week-4-partition-array-into-disjoint-sets-medium-54ic</link>
      <guid>https://dev.to/kruzzy/leetcode-explained-july-challenge-2021-week-4-partition-array-into-disjoint-sets-medium-54ic</guid>
      <description>&lt;h2&gt;
  
  
  Problem Description
&lt;/h2&gt;

&lt;p&gt;In the July LeetCoding Challenge 2021, week 4, we're tasked with the following (&lt;a href="https://leetcode.com/explore/challenge/card/july-leetcoding-challenge-2021/611/week-4-july-22nd-july-28th/3823/"&gt;original problem here&lt;/a&gt;):&lt;/p&gt;

&lt;p&gt;Given an array &lt;strong&gt;nums&lt;/strong&gt;, partition it into two (contiguous) subarrays &lt;strong&gt;left&lt;/strong&gt; and &lt;strong&gt;right&lt;/strong&gt; so that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;every element in &lt;strong&gt;left&lt;/strong&gt; is less than or equal to every element in &lt;strong&gt;right&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;left&lt;/strong&gt; and &lt;strong&gt;right&lt;/strong&gt; are non-empty.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;left&lt;/strong&gt; has the smallest possible size.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We have to return the length of &lt;strong&gt;left&lt;/strong&gt; after such a partitioning. The problem has the following constraints: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;2 &amp;lt;= nums.length &amp;lt;= 30000&lt;/li&gt;
&lt;li&gt;0 &amp;lt;= nums[i] &amp;lt;= 10^6&lt;/li&gt;
&lt;li&gt;it is guaranteed there is at least one way to partition nums as described.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Example 1:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;nums = [5,0,3,8,6] &lt;/li&gt;
&lt;li&gt;output: 3 &lt;/li&gt;
&lt;li&gt;explanation: left = [5, 0, 3] / right = [8, 6]&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Example 2:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;nums = [1,1,1,0,6,12]&lt;/li&gt;
&lt;li&gt;output: 4&lt;/li&gt;
&lt;li&gt;explanation: left = [1,1,1,0] /  right = [6,12]&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Finding a solution
&lt;/h2&gt;

&lt;p&gt;As we are being tasked with finding subarrays, which are &lt;strong&gt;contiguous&lt;/strong&gt;, it is only natural to search for an index in the array to make a "cut" - that is, the index after which array &lt;strong&gt;left&lt;/strong&gt; ends.&lt;/p&gt;

&lt;p&gt;We will make use of maximum values in subarrays of the array to devise a solution. &lt;/p&gt;

&lt;p&gt;It is a requirement that every element of &lt;strong&gt;left&lt;/strong&gt; has to be smaller than or equal to every element of &lt;strong&gt;right&lt;/strong&gt;, and &lt;strong&gt;left&lt;/strong&gt; has to be as small as possible. Let's think about this a little. &lt;/p&gt;

&lt;p&gt;For the first requirement, it would be sufficient to just find the maximum in the whole array and make a "cut" behind it. That is, if the maximum was at index &lt;strong&gt;i&lt;/strong&gt; (suppose we start from 0), then &lt;strong&gt;left&lt;/strong&gt; will contain indexes from 0 through &lt;strong&gt;i-1&lt;/strong&gt; of nums, and &lt;strong&gt;right&lt;/strong&gt; will contain indexes from &lt;strong&gt;i&lt;/strong&gt; through &lt;strong&gt;n-1&lt;/strong&gt;, if &lt;strong&gt;n&lt;/strong&gt; is the size of array &lt;strong&gt;nums&lt;/strong&gt;.&lt;br&gt;
That would, however, make &lt;strong&gt;left as big as possible&lt;/strong&gt;, not as small as possible as it is required.&lt;/p&gt;

&lt;p&gt;To keep the size of &lt;strong&gt;left&lt;/strong&gt; as small as possible, we instead look for the first index &lt;strong&gt;i&lt;/strong&gt; such that the maximum of nums[0..i] (which need not be the maximum of the whole array) is less than or equal to every element of nums[i+1..n-1].&lt;/p&gt;

&lt;p&gt;The second example illustrates this: the maximum of the prefix nums[0..3] is 1, and no element to its right (6 and 12) is smaller than that maximum, so the earliest valid cut is after index 3.&lt;/p&gt;

&lt;p&gt;The problem then reduces to finding such an index. We can do this in O(n) time and O(1) extra space.&lt;/p&gt;

&lt;p&gt;We will traverse the array while keeping two variables: &lt;strong&gt;left_max&lt;/strong&gt;, the maximum value of the &lt;strong&gt;left&lt;/strong&gt; partition built so far, and &lt;strong&gt;last_max&lt;/strong&gt;, the maximum of all values encountered up to the current point.&lt;/p&gt;

&lt;p&gt;Initially, we set both of those variables to the first element of the array. We also keep a variable named &lt;strong&gt;cut&lt;/strong&gt;, the index after which we split the array into &lt;strong&gt;left&lt;/strong&gt; and &lt;strong&gt;right&lt;/strong&gt;; it starts at 0, so at first the array is cut right after the first element.&lt;/p&gt;

&lt;p&gt;While looping through the array, there are two types of numbers we can find at one index &lt;strong&gt;i&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a number smaller than &lt;strong&gt;left_max&lt;/strong&gt;: we must move the cut to index &lt;strong&gt;i&lt;/strong&gt;, because otherwise nums[i] would end up in &lt;strong&gt;right&lt;/strong&gt; while being smaller than the maximum of &lt;strong&gt;left&lt;/strong&gt;, violating the requirement; when the cut moves, &lt;strong&gt;left_max&lt;/strong&gt; takes the value of &lt;strong&gt;last_max&lt;/strong&gt;, since everything seen so far now belongs to &lt;strong&gt;left&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;a number greater than or equal to &lt;strong&gt;left_max&lt;/strong&gt;: we update &lt;strong&gt;last_max&lt;/strong&gt; if needed, so that whenever a cut is made we know the maximum of the &lt;strong&gt;left&lt;/strong&gt; array in O(1) time.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Let's see a short Python solution using this approach:&lt;/p&gt;


&lt;div class="ltag_gist-liquid-tag"&gt;
  
&lt;/div&gt;
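&lt;p&gt;The embedded gist does not render in this feed; the following is a short Python sketch of the traversal described above (a reconstruction under the stated approach, not the original gist, with illustrative names):&lt;/p&gt;

```python
def partition_disjoint(nums):
    left_max = nums[0]   # maximum of the current left partition
    last_max = nums[0]   # maximum of everything seen so far
    cut = 0              # left = nums[0..cut]
    for i in range(1, len(nums)):
        if left_max > nums[i]:
            # nums[i] must belong to left: extend the cut to index i,
            # and everything seen so far becomes part of left.
            cut = i
            left_max = last_max
        else:
            last_max = max(last_max, nums[i])
    return cut + 1  # the length of left
```

&lt;p&gt;For instance, partition_disjoint([5,0,3,8,6]) returns 3, matching Example 1.&lt;/p&gt;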


&lt;p&gt;That would be all for this solution. I'll be back with other LeetCode solutions for my new &lt;strong&gt;LeetCode Explained&lt;/strong&gt; series. In the meantime, feel free to comment your solutions in other programming languages here. Or, if you don't feel like it, you may like to read through my other series - &lt;a href="https://dev.to/kruzzy/series/3724"&gt;The Magic of Computing&lt;/a&gt;, discussing other algorithmic topics!&lt;/p&gt;

</description>
      <category>algorithms</category>
      <category>computerscience</category>
      <category>programming</category>
      <category>python</category>
    </item>
    <item>
      <title>A look into Dynamic Programming - Matrix Chain Multiplication</title>
      <dc:creator>Andrei Visoiu</dc:creator>
      <pubDate>Sun, 18 Jul 2021 12:00:01 +0000</pubDate>
      <link>https://dev.to/kruzzy/a-look-into-dynamic-programming-matrix-chain-multiplication-34gb</link>
      <guid>https://dev.to/kruzzy/a-look-into-dynamic-programming-matrix-chain-multiplication-34gb</guid>
      <description>&lt;p&gt;In the beginning of the &lt;a href="https://dev.to/kruzzy/using-divide-and-conquer-closest-pair-of-points-5e2g"&gt;last article I wrote&lt;/a&gt;, I described two ways of solving a problem by splitting it into subproblems: on one hand, those problems can be solved independently from one another (a method called divide &amp;amp; conquer, which I described in the article); on the other hand, they can interact with each other, building up on the results. Problems on the latter category can be solved using a method called &lt;strong&gt;dynamic programming&lt;/strong&gt;, which will be the topic for today.&lt;/p&gt;

&lt;h2&gt;
  
  
  Formal Definition of Dynamic Programming
&lt;/h2&gt;

&lt;p&gt;In the field of Computer Science, Dynamic Programming is derived from a mathematical optimisation method. It refers to simplifying a problem by breaking it down into smaller subproblems. When those subproblems overlap, their results can be combined, through a recurrence relation, into the result of the larger problem.&lt;/p&gt;

&lt;p&gt;For example, by modifying the &lt;a href="https://dev.to/kruzzy/why-is-graph-theory-so-amazing-part-3-bfs-bipartite-graphs-2860"&gt;BFS algorithm I presented in this article&lt;/a&gt; to find the shortest path in an unweighted graph we can obtain a dynamic programming solution to the problem. &lt;/p&gt;

&lt;p&gt;This is possible by making a simple statement: if &lt;strong&gt;i&lt;/strong&gt; and &lt;strong&gt;j&lt;/strong&gt; are two nodes in an unweighted graph, then the shortest path from &lt;strong&gt;i&lt;/strong&gt; to &lt;strong&gt;j&lt;/strong&gt; passes through some neighbour of &lt;strong&gt;j&lt;/strong&gt;, so it can be obtained from the shortest path from &lt;strong&gt;i&lt;/strong&gt; to that neighbour. Described in pseudocode:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;min_dist[i][j] = infinity
for every neighbour k of j:
   min_dist[i][j] = min(min_dist[i][k]+1, min_dist[i][j])
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;The last line of the snippet is called a &lt;strong&gt;recurrence relation&lt;/strong&gt; (such relations are widely used in mathematics; another example is the way the &lt;a href="https://dev.to/kruzzy/the-magic-of-the-fibonacci-numbers-why-we-love-computing-them-part-1-18gp"&gt;Fibonacci sequence&lt;/a&gt; is calculated).&lt;/p&gt;
&lt;h2&gt;
  
  
  Subproblems and Memoization
&lt;/h2&gt;

&lt;p&gt;Subproblems are basically smaller instances (or versions) of the original problem. By saying that a problem has "overlapping subproblems", we mean that finding its solution involves solving the same subproblem multiple times.&lt;/p&gt;

&lt;p&gt;An accessible example is calculating the n-th Fibonacci number, which I presented in &lt;a href="https://dev.to/kruzzy/the-magic-of-the-fibonacci-numbers-why-we-love-computing-them-part-1-18gp"&gt;an earlier article&lt;/a&gt;. Let's look again at the recursion tree of the problem:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0ryh3287x4xy4ox7piye.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0ryh3287x4xy4ox7piye.png" alt="alt text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It is clear that, if we do not store the results in some way, some numbers will be calculated multiple times, resulting in a staggering time complexity of O(1.62^n) (see the &lt;a href="https://dev.to/kruzzy/the-magic-of-the-fibonacci-numbers-why-we-love-computing-them-part-1-18gp"&gt;article&lt;/a&gt; for details on how this was calculated). &lt;br&gt;
The fix is a technique called "memoization": we store the value of a Fibonacci number in an array after we first calculate it, for later reuse. This decreases the time complexity, in this case, to O(n).&lt;/p&gt;
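&lt;p&gt;As a quick illustration (a sketch added for clarity, not code from the original article), Python's functools.lru_cache memoises the naive recursion, so each Fibonacci number is computed only once:&lt;/p&gt;

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n):
    # Each fib(k) is computed once and cached, giving O(n) total work
    # instead of the O(1.62^n) of the plain recursion.
    if 2 > n:
        return n
    return fib(n - 1) + fib(n - 2)
```

&lt;p&gt;For example, fib(10) returns 55.&lt;/p&gt;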

&lt;p&gt;Memoization is widely used in dynamic programming (which is, in essence, an optimisation technique). Let us see how we can create such a solution.&lt;/p&gt;
&lt;h2&gt;
  
  
  Matrix Chain Multiplication
&lt;/h2&gt;

&lt;p&gt;We know that matrix multiplication is &lt;strong&gt;not&lt;/strong&gt; a commutative operation, but it is associative. It also turns out that the order in which the multiplication is done affects the overall number of operations you do.&lt;/p&gt;

&lt;p&gt;Let's suppose we have three matrices:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;A&lt;/strong&gt;, of size 3 x 1 - a column matrix&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;B&lt;/strong&gt;, of size 1 x 3 - a row matrix&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;C&lt;/strong&gt;, of size 3 x 1 - a column matrix again&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We can multiply them in two ways: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;(AB)C - multiplying A and B would yield a 3 x 3 matrix, and would take 9 operations. Multiplying (AB) with C would take another 9 operations, for a total of 18 operations.&lt;/li&gt;
&lt;li&gt;A(BC) - multiplying B and C takes only 3 operations and yields a 1 x 1 matrix. Multiplying A with (BC) takes another 3 operations, for a total of 6 operations.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Keeping that in mind, we ask the question: what is the best order to do the multiplication? &lt;/p&gt;

&lt;p&gt;Let's suppose we have N matrices (M_1 through M_N) whose sizes we store in an array S, such that matrix &lt;strong&gt;i&lt;/strong&gt; has dimensions S[i-1] x S[i]. &lt;/p&gt;

&lt;p&gt;We can solve the problem using dynamic programming by making the following observation: the first thing we need to determine is which multiplication should be done last. In other words, we search for an index &lt;strong&gt;i&lt;/strong&gt; such that our expression looks like (M_1 * M_2 * ... M_i) * (M_(i+1) * ... M_N), with both products in parentheses also calculated optimally. &lt;/p&gt;

&lt;p&gt;We can construct an N x N 2D array, let's call it A, such that A[i][j] will hold the minimum cost (number of operations) to compute the product of matrices M_i through M_j. We will use this array to memoise the results.&lt;/p&gt;

&lt;p&gt;Let's see how we can calculate the cost of a "cut" in the product of matrices M_i through M_j. If we place the parentheses like (M_i * M_(i+1) * ... M_k) * (M_(k+1) * ... M_j), the cost is the sum of the costs of the two parentheses plus the cost of multiplying the two matrices they yield, which is S[i-1] * S[k] * S[j], as the first result has size S[i-1] x S[k] and the second has size S[k] x S[j].&lt;/p&gt;

&lt;p&gt;We now just have to find the best &lt;strong&gt;k&lt;/strong&gt; for our cut. We can do this recursively. Let us look at an implementation of the idea:&lt;/p&gt;


&lt;div class="ltag_gist-liquid-tag"&gt;
  
&lt;/div&gt;
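&lt;p&gt;The embedded gist does not render in this feed; the following is a Python sketch of the recursive, memoised idea described above (my reconstruction rather than the original gist; matrix_chain_cost and the memo dictionary are illustrative names, and S follows the convention that matrix M_k has dimensions S[k-1] x S[k]):&lt;/p&gt;

```python
import sys

def matrix_chain_cost(S, i, j, memo):
    # Minimum number of scalar multiplications to compute M_i * ... * M_j,
    # where matrix M_k has dimensions S[k-1] x S[k].
    if i == j:
        return 0  # a single matrix needs no multiplication
    if (i, j) in memo:
        return memo[(i, j)]  # memoised result, computed earlier
    best = sys.maxsize
    for k in range(i, j):
        # Cut after M_k: cost of each side plus the final multiplication.
        cost = (matrix_chain_cost(S, i, k, memo)
                + matrix_chain_cost(S, k + 1, j, memo)
                + S[i - 1] * S[k] * S[j])
        best = min(best, cost)
    memo[(i, j)] = best
    return best

# Sizes from the example above: A is 3 x 1, B is 1 x 3, C is 3 x 1.
S = [3, 1, 3, 1]
print(matrix_chain_cost(S, 1, 3, {}))  # prints 6
```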



&lt;p&gt;The matrix sizes are the ones from the example above - the row and column matrices. The code outputs 6, as we concluded earlier.&lt;/p&gt;

&lt;p&gt;This was achieved recursively, by first calling matrix_chain_cost for positions 1 through N. We used &lt;strong&gt;memoisation&lt;/strong&gt; to avoid redundant calculations, and then applied the formula we found above.&lt;/p&gt;

&lt;p&gt;The time complexity of the code above is O(n^3), as we are basically generating the cost for all the "cuts" we can do in the expression.&lt;/p&gt;

&lt;p&gt;That was all for today. &lt;strong&gt;The Magic of Computing&lt;/strong&gt; will be back with yet another interesting algorithmic topic. But, until then, maybe you fancy some &lt;a href="https://dev.to/kruzzy/why-is-graph-theory-so-amazing-part-1-5ii"&gt;Graph Theory&lt;/a&gt;? Or are you more of a &lt;a href="https://dev.to/kruzzy/exploring-backtracking-25dp"&gt;Backtracking&lt;/a&gt; person?&lt;/p&gt;

</description>
      <category>programming</category>
      <category>python</category>
      <category>algorithms</category>
      <category>computerscience</category>
    </item>
    <item>
      <title>Using divide and conquer: closest pair of points</title>
      <dc:creator>Andrei Visoiu</dc:creator>
      <pubDate>Sun, 09 May 2021 18:16:06 +0000</pubDate>
      <link>https://dev.to/kruzzy/using-divide-and-conquer-closest-pair-of-points-5e2g</link>
      <guid>https://dev.to/kruzzy/using-divide-and-conquer-closest-pair-of-points-5e2g</guid>
      <description>&lt;p&gt;What's a reasonable way to tackle a problem that looks hard to solve? Well, you would need to try to make the problem easier and there are some problems that can be solved by &lt;strong&gt;going smaller&lt;/strong&gt; - splitting the problem multiple times into sub-problems, looking for an answer.&lt;/p&gt;

&lt;p&gt;Those sub-problems can either interact with each other (i.e. requiring one of them to be solved before solving another one) or they could be entirely separate from one another, perfectly fitted for solving separately and then combining them. &lt;/p&gt;

&lt;p&gt;Both cases describe problems that can be solved using two programming paradigms widely considered to be siblings: &lt;strong&gt;dynamic programming&lt;/strong&gt; for the first type, and &lt;strong&gt;divide and conquer&lt;/strong&gt; for the second. For now, we will look into the second type of problem; the first will get its own article later.&lt;/p&gt;

&lt;h2&gt;
  
  
  Defining Divide and Conquer Formally
&lt;/h2&gt;

&lt;p&gt;Divide and conquer is an algorithm design paradigm which works by recursively breaking down a problem into a number of sub-problems until they become easy enough to be solved directly and then combining their solutions.&lt;/p&gt;

&lt;p&gt;In general, three steps can be observed in the algorithms designed using this paradigm:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Divide&lt;/strong&gt; the problem into smaller sub-problems (i.e. smaller instances of the problem, like sorting an array of size &lt;strong&gt;N&lt;/strong&gt;/2 instead of one of size &lt;strong&gt;N&lt;/strong&gt;).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Conquer&lt;/strong&gt; the sub-problems, solving them recursively. If a sub-problem has become small enough, solve it directly. (i.e. sort an array of two numbers)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Combine&lt;/strong&gt; the solutions found directly by moving up the recursive stack. This allows for the sub-problems to pass the solution to their "parent" problems, until getting to the "big" problem. (i.e. merge two sorted arrays)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The examples I have given for each step sum up to explain the concept of one of the most efficient comparison-based sorting algorithms - &lt;a href="https://en.wikipedia.org/wiki/Merge_sort" rel="noopener noreferrer"&gt;Merge Sort&lt;/a&gt;, which follows a Divide and Conquer approach.&lt;br&gt;
&lt;a href="https://en.wikipedia.org/wiki/Quicksort" rel="noopener noreferrer"&gt;Quicksort&lt;/a&gt;, another efficient sorting algorithm, also follows this design paradigm.&lt;/p&gt;

&lt;p&gt;However, in this article we will focus on another extremely interesting problem that can be solved using a Divide and Conquer approach. Let's state it.&lt;/p&gt;

&lt;p&gt;Consider a Euclidean plane containing &lt;strong&gt;N&lt;/strong&gt; points given by their &lt;em&gt;x&lt;/em&gt; and &lt;em&gt;y&lt;/em&gt; coordinates. Determine the distance between the closest two points in the plane.&lt;/p&gt;

&lt;p&gt;We know that the distance between two points, &lt;em&gt;A&lt;/em&gt; and &lt;em&gt;B&lt;/em&gt;, can be expressed as: &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqceoejqh4y3erok5hbo8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqceoejqh4y3erok5hbo8.png" alt="distance" width="261" height="42"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;From now on, we may denote this distance as d(&lt;em&gt;A&lt;/em&gt;, &lt;em&gt;B&lt;/em&gt;), for simplicity. &lt;/p&gt;

&lt;p&gt;We can devise a straightforward solution - consider every pair of points &lt;em&gt;A&lt;/em&gt; and &lt;em&gt;B&lt;/em&gt; and calculate their distance. This approach has an O(N^2) time complexity.&lt;/p&gt;

&lt;p&gt;Let's try and find another solution by following a Divide and Conquer approach.&lt;/p&gt;

&lt;p&gt;In order to do that, we consider subsets of points at each step and find the size at which computing the closest pair directly becomes cheap. Let &lt;strong&gt;P&lt;/strong&gt; be a subset of points. &lt;br&gt;
The smallest subset of &lt;strong&gt;P&lt;/strong&gt; we still split is one of size 4: if we went smaller and split a subset of 3 points in two, one half would contain a single point, in which there is no pair to search for.&lt;/p&gt;

&lt;p&gt;(in the lines below, &lt;strong&gt;|P|&lt;/strong&gt; means the size, or cardinality, of &lt;strong&gt;P&lt;/strong&gt;)&lt;/p&gt;

&lt;p&gt;Then:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;if &lt;strong&gt;|P|&lt;/strong&gt; &amp;lt; 4, consider all &lt;strong&gt;|P|&lt;/strong&gt;(&lt;strong&gt;|P|&lt;/strong&gt;-1)/2 pairs and remember the smallest distance.&lt;/li&gt;
&lt;li&gt;if &lt;strong&gt;|P|&lt;/strong&gt; &amp;gt;= 4, let us follow the paradigm "recipe":

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Divide&lt;/strong&gt; - let's determine a vertical line, let it be called &lt;strong&gt;a&lt;/strong&gt;, which "cuts" our set of points &lt;strong&gt;P&lt;/strong&gt; in two subsets. Let us call them &lt;strong&gt;PL&lt;/strong&gt; and &lt;strong&gt;PR&lt;/strong&gt;, for &lt;strong&gt;P left&lt;/strong&gt; and &lt;strong&gt;P right&lt;/strong&gt;. We need to determine this line such that the subsets are as close in size as possible.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Conquer&lt;/strong&gt; - recursively find the two closest points in &lt;strong&gt;PL&lt;/strong&gt; and &lt;strong&gt;PR&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Combine&lt;/strong&gt; - let us assume that the answer to the problem, for set &lt;strong&gt;P&lt;/strong&gt;, is &lt;strong&gt;D&lt;/strong&gt;. Then, &lt;strong&gt;D&lt;/strong&gt; can come from one of the two recursive calls, or it can be given by a pair with one point in &lt;strong&gt;PL&lt;/strong&gt; and the other in &lt;strong&gt;PR&lt;/strong&gt;. If we already have a candidate for &lt;strong&gt;D&lt;/strong&gt; from the recursive calls, call it &lt;strong&gt;d&lt;/strong&gt;, such a pair can only come from a region extending at most &lt;strong&gt;d&lt;/strong&gt; on each side of the line. We then have to pick the points from &lt;strong&gt;PL&lt;/strong&gt; situated at distance at most &lt;strong&gt;d&lt;/strong&gt; from &lt;em&gt;a&lt;/em&gt;, and the same for &lt;strong&gt;PR&lt;/strong&gt;.&lt;/li&gt;
&lt;/ol&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We can find those points at step 3 by building another set, &lt;strong&gt;Q&lt;/strong&gt;, storing the points on either side of line &lt;em&gt;a&lt;/em&gt; that are, at most, at distance &lt;strong&gt;d&lt;/strong&gt; from line &lt;em&gt;a&lt;/em&gt;. We can then brute-force our way to finding if there is any pair of points that can result in a smaller distance.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffack9btmcnd2a6rao8ok.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffack9btmcnd2a6rao8ok.png" alt="Alt Text" width="358" height="404"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For example, considering the set of points I chose above, it is possible to obtain an answer to the problem by looking in the area determined by the red rectangle. Bear in mind that this drawing is not exact, as I have approximated the distances and how the area would look myself.&lt;/p&gt;

&lt;p&gt;Now let's look at a Python implementation of the method I presented. We will work with the squares of the distances to improve performance.&lt;/p&gt;


&lt;div class="ltag_gist-liquid-tag"&gt;
  
&lt;/div&gt;
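&lt;p&gt;The embedded gist does not render in this feed; the following is a compact Python sketch of the divide and conquer method described above (a reconstruction, not the original gist; it works with squared distances internally, as suggested, and takes the square root only at the end - all names are illustrative):&lt;/p&gt;

```python
def closest_pair(points):
    # points: list of (x, y) tuples.
    def dist2(a, b):
        # Squared Euclidean distance, avoiding the square root.
        return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2

    def brute(pts):
        # Direct check of all pairs, used for subsets smaller than 4.
        best = float("inf")
        for i in range(len(pts)):
            for j in range(i + 1, len(pts)):
                best = min(best, dist2(pts[i], pts[j]))
        return best

    def solve(px):  # px is sorted by x coordinate
        if 4 > len(px):
            return brute(px)
        # Divide: vertical line through the median x coordinate.
        mid = len(px) // 2
        midx = px[mid][0]
        # Conquer: best squared distance in each half.
        d = min(solve(px[:mid]), solve(px[mid:]))
        # Combine: candidates within distance sqrt(d) of the dividing line.
        strip = [p for p in px if d > (p[0] - midx) ** 2]
        strip.sort(key=lambda p: p[1])
        # In y order, only the next few points can beat d (classic bound).
        for i in range(len(strip)):
            for j in range(i + 1, min(i + 8, len(strip))):
                d = min(d, dist2(strip[i], strip[j]))
        return d

    return solve(sorted(points)) ** 0.5
```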


&lt;p&gt;That concludes our article for today. &lt;strong&gt;The Magic of Computing&lt;/strong&gt; will be back with another article in which we will discuss the other problem solving paradigm I mentioned in this article - &lt;strong&gt;dynamic programming&lt;/strong&gt;. Until then, you could take a look at some &lt;a href="https://dev.to/kruzzy/exploring-backtracking-25dp"&gt;backtracking insights&lt;/a&gt;. Or maybe you fancy &lt;a href="https://dev.to/kruzzy/why-is-graph-theory-so-amazing-part-1-5ii"&gt;Graph Theory&lt;/a&gt;. Or you're a big fan of the &lt;a href="https://dev.to/kruzzy/the-magic-of-the-fibonacci-numbers-why-we-love-computing-them-part-1-18gp"&gt;Fibonacci series&lt;/a&gt;?&lt;/p&gt;

</description>
      <category>programming</category>
      <category>python</category>
      <category>algorithms</category>
      <category>computerscience</category>
    </item>
    <item>
      <title>Exploring backtracking</title>
      <dc:creator>Andrei Visoiu</dc:creator>
      <pubDate>Sat, 01 May 2021 09:58:36 +0000</pubDate>
      <link>https://dev.to/kruzzy/exploring-backtracking-25dp</link>
      <guid>https://dev.to/kruzzy/exploring-backtracking-25dp</guid>
      <description>&lt;p&gt;Everyone has had at least slight meeting with backtracking throughout their lives, be it unknowingly - for example, as little kids, when navigating mazes for fun, when faced with a turn, say left or right, we will always choose one. However, the choice might not always prove correct - and we will go back, coming at the crossroads again, now choosing the other way, and reaching the exit.&lt;/p&gt;

&lt;p&gt;This is exactly the kind of work that backtracking algorithms do. Their aim is to find all (or some) solutions to constraint satisfaction problems:&lt;br&gt;
    - &lt;strong&gt;decision problems&lt;/strong&gt;, using it to find a feasible solution.&lt;br&gt;
    - &lt;strong&gt;optimisation problems&lt;/strong&gt;, using it to find the best solution.&lt;br&gt;
    - &lt;strong&gt;enumeration problems&lt;/strong&gt;, using it to find all the solutions.&lt;/p&gt;

&lt;p&gt;Defined formally, backtracking is an algorithmic technique for solving problems recursively by building a solution incrementally. It abandons a candidate as soon as it determines that the candidate cannot satisfy the constraints, then backtracks to the previous partial solution and tries to derive other solutions from it.&lt;/p&gt;
&lt;h2&gt;
  
  
  Implementing a basic backtracking problem
&lt;/h2&gt;

&lt;p&gt;Let us now take one of the most basic backtracking problems. Let &lt;strong&gt;N&lt;/strong&gt; be a natural number. Generate all the permutations of the numbers from 1 through &lt;strong&gt;N&lt;/strong&gt; in lexicographic order.&lt;/p&gt;

&lt;p&gt;For example, let &lt;strong&gt;N&lt;/strong&gt; equal 3. Then, the solutions to our problem will be:&lt;br&gt;
1 2 3 &lt;br&gt;
1 3 2 &lt;br&gt;
2 1 3 &lt;br&gt;
2 3 1&lt;br&gt;
3 1 2 &lt;br&gt;
3 2 1&lt;/p&gt;

&lt;p&gt;We can clearly see that this is an &lt;strong&gt;enumeration problem&lt;/strong&gt;, asking us to generate &lt;em&gt;all&lt;/em&gt; permutations of these numbers. &lt;/p&gt;

&lt;p&gt;A straightforward recursive backtracking approach is fairly easy to come up with. At any point in time, our recursive function should know &lt;strong&gt;how many numbers it has generated so far&lt;/strong&gt; and &lt;strong&gt;what those numbers are&lt;/strong&gt; (we can also call that a partial solution). Then, it should follow these steps:&lt;br&gt;
    1. If it has already generated &lt;strong&gt;N&lt;/strong&gt; numbers, stop and print the solution.&lt;br&gt;
    2. Loop through all numbers from 1 to &lt;strong&gt;N&lt;/strong&gt; in ascending order. If a certain number &lt;strong&gt;i&lt;/strong&gt; is already added, skip it. If not, add it to the solution and make another recursive call.&lt;/p&gt;

&lt;p&gt;We have two ways in which to check if a number &lt;strong&gt;i&lt;/strong&gt; was already added:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;loop through our partial solution and check if &lt;strong&gt;i&lt;/strong&gt; is already there.&lt;/li&gt;
&lt;li&gt;use an additional boolean array, which will be &lt;em&gt;true&lt;/em&gt; at index &lt;strong&gt;i&lt;/strong&gt; if the element has already been added, or &lt;em&gt;false&lt;/em&gt; otherwise.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Let's look at a basic implementation using the second method to check for added numbers:&lt;/p&gt;


&lt;div class="ltag_gist-liquid-tag"&gt;
  
&lt;/div&gt;
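&lt;p&gt;The steps above can be sketched as follows. This is a reconstruction, not the exact gist: the helper is named &lt;strong&gt;back&lt;/strong&gt; here, and the arrays are passed as parameters at every call.&lt;/p&gt;

```python
def back(solution, appears, n):
    # solution is the partial permutation; appears[i] is True once i is used
    if len(solution) == n:
        print(*solution)               # a full permutation: print it
        return
    for i in range(1, n + 1):          # try candidates in ascending order
        if not appears[i]:
            appears[i] = True
            back(solution + [i], appears, n)   # arrays travel with every call
            appears[i] = False         # undo, so other branches can use i

back([], [False] * 4, 3)
```

&lt;p&gt;Running it with &lt;strong&gt;N&lt;/strong&gt; = 3 prints the six permutations in lexicographic order, as listed earlier.&lt;/p&gt;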


&lt;p&gt;Note that, in this example, I passed the arrays to the recursive function at &lt;strong&gt;every single call&lt;/strong&gt;. &lt;br&gt;
However, when constrained by memory limits, another approach might be better suited to our needs.&lt;/p&gt;

&lt;p&gt;Let's first talk about why this approach might become problematic in memory-limited scenarios.&lt;/p&gt;
&lt;h2&gt;
  
  
  The call stack
&lt;/h2&gt;

&lt;p&gt;Recursion uses something called a &lt;a href="https://en.wikipedia.org/wiki/Call_stack" rel="noopener noreferrer"&gt;call stack&lt;/a&gt; to store information about the function calls. We can imagine this stack like a stack of boxes, one on top of the other. &lt;/p&gt;

&lt;p&gt;Each recursive call pushes a new frame onto the stack, &lt;strong&gt;together with all its parameters&lt;/strong&gt;. So a function call with arrays as arguments takes up more space than one with no arguments, and as &lt;strong&gt;N&lt;/strong&gt; increases, so does the size of our call stack. For large values of &lt;strong&gt;N&lt;/strong&gt; and tight memory limits, it might overflow.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;In Python, the default call stack limit is 1000&lt;/strong&gt;. This counts the number of function calls "on top of one another"; it is not a memory limit. Although not recommended, due to the way Python handles recursion, the limit can be modified using the &lt;strong&gt;setrecursionlimit()&lt;/strong&gt; function from the &lt;strong&gt;sys&lt;/strong&gt; module.&lt;/p&gt;
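&lt;p&gt;A quick illustration of inspecting and raising the limit (the default of 1000 holds for the usual CPython builds):&lt;/p&gt;

```python
import sys

print(sys.getrecursionlimit())   # 1000 on a default CPython build
sys.setrecursionlimit(5000)      # raise it with care: very deep recursion
                                 # can still crash the interpreter stack
print(sys.getrecursionlimit())
```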

&lt;p&gt;We can, then, make our two arrays, &lt;strong&gt;solution&lt;/strong&gt; and &lt;strong&gt;appears&lt;/strong&gt;, global. As you can see, we already made the necessary resets when exiting recursion: in general, performing some operations on variables before a recursive call (in our case, modifying the &lt;strong&gt;appears&lt;/strong&gt; array and adding the number to our &lt;strong&gt;solution&lt;/strong&gt; array) and performing the exact inverses after coming back from recursion will leave the variables as they were before. That makes sense, considering the way in which the call stack is managed, following a &lt;strong&gt;Last In, First Out&lt;/strong&gt; approach: the last recursively called function is the first to finish execution, so the variables are restored to their pre-recursion states every time a function is "popped" from the call stack.&lt;/p&gt;

&lt;p&gt;Below is the modified snippet:&lt;/p&gt;


&lt;div class="ltag_gist-liquid-tag"&gt;
  
&lt;/div&gt;
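&lt;p&gt;A sketch of the modified version with &lt;strong&gt;solution&lt;/strong&gt; and &lt;strong&gt;appears&lt;/strong&gt; as globals; again a reconstruction, collecting the permutations into a list instead of printing them:&lt;/p&gt;

```python
N = 3
solution = []                     # global partial permutation
appears = [False] * (N + 1)       # global "already used" flags
results = []

def back():
    global solution, appears      # declared for clarity; strictly required
                                  # only if we rebind these names, not when
                                  # mutating the lists in place
    if len(solution) == N:
        results.append(list(solution))
        return
    for i in range(1, N + 1):
        if not appears[i]:
            appears[i] = True
            solution.append(i)
            back()                # no arguments travel on the call stack now
            solution.pop()        # exact inverse of the pre-call operations
            appears[i] = False

back()
print(results)
```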


&lt;p&gt;In Python, to rebind a global variable inside a function definition, we have to declare it with the keyword &lt;strong&gt;global&lt;/strong&gt; before assigning to it; mutating a global list in place, as we do here, does not strictly require the declaration.&lt;/p&gt;

&lt;h2&gt;
  
  
  A way to visualise backtracking
&lt;/h2&gt;

&lt;p&gt;As you may remember if you have read my previous articles, we ran a mini-series on &lt;a href="https://dev.to/kruzzy/why-is-graph-theory-so-amazing-part-1-5ii"&gt;Graph Theory&lt;/a&gt;. Well, we can transpose backtracking into a graph theory problem by considering our search space a directed graph. For example, let us consider our problem and suppose we have already reached depth &lt;strong&gt;1&lt;/strong&gt; (i.e. we have already chosen the first number of our permutation).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgyji6lh1ekgnwlmvszb2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgyji6lh1ekgnwlmvszb2.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The exact steps of our algorithm are showcased in the picture above. Having already chosen 1, the algorithm first tries to place 1 again and dismisses that candidate. It then chooses 2, which works, and tries appending 1, 2 and 3 in turn, of which only 3 is feasible.&lt;/p&gt;

&lt;p&gt;Backtracking can be considered a &lt;a href="https://dev.to/kruzzy/why-is-graph-theory-so-amazing-part-2-depth-first-search-topological-sorting-jkg"&gt;depth first search&lt;/a&gt; on this state graph of our problem.&lt;/p&gt;

&lt;p&gt;I hope I have given you some insights regarding backtracking with this article. &lt;em&gt;The Magic of Computing&lt;/em&gt; will be back next week with another interesting computational topic, but, until then, why don't you delve into some mathematical subjects, say... &lt;a href="https://dev.to/kruzzy/prime-numbers-fast-and-slow-part-1-224f"&gt;prime numbers&lt;/a&gt;? Or maybe the &lt;a href="https://dev.to/kruzzy/the-magic-of-the-fibonacci-numbers-why-we-love-computing-them-part-1-18gp"&gt;Fibonacci series&lt;/a&gt; is more to your liking.&lt;/p&gt;

</description>
      <category>computerscience</category>
      <category>python</category>
      <category>algorithms</category>
      <category>backtracking</category>
    </item>
    <item>
      <title>Why is Graph Theory so amazing? - part 4, working with weights &amp; Dijkstra</title>
      <dc:creator>Andrei Visoiu</dc:creator>
      <pubDate>Fri, 16 Apr 2021 16:41:56 +0000</pubDate>
      <link>https://dev.to/kruzzy/why-is-graph-theory-so-amazing-part-4-working-with-weights-dijkstra-450k</link>
      <guid>https://dev.to/kruzzy/why-is-graph-theory-so-amazing-part-4-working-with-weights-dijkstra-450k</guid>
      <description>&lt;p&gt;In the ending of the &lt;a href="https://dev.to/kruzzy/why-is-graph-theory-so-amazing-part-3-bfs-bipartite-graphs-2860"&gt;previous article&lt;/a&gt;, we said that breadth-first search can be modified to obtain completely different behaviours. Today, we are going to see how exactly we can modify BFS to obtain a brand new algorithm. Firstly though, we need to introduce the weighted graph notion. This is the last articles in this mini series, &lt;em&gt;Why is Graph Theory so amazing?&lt;/em&gt;, as next week we're going to explore another interesting computing subject!&lt;/p&gt;

&lt;h2&gt;
  
  
  Weighted Graphs
&lt;/h2&gt;

&lt;p&gt;A weighted graph is a graph in which each edge has an attached weight (or cost) to it. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmjfvrwpkgobaxkayvqwk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmjfvrwpkgobaxkayvqwk.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For example, the graph above, let it be called &lt;strong&gt;G&lt;/strong&gt;, represents a weighted graph: the edge &lt;strong&gt;(2, 5)&lt;/strong&gt; has weight 10, the edge &lt;strong&gt;(1,3)&lt;/strong&gt; has weight 15, and so on.&lt;/p&gt;

&lt;p&gt;From a practical standpoint, a weighted graph can mean a network of roads, with each edge cost representing the distance between two cities, or, when working with live data, a sort of time coefficient that can be affected by traffic events - accidents, maintenance work, and so on. &lt;/p&gt;

&lt;p&gt;Now, we can ask the following question: how can we go from a node &lt;strong&gt;i&lt;/strong&gt; to a node &lt;strong&gt;j&lt;/strong&gt; with minimal cost?&lt;/p&gt;

&lt;h2&gt;
  
  
  Computing The Shortest Path
&lt;/h2&gt;

&lt;p&gt;One popular algorithm to compute the shortest path from one node to another is &lt;strong&gt;Dijkstra's Algorithm&lt;/strong&gt;, named after the Dutch computer scientist Edsger Wybe Dijkstra. In an interview given in 2001, Dijkstra said that the algorithm was designed without pen &amp;amp; paper, while sitting in a cafe with his fiancée.&lt;br&gt;
The algorithm follows a simple and elegant approach. We can consider it a sibling of breadth-first search, of which I gave a more detailed overview in &lt;a href="https://dev.to/kruzzy/why-is-graph-theory-so-amazing-part-3-bfs-bipartite-graphs-2860"&gt;this article&lt;/a&gt;. The main difference is the order in which the algorithm visits the nodes: instead of expanding nodes in &lt;strong&gt;the order in which they were put in the queue, in a First-In, First-Out manner&lt;/strong&gt;, the algorithm remembers, for each node, the cost with which it has been reached, &lt;strong&gt;expanding nodes reached with a lower cost first&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Let the starting node be called &lt;strong&gt;start&lt;/strong&gt;, and the node to which we want to find the cost be called &lt;strong&gt;finish&lt;/strong&gt;. Let &lt;strong&gt;cost&lt;/strong&gt; be an array which contains the cost to reach each of the &lt;strong&gt;N&lt;/strong&gt; nodes in our graph. Formally, the algorithm should follow the next steps:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;mark all nodes as unvisited; moreover, mark the cost of node &lt;strong&gt;start&lt;/strong&gt; with 0 and the cost to get to any other node with a very high value; put &lt;strong&gt;start&lt;/strong&gt; in the queue.&lt;/li&gt;
&lt;li&gt;while there are nodes left in the queue, and we haven't found a cost for the &lt;strong&gt;finish&lt;/strong&gt; node:

&lt;ul&gt;
&lt;li&gt;pop a node from the queue, let it be called &lt;strong&gt;u&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;if the node has already been visited (i. e., we could get to it from multiple directions, so we put it in the queue multiple times, with &lt;strong&gt;different costs&lt;/strong&gt;), there is no need to visit it again, as we have already expanded the best way to reach it; continue the algorithm.&lt;/li&gt;
&lt;li&gt;if the node hasn't already been visited, search through its neighbours; for any neighbour &lt;strong&gt;i&lt;/strong&gt;, we check whether coming from &lt;strong&gt;u&lt;/strong&gt; can improve its respective cost (i. e., if the current cost to get to &lt;strong&gt;i&lt;/strong&gt; is greater than the cost to get to &lt;strong&gt;u&lt;/strong&gt; plus the cost of the edge &lt;strong&gt;(u, i)&lt;/strong&gt;); if so, put &lt;strong&gt;i&lt;/strong&gt; in the queue, with the associated cost. 

&lt;ul&gt;
&lt;li&gt;after looking at all the neighbours of &lt;strong&gt;u&lt;/strong&gt;, mark &lt;strong&gt;u&lt;/strong&gt; as visited, and go back to &lt;strong&gt;2.&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Now, let us work towards an implementation of Dijkstra's algorithm. &lt;/p&gt;
&lt;h2&gt;
  
  
  Representing Weighted Graphs in Memory
&lt;/h2&gt;

&lt;p&gt;You may remember the &lt;a href="https://github.com/KruZZy/magic-of-computing/blob/main/adjacency_list.py" rel="noopener noreferrer"&gt;adjacency_list.py&lt;/a&gt; file we wrote for the first article. In order for that graph "wrapping" to handle weighted graphs, we need to modify two fundamental things in its structure:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;For a node &lt;strong&gt;i&lt;/strong&gt;, every entry in its neighbour list should now remember a tuple: the node which it leads to and the cost of the edge.&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;is_valid_tuple&lt;/strong&gt; function, which we used for checking user input, should be modified to check only that the first two elements of the tuple are nodes; specifically, this line should be modified: 
&lt;div class="ltag_gist-liquid-tag"&gt;
  
&lt;/div&gt;

The length of the tuple should be modified to 3, and, as you can see, we used the Python built-in &lt;strong&gt;all()&lt;/strong&gt; function to check the nodes. We can modify our iterable, &lt;strong&gt;x&lt;/strong&gt;, to only yield the first two elements by changing it to &lt;strong&gt;x[0:2]&lt;/strong&gt;.
This Python syntax is called slicing. When used as &lt;strong&gt;[start:stop]&lt;/strong&gt;, it returns an iterable object containing the elements from &lt;strong&gt;index start&lt;/strong&gt; through &lt;strong&gt;index stop-1&lt;/strong&gt; of the object on which it is applied.
Finally, our line will become:
&lt;div class="ltag_gist-liquid-tag"&gt;
  
&lt;/div&gt;
&lt;/li&gt;
&lt;/ol&gt;
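&lt;p&gt;Since the original snippet is not reproduced here, a plausible reconstruction of the modified check could look like this (the exact shape of the function is an assumption):&lt;/p&gt;

```python
def is_valid_tuple(x, n):
    # a weighted edge is (node_a, node_b, weight): only the first two
    # entries are nodes, so we validate the slice x[0:2] with all()
    return len(x) == 3 and all(n >= v >= 1 for v in x[0:2])

print(is_valid_tuple((1, 4, 3), 5))   # a valid edge in a 5-node graph
print(is_valid_tuple((1, 6, 2), 5))   # rejected: node 6 does not exist
```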

&lt;p&gt;One other thing we have to take into account is the way in which we extract the lowest-cost candidate at each step of the algorithm (we consider a graph &lt;strong&gt;G&lt;/strong&gt; with &lt;strong&gt;N&lt;/strong&gt; nodes):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;We can use a regular array to construct the queue of our candidates, and look through all its entries every time - that's &lt;strong&gt;N&lt;/strong&gt; entries, for each candidate in our shortest path.&lt;/li&gt;
&lt;li&gt;We can optimise the runtime by using an appropriate data structure: a &lt;a href="https://en.wikipedia.org/wiki/Binary_heap" rel="noopener noreferrer"&gt;binary heap&lt;/a&gt;, with logarithmic minimum extraction and insertion - doing roughly &lt;strong&gt;log(N)&lt;/strong&gt; operations for each shortest path candidate.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Luckily, the Python standard library contains a &lt;a href="https://docs.python.org/3/library/queue.html#queue.PriorityQueue" rel="noopener noreferrer"&gt;PriorityQueue&lt;/a&gt; class, which satisfies our needs for implementing the faster Dijkstra variant, if we are not too interested in implementing such a data structure ourselves.&lt;/p&gt;

&lt;p&gt;The algorithm I have described above only provides us with the &lt;strong&gt;cost&lt;/strong&gt; of the path, not with the path itself. However, we can recover the path quite simply if, each time we find a better cost candidate for a node, we also remember its "parent": the node we expanded to get there. Then, we can use recursion to go from "parent" to "parent", and print the nodes we encounter after making the recursive calls.&lt;/p&gt;

&lt;p&gt;Below is the modified version of the graph wrapping, and the implementation of the algorithm:&lt;/p&gt;


&lt;div class="ltag_gist-liquid-tag"&gt;
  
&lt;/div&gt;



&lt;div class="ltag_gist-liquid-tag"&gt;
  
&lt;/div&gt;
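&lt;p&gt;Below is a compact sketch of the algorithm using the standard library's &lt;strong&gt;heapq&lt;/strong&gt; module (a lighter alternative to the PriorityQueue class mentioned above). The graph is a plain dict rather than the article's class; all edge weights except those of the edges (2, 5) and (1, 3), which are readable from the figure, are invented here so that the cheapest route from 1 to 3 costs 7 and passes through node 4.&lt;/p&gt;

```python
import heapq

def dijkstra(graph, start, finish):
    cost = {u: float("inf") for u in graph}
    parent = {start: None}
    cost[start] = 0
    visited = set()
    heap = [(0, start)]                  # (cost so far, node)
    while heap:
        c, u = heapq.heappop(heap)
        if u == finish:
            break                        # cheapest way to reach finish found
        if u in visited:
            continue                     # stale entry: expanded cheaper before
        visited.add(u)
        for v, w in graph[u]:
            if cost[v] > c + w:          # coming through u improves v
                cost[v] = c + w
                parent[v] = u
                heapq.heappush(heap, (cost[v], v))
    # walk the parent links backwards to rebuild the route
    path, node = [], finish
    while node is not None:
        path.append(node)
        node = parent.get(node)
    return cost[finish], path[::-1]

# weights of (2, 5) and (1, 3) are from the figure; the rest are assumed
graph = {
    1: [(2, 9), (3, 15), (4, 3)],
    2: [(1, 9), (5, 10)],
    3: [(1, 15), (4, 4)],
    4: [(1, 3), (3, 4)],
    5: [(2, 10)],
}
print(dijkstra(graph, 1, 3))   # (7, [1, 4, 3])
```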


&lt;p&gt;The snippet outputs the smallest path cost as 7, with the route 1 -&amp;gt; 4 -&amp;gt; 3 when we want to compute the cost between 1 and 3. &lt;br&gt;
Looking at the graph, we can see that this is indeed the answer.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;This week's article ends here. This is the last entry in this mini-series, &lt;em&gt;Why is Graph Theory so amazing?&lt;/em&gt;, during which I hope I offered a somewhat clear image of what graphs are and some fun ways to use them. Unfortunately, the subject is too broad to be treated in just a series of articles.&lt;br&gt;
Next week, we're going to present a completely new aspect of Computer Science. Until then, you can pass the time by reading another article in the series!&lt;/p&gt;

</description>
      <category>graphtheory</category>
      <category>computerscience</category>
      <category>algorithms</category>
      <category>python</category>
    </item>
    <item>
      <title>Why is Graph Theory so amazing? - part 3 - BFS, bipartite graphs</title>
      <dc:creator>Andrei Visoiu</dc:creator>
      <pubDate>Sat, 10 Apr 2021 13:48:27 +0000</pubDate>
      <link>https://dev.to/kruzzy/why-is-graph-theory-so-amazing-part-3-bfs-bipartite-graphs-2860</link>
      <guid>https://dev.to/kruzzy/why-is-graph-theory-so-amazing-part-3-bfs-bipartite-graphs-2860</guid>
      <description>&lt;p&gt;In the &lt;a href="https://dev.to/kruzzy/why-is-graph-theory-so-amazing-part-2-depth-first-search-topological-sorting-jkg"&gt;previous&lt;/a&gt; article, we looked into one type of graph traversal - depth first search, which works by starting from a root node, and visit every one of its neighbours recursively.&lt;/p&gt;

&lt;p&gt;Today, we will explore another type of graph traversal - breadth first search (or BFS, for short).&lt;/p&gt;

&lt;h2&gt;
  
  
  Breadth-First Search - Analysis &amp;amp; Why It Matters
&lt;/h2&gt;

&lt;p&gt;As opposed to its sibling, BFS is not a recursive algorithm - it will visit nodes in a First In, First Out manner, by using a queue data structure.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Queue Data Structure
&lt;/h3&gt;

&lt;p&gt;A queue data structure is a sequential collection of entities (we can visualise it as an array, for simplicity, although it can be implemented in more ways) that has 2 fundamental operations:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;append an entity to the end of the queue&lt;/li&gt;
&lt;li&gt;pop an entity from the start of the queue&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If the queue is represented properly in memory, both of these operations have an O(1) time complexity. Naturally, the space complexity of a queue is O(n). As the name suggests, it is very similar to a real-life queue.&lt;/p&gt;
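&lt;p&gt;In Python, the standard library's &lt;strong&gt;deque&lt;/strong&gt; collection gives us both operations in constant time; a tiny illustration:&lt;/p&gt;

```python
from collections import deque

q = deque()
q.append(1)          # enqueue at the back: O(1)
q.append(2)
q.append(3)
front = q.popleft()  # dequeue from the front: O(1)
print(front, list(q))   # 1 [2, 3]
```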

&lt;p&gt;Back to the algorithm: it works by first appending the root node to the queue and marking it as visited. Then, as long as there are nodes in the queue, we pop elements from the front. After popping a node, we put all of its unvisited neighbours in the queue, mark them as visited, and continue popping.&lt;/p&gt;

&lt;p&gt;By and large, the algorithm works by following these steps:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;let &lt;strong&gt;i&lt;/strong&gt; be the root node of the traversal; append &lt;strong&gt;i&lt;/strong&gt; to the queue and mark it as visited.&lt;/li&gt;
&lt;li&gt;while the queue is not empty:

&lt;ul&gt;
&lt;li&gt;pop the first element in the queue, let it be &lt;strong&gt;j&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;loop through the neighbours of &lt;strong&gt;j&lt;/strong&gt;; push each one that is not yet visited to the queue and mark it as visited (marking at push time ensures no node enters the queue twice).&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Analysing the time complexity of BFS yields the same results as for DFS. Let &lt;strong&gt;G&lt;/strong&gt; be a graph with &lt;strong&gt;N&lt;/strong&gt; nodes and &lt;strong&gt;E&lt;/strong&gt; edges.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;if &lt;strong&gt;G&lt;/strong&gt; is represented by using an &lt;a href="https://github.com/KruZZy/magic-of-computing/blob/main/adjacency_matrix.py" rel="noopener noreferrer"&gt;adjacency matrix&lt;/a&gt;, the time complexity of the algorithm is O(N*N)&lt;/li&gt;
&lt;li&gt;if &lt;strong&gt;G&lt;/strong&gt; is represented by using &lt;a href="https://github.com/KruZZy/magic-of-computing/blob/main/adjacency_list.py" rel="noopener noreferrer"&gt;adjacency lists&lt;/a&gt;, the complexity of the algorithm is O(N+E).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So, again, we see that each graph representation method has its place.&lt;/p&gt;

&lt;p&gt;Now, let's see a Python implementation of the BFS algorithm. We will be using the &lt;a href="https://docs.python.org/3/library/collections.html#collections.deque" rel="noopener noreferrer"&gt;deque&lt;/a&gt; collection from the standard Python library for constant-time operations on the queue, and &lt;a href="https://github.com/KruZZy/magic-of-computing/blob/main/adjacency_list.py" rel="noopener noreferrer"&gt;adjacency_list.py&lt;/a&gt; for the class definition, and we will run the algorithm on last week's graph.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnaxs72lvcufvsfknja97.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnaxs72lvcufvsfknja97.png" alt="alt text"&gt;&lt;/a&gt;&lt;/p&gt;


&lt;div class="ltag_gist-liquid-tag"&gt;
  
&lt;/div&gt;
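&lt;p&gt;A self-contained sketch of the traversal, using a plain dict in place of the article's adjacency list class; the edge set is taken from last week's four-node example:&lt;/p&gt;

```python
from collections import deque

def bfs(adj, root):
    # breadth-first traversal: returns the nodes in visiting order
    visited = {root}
    order = []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbour in adj[node]:
            if neighbour not in visited:
                visited.add(neighbour)     # mark when pushed, not when popped
                queue.append(neighbour)
    return order

adj = {1: [2, 3, 4], 2: [1, 4], 3: [1], 4: [1, 2]}
print(" ".join(str(n) for n in bfs(adj, 1)))   # 1 2 3 4
```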


&lt;p&gt;The output of the snippet above is "1 2 3 4".&lt;/p&gt;

&lt;h2&gt;
  
  
  Bipartite Graphs
&lt;/h2&gt;

&lt;p&gt;A bipartite graph, also called a bigraph, is a graph whose nodes can be divided into 2 disjoint sets, let them be &lt;strong&gt;U&lt;/strong&gt; and &lt;strong&gt;V&lt;/strong&gt;, so that every edge in the graph connects a node from &lt;strong&gt;U&lt;/strong&gt; to a node from &lt;strong&gt;V&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiykooz2x9eqt1y6gks1i.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiykooz2x9eqt1y6gks1i.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For example, the graph above is bipartite, with &lt;strong&gt;U&lt;/strong&gt; = &lt;strong&gt;(1, 2, 6)&lt;/strong&gt; and &lt;strong&gt;V&lt;/strong&gt; = &lt;strong&gt;(3, 4, 5, 7)&lt;/strong&gt;. Do note that a bipartite graph is not necessarily connected (there is not necessarily a path between any two nodes).&lt;/p&gt;

&lt;p&gt;Now, let's see how we can use BFS to determine whether a given graph is bipartite or not. For this, we will traverse the graph and construct &lt;strong&gt;U&lt;/strong&gt; and &lt;strong&gt;V&lt;/strong&gt; as we go. For simplicity, we can colour the nodes of &lt;strong&gt;U&lt;/strong&gt; red and the nodes of &lt;strong&gt;V&lt;/strong&gt; blue.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fenk1tx1ffwzo0wwli62v.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fenk1tx1ffwzo0wwli62v.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The colouring, and implicitly the set choices, are not unique, as we can either choose to put node 6 in U or in V.&lt;/p&gt;

&lt;p&gt;Let's see how we can determine if the first connected component is a bipartite graph. We can start a BFS from 1 and colour it red. Then, colour all of its neighbours blue. When looking at an arbitrary node, we colour its uncoloured neighbours in the opposite colour (i. e. red neighbours if the node is blue, and vice-versa); if we ever find a neighbour already coloured the same as the current node, the graph cannot be bipartite. Applying this method constructs the parts of &lt;strong&gt;U&lt;/strong&gt; and &lt;strong&gt;V&lt;/strong&gt; belonging to a single connected component of the graph.&lt;/p&gt;

&lt;p&gt;To extend this idea to multiple connected components, we have to run multiple BF searches - we have to loop through all the nodes of the graph and, if we find an uncolored one, we have to start a traversal from there. &lt;/p&gt;

&lt;p&gt;Let us see a Python snippet of how we can do this.&lt;/p&gt;


&lt;div class="ltag_gist-liquid-tag"&gt;
  
&lt;/div&gt;
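&lt;p&gt;A sketch of the idea follows. The exact edge list of the pictured graph is my assumption, chosen to be consistent with the sets &lt;strong&gt;U&lt;/strong&gt; and &lt;strong&gt;V&lt;/strong&gt; given above:&lt;/p&gt;

```python
from collections import deque

def bipartition(adj):
    # try to 2-colour the graph with BFS; 0 means red, 1 means blue
    colour = {}
    for start in adj:
        if start in colour:
            continue                       # component already coloured
        colour[start] = 0                  # colour each component's root red
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in colour:
                    colour[v] = 1 - colour[u]   # opposite colour
                    queue.append(v)
                elif colour[v] == colour[u]:
                    return False, [], []        # an edge inside one set
    red = sorted(u for u in adj if colour[u] == 0)
    blue = sorted(u for u in adj if colour[u] == 1)
    return True, red, blue

# hypothetical edges, consistent with U = (1, 2, 6), V = (3, 4, 5, 7)
adj = {1: [3, 4], 2: [4, 5], 3: [1], 4: [1, 2], 5: [2], 6: [7], 7: [6]}
print(bipartition(adj))   # (True, [1, 2, 6], [3, 4, 5, 7])
```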


&lt;p&gt;The snippet above outputs &lt;strong&gt;U&lt;/strong&gt; as &lt;strong&gt;(1, 2, 6)&lt;/strong&gt; and &lt;strong&gt;V&lt;/strong&gt; as &lt;strong&gt;(3, 4, 5, 7)&lt;/strong&gt;, and concludes that the graph in the example is indeed bipartite.&lt;/p&gt;

&lt;p&gt;Bipartite graphs can model a wide range of problems. One very practical example is class scheduling. If we take two sets, &lt;strong&gt;U&lt;/strong&gt;, made up of students, and &lt;strong&gt;V&lt;/strong&gt;, made up of the classes they can take, we can use this bipartite graph modelling to determine which classes should not happen at the same time.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;This week, we looked a little into what breadth-first search can do. However, its applications do not stop at bipartite graph checks - it can also be used to determine the shortest path from a node to all the others (which can be useful when trying to send packets through a network), or modified further to give the algorithm completely new behaviours.&lt;br&gt;
We will continue the series next week - until then, do delight yourself with some other articles in &lt;em&gt;The Magic of Computing&lt;/em&gt; series! Maybe you are into &lt;a href="https://dev.to/kruzzy/the-magic-of-the-fibonacci-numbers-why-we-love-computing-them-part-1-18gp"&gt;Fibonacci numbers&lt;/a&gt;, or you want to brush up your knowledge of &lt;a href="https://dev.to/kruzzy/prime-numbers-fast-and-slow-part-1-224f"&gt;prime numbers&lt;/a&gt;?&lt;/p&gt;

</description>
      <category>graphtheory</category>
      <category>computerscience</category>
      <category>algorithms</category>
      <category>python</category>
    </item>
    <item>
      <title>Why is Graph Theory so amazing? - part 2, depth first search &amp; topological sorting</title>
      <dc:creator>Andrei Visoiu</dc:creator>
      <pubDate>Thu, 01 Apr 2021 10:59:45 +0000</pubDate>
      <link>https://dev.to/kruzzy/why-is-graph-theory-so-amazing-part-2-depth-first-search-topological-sorting-jkg</link>
      <guid>https://dev.to/kruzzy/why-is-graph-theory-so-amazing-part-2-depth-first-search-topological-sorting-jkg</guid>
      <description>&lt;p&gt;In the &lt;a href="https://dev.to/kruzzy/why-is-graph-theory-so-amazing-part-1-5ii"&gt;previous&lt;/a&gt; article, we explored the definition of a graph and gave some brief examples of how they are represented in computer memory. Today's article will focus on some basic graph algorithms showcasing graph traversals. &lt;/p&gt;

&lt;p&gt;Graph traversal is the process of visiting each node of a graph. Such traversals are classified into two categories, by the order in which the nodes are visited: depth-first search (at which we will look today) and breadth-first search (for next week!). For the rest of the article, let &lt;strong&gt;G&lt;/strong&gt; be a graph with &lt;strong&gt;N&lt;/strong&gt; nodes and &lt;strong&gt;E&lt;/strong&gt; edges.&lt;/p&gt;

&lt;p&gt;Before talking about traversals themselves, there is some graph terminology we have to introduce.&lt;/p&gt;

&lt;h2&gt;
  
  
  The degree of a node
&lt;/h2&gt;

&lt;p&gt;Below, I will be using the notion of &lt;strong&gt;degree&lt;/strong&gt; when speaking of a node. The degree of a node represents the number of direct connections that node has to other nodes in the graph.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcvwh4ntawctq5va6ambj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcvwh4ntawctq5va6ambj.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For example, in the image above, the degree of node 1 is 3, the degree of nodes 2 and 4 is 2, and the degree of node 3 is 1.&lt;/p&gt;

&lt;p&gt;In an undirected graph, the sum of the degrees of all nodes equals twice the number of edges (as for any nodes &lt;strong&gt;i&lt;/strong&gt; and &lt;strong&gt;j&lt;/strong&gt; such that there is an edge between them, that edge is counted in both the degree of &lt;strong&gt;i&lt;/strong&gt; and the degree of &lt;strong&gt;j&lt;/strong&gt;).&lt;br&gt;
When speaking about directed graphs, for any node &lt;strong&gt;i&lt;/strong&gt; there exists an in-degree (the number of edges that lead to &lt;strong&gt;i&lt;/strong&gt;) and an out-degree (the number of edges that leave from &lt;strong&gt;i&lt;/strong&gt;).&lt;/p&gt;
&lt;h2&gt;
  
  
  Walks, trails, cycles in a graph.
&lt;/h2&gt;

&lt;p&gt;A &lt;strong&gt;walk&lt;/strong&gt; in a graph is a sequence of nodes in a graph such that for any two consecutive nodes, &lt;strong&gt;i&lt;/strong&gt; and &lt;strong&gt;j&lt;/strong&gt;, there exists the edge &lt;strong&gt;(i, j)&lt;/strong&gt;. This implies that nodes and edges can be repeated. For example, in the graph we presented above, 1 -&amp;gt; 3 -&amp;gt; 1 -&amp;gt; 2 -&amp;gt; 4 is a walk.&lt;br&gt;
A &lt;strong&gt;trail&lt;/strong&gt; in a graph is a walk with no repeated edges. For example, in the graph above, 1 -&amp;gt; 2 -&amp;gt; 4 or 2 -&amp;gt; 4 -&amp;gt; 1 -&amp;gt; 3 are trails.&lt;br&gt;
A &lt;strong&gt;cycle&lt;/strong&gt; in a graph is a trail in which only the first and last nodes are repeated. In our graph, 1 -&amp;gt; 2 -&amp;gt; 4 -&amp;gt; 1 is a cycle.&lt;/p&gt;
&lt;h2&gt;
  
  
  Depth-first search - analysis &amp;amp; why it matters
&lt;/h2&gt;

&lt;p&gt;The depth-first search algorithm follows a recursive approach and aims to traverse the graph in depth, by following these steps:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;start the traversal from an arbitrary node, let it be &lt;strong&gt;i&lt;/strong&gt;; mark &lt;strong&gt;i&lt;/strong&gt; as visited.&lt;/li&gt;
&lt;li&gt;search for the unvisited neighbours of &lt;strong&gt;i&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;if such a neighbour, let it be &lt;strong&gt;j&lt;/strong&gt;, is found, the algorithm makes a recursive call to &lt;strong&gt;step 1&lt;/strong&gt;, now starting from node &lt;strong&gt;j&lt;/strong&gt; instead of &lt;strong&gt;i&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;if there are no unvisited neighbours left, the call ends and the algorithm moves up the recursion stack.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Applying this recursive approach to a node &lt;strong&gt;i&lt;/strong&gt; (finding its neighbours and making correspondent recursive calls) can also be called, for short, &lt;strong&gt;expanding i&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Now, let's have a look at the time complexity of the DFS algorithm when representing &lt;strong&gt;G&lt;/strong&gt; using each of the methods showcased in the previous article:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;G&lt;/strong&gt; is represented by using an &lt;a href="https://github.com/KruZZy/magic-of-computing/blob/main/adjacency_matrix.py" rel="noopener noreferrer"&gt;adjacency matrix&lt;/a&gt;, let it be &lt;strong&gt;A&lt;/strong&gt; of size &lt;strong&gt;N&lt;/strong&gt; * &lt;strong&gt;N&lt;/strong&gt;. To find the neighbours of an arbitrary node &lt;strong&gt;i&lt;/strong&gt;, we have to loop through all &lt;strong&gt;N&lt;/strong&gt; columns of the &lt;strong&gt;i&lt;/strong&gt;-th row of the matrix. We have to do this exactly &lt;strong&gt;N&lt;/strong&gt; times, as we can't expand a node multiple times, and therefore the complexity of DFS while using an adjacency matrix is O(N*N).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;G&lt;/strong&gt; is represented by using &lt;a href="https://github.com/KruZZy/magic-of-computing/blob/main/adjacency_list.py" rel="noopener noreferrer"&gt;adjacency lists&lt;/a&gt;, let &lt;strong&gt;A[i]&lt;/strong&gt; denote the list of the neighbours of &lt;strong&gt;i&lt;/strong&gt;. Then, for each node &lt;strong&gt;i&lt;/strong&gt;, we have to loop through its neighbours - equal in number to the degree of node &lt;strong&gt;i&lt;/strong&gt;. Since the degrees sum to twice the number of edges, adding up these operations results in O(N+E) complexity.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;As we said in the previous article - for a dense graph (that is, when E approaches N*N) the adjacency matrix approach would prove more effective as it would use considerably less memory, while having the same asymptotic complexity - when E approaches N*N, O(N+E) is O(N*N).&lt;/p&gt;

&lt;p&gt;Let us now see an implementation of DFS, using a graph represented by adjacency lists. To avoid duplicate code, during this series, I will use the &lt;a href="https://github.com/KruZZy/magic-of-computing/blob/main/adjacency_list.py" rel="noopener noreferrer"&gt;adjacency_list.py&lt;/a&gt; I wrote for the first article, which includes a basic class definition for managing graphs represented by adjacency lists.&lt;/p&gt;


&lt;div class="ltag_gist-liquid-tag"&gt;
  
&lt;/div&gt;
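&lt;p&gt;The steps above can be sketched in plain Python as follows; a dictionary of neighbour lists stands in for the adjacency_list.py class, and the edges are those of the example graph:&lt;/p&gt;

```python
# Example graph: edges (1,2), (1,3), (1,4), (2,4), stored as adjacency lists.
adj = {1: [2, 3, 4], 2: [1, 4], 3: [1], 4: [1, 2]}

visited = set()
order = []

def dfs(i):
    visited.add(i)       # step 1: mark i as visited
    order.append(i)
    for j in adj[i]:     # step 2: look through the neighbours of i
        if j not in visited:
            dfs(j)       # step 3: recurse from the unvisited neighbour j

dfs(1)
print(" ".join(map(str, order)))  # 1 2 4 3
```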


&lt;p&gt;This snippet runs on the graph used above when explaining node degree and its output is "1 2 4 3". &lt;br&gt;
The algorithm starts from 1, searching for its neighbours (which are 2, 3, 4). The first one found is 2, and then the algorithm searches for 2's unvisited neighbours, finding 4. Node 4 has no unvisited neighbours left, and the algorithm moves up the stack. Same goes for 2. Node 3 is the last of node 1's unvisited neighbours, resulting in the output.&lt;/p&gt;
&lt;h2&gt;
  
  
  Topological sorting
&lt;/h2&gt;

&lt;p&gt;If &lt;strong&gt;G&lt;/strong&gt; is a directed graph with no cycles (also called a directed acyclic graph, or DAG for short), then the topological sorting of the nodes of graph &lt;strong&gt;G&lt;/strong&gt; is a linear ordering of nodes such that for any edge &lt;strong&gt;(i, j)&lt;/strong&gt;, &lt;strong&gt;i&lt;/strong&gt; comes before &lt;strong&gt;j&lt;/strong&gt; in the ordering.&lt;br&gt;
This sorting has a wide range of real world uses, most notably helping organize a schedule of tasks based on their dependencies.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw86bslhpshielbrujovd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw86bslhpshielbrujovd.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For example, the graph above can represent dependencies between tasks (e.g., task 2 should be started only after finishing task 1). By finding a topological sorting of the graph, we can determine an order in which to start the tasks so that there are no bottlenecks.&lt;/p&gt;

&lt;p&gt;A simple way to do this is to first find a node with in-degree 0 (in our case, &lt;strong&gt;node 1&lt;/strong&gt;), and start a depth-first search with it as the root. Instead of outputting each node right away, we push it onto a stack &lt;strong&gt;only after all of its neighbours have been visited&lt;/strong&gt;. Then, the reversed stack holds a topological sorting of our graph, as the last node pushed to the stack is the root.&lt;/p&gt;

&lt;p&gt;For implementing topological sorting, I have modified the &lt;strong&gt;graph&lt;/strong&gt; class we used before so that it handles a directed graph, and also calculated the in-degree of each node at initialisation:&lt;/p&gt;


&lt;div class="ltag_gist-liquid-tag"&gt;
  
&lt;/div&gt;


&lt;p&gt;Then, I implemented the modified DFS using the modified version of the graph manager class as the parent: &lt;/p&gt;


&lt;div class="ltag_gist-liquid-tag"&gt;
  
&lt;/div&gt;
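&lt;p&gt;The approach can also be sketched in a few self-contained Python lines; the edge set 1 -&amp;gt; 2, 1 -&amp;gt; 4, 2 -&amp;gt; 3, 3 -&amp;gt; 5 is an assumption standing in for the pictured graph:&lt;/p&gt;

```python
# Assumed directed acyclic graph: 1->2, 1->4, 2->3, 3->5.
adj = {1: [2, 4], 2: [3], 3: [5], 4: [], 5: []}

visited = set()
stack = []

def dfs(i):
    visited.add(i)
    for j in adj[i]:
        if j not in visited:
            dfs(j)
    stack.append(i)   # push i only after all of its neighbours are done

dfs(1)                # node 1 has in-degree 0, so it is our root
topo = list(reversed(stack))
print(" ".join(map(str, topo)))  # 1 4 2 3 5
```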


&lt;p&gt;For the directed graph above, the output of the snippet is "1 4 2 3 5", an ordering that satisfies the conditions of our definition. Note, though, that a topological sorting of a graph is not necessarily unique. For our example, "1 2 3 5 4" is also a correct ordering.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;This was a brief introduction to depth-first search and one of its real-world applications. Next week, we will be talking about the other graph traversal - breadth-first search (or BFS for short) - and we will also showcase some of its powers.&lt;br&gt;
Until then, stay tuned, and why not delight yourself with some of the other articles in &lt;strong&gt;The Magic of Computing&lt;/strong&gt; series? The part about the &lt;a href="https://dev.to/kruzzy/the-magic-of-the-fibonacci-numbers-why-we-love-computing-them-part-1-18gp"&gt;Fibonacci numbers&lt;/a&gt; is quite a hit!&lt;/p&gt;

</description>
      <category>graphtheory</category>
      <category>computerscience</category>
      <category>algorithms</category>
      <category>python</category>
    </item>
    <item>
      <title>Why is Graph Theory so amazing? - part 1</title>
      <dc:creator>Andrei Visoiu</dc:creator>
      <pubDate>Thu, 25 Mar 2021 12:16:07 +0000</pubDate>
      <link>https://dev.to/kruzzy/why-is-graph-theory-so-amazing-part-1-5ii</link>
      <guid>https://dev.to/kruzzy/why-is-graph-theory-so-amazing-part-1-5ii</guid>
<description>&lt;p&gt;Graph Theory has a special influence on our daily lives. Unbeknownst to most people, many aspects of our day-to-day life are modelled by graphs: the GPS we use every day, our Facebook friend suggestions, and even the web and the operating systems we use.&lt;br&gt;
How can a concept that looks so simple - models describing relations between objects - become so powerful? &lt;br&gt;
We are going to try and present some of the reasons behind this and explain some very interesting properties and facts about Graph Theory during a new entry in The Magic of Computing series - "Why is Graph Theory so amazing?"&lt;/p&gt;
&lt;h2&gt;
  
  
  A little bit of history
&lt;/h2&gt;

&lt;p&gt;The idea of a graph was first introduced by a very influential Swiss mathematician by the name of &lt;strong&gt;Leonhard Euler&lt;/strong&gt;. During the first half of the 18th century, Euler made attempts to solve the famous Königsberg bridge problem (the city is now in Russia and is called Kaliningrad), eventually succeeding in 1735.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.britannica.com%2F77%2F74877-050-F5DD4C34%2FLeonhard-Euler-route-each-question-bridges-Swiss.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.britannica.com%2F77%2F74877-050-F5DD4C34%2FLeonhard-Euler-route-each-question-bridges-Swiss.jpg" alt="bridges"&gt;&lt;/a&gt;&lt;br&gt;
(image courtesy of &lt;a href="https://www.britannica.com/" rel="noopener noreferrer"&gt;Encyclopedia Britannica, Inc&lt;/a&gt;)&lt;/p&gt;

&lt;p&gt;In the photo above we can see a visual representation of how the bridges the problem speaks about were laid out. The question was whether a citizen could cross the bridges in such a way that each bridge was crossed exactly once. Nowadays, this kind of traversal is called an Eulerian path.&lt;/p&gt;

&lt;p&gt;Euler demonstrated that such a path cannot be achieved in this setting. He first supposed that a path existed. During a traversal, each time a citizen enters a landmass, apart from the start and finish landmasses, two bridges must be accounted for - the one the citizen just crossed, and another one to take them to the next landmass - so every landmass other than the start and finish must touch an even number of bridges. However, the picture shows that every landmass touches an odd number of bridges, more than the two odd landmasses a path allows. Therefore, such a traversal is impossible in this setting.&lt;/p&gt;
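&lt;p&gt;Euler's parity argument can be verified in a few lines of Python. The sketch below counts the bridges touching each landmass (the island, the two banks and the eastern landmass, labelled A-D here purely for illustration):&lt;/p&gt;

```python
# Königsberg's seven bridges as an edge list between landmasses:
# the island (A), north bank (B), south bank (C) and east bank (D).
bridges = [("A", "B"), ("A", "B"), ("A", "C"), ("A", "C"),
           ("A", "D"), ("B", "D"), ("C", "D")]

degree = {}
for a, b in bridges:
    degree[a] = degree.get(a, 0) + 1
    degree[b] = degree.get(b, 0) + 1

# An Eulerian path exists only when exactly 0 or 2 landmasses
# touch an odd number of bridges.
odd = [land for land, d in degree.items() if d % 2 == 1]
print(degree)              # {'A': 5, 'B': 3, 'C': 3, 'D': 3}
print(len(odd) in (0, 2))  # False: all four are odd, so no Eulerian path
```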

&lt;p&gt;Although Euler was one of the first to unknowingly experiment with what was to become graph theory, graphs were only formally defined after 1870. In the formal definition, a graph is described by two sets: one set, called &lt;strong&gt;V&lt;/strong&gt;, containing vertices (or nodes), and &lt;strong&gt;E&lt;/strong&gt;, a set of edges - pairs of vertices which can be unordered (resulting in an undirected graph) or ordered (resulting in a directed graph). &lt;/p&gt;
&lt;h2&gt;
  
  
  Graphs in the modern context
&lt;/h2&gt;

&lt;p&gt;Graph theory became exponentially more useful with the release of consumer computers. The concept is simple enough to represent easily in code and memory, and people have come up with various methods that aim to optimise how graphs are handled. Although the other entries in the series contain C++ snippets, handling graphs in C++ needs more code, which I think would make the article harder to read, so here we will use Python.&lt;/p&gt;

&lt;p&gt;Let &lt;strong&gt;G&lt;/strong&gt; be a graph with &lt;strong&gt;N&lt;/strong&gt; nodes and &lt;strong&gt;E&lt;/strong&gt; edges. Below, we will showcase two different methods to create a Python class to handle this graph, using two different forms of graph representation. Both representations will be of undirected graphs.&lt;/p&gt;
&lt;h3&gt;
  
  
  Adjacency matrix
&lt;/h3&gt;

&lt;p&gt;A handy way to represent &lt;strong&gt;G&lt;/strong&gt; is by using a square boolean matrix of size &lt;strong&gt;N&lt;/strong&gt; * &lt;strong&gt;N&lt;/strong&gt;. Let this matrix be called &lt;strong&gt;A&lt;/strong&gt;. Then, &lt;strong&gt;A[i][j]&lt;/strong&gt; will be true if there is an edge between nodes &lt;em&gt;i&lt;/em&gt; and &lt;em&gt;j&lt;/em&gt; and false otherwise. In an undirected graph, matrix &lt;strong&gt;A&lt;/strong&gt; is symmetric, as the edge from &lt;strong&gt;i&lt;/strong&gt; to &lt;strong&gt;j&lt;/strong&gt; is bidirectional, so &lt;strong&gt;A[i][j]&lt;/strong&gt; = &lt;strong&gt;A[j][i]&lt;/strong&gt; for any &lt;strong&gt;i&lt;/strong&gt; and &lt;strong&gt;j&lt;/strong&gt;.&lt;/p&gt;


&lt;div class="ltag_gist-liquid-tag"&gt;
  
&lt;/div&gt;
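&lt;p&gt;A minimal Python sketch of such a class (the class and method names are illustrative, not the exact code from the linked file):&lt;/p&gt;

```python
class AdjacencyMatrixGraph:
    """Undirected graph stored as an N * N boolean matrix."""

    def __init__(self, n):
        self.n = n
        self.a = [[False] * n for _ in range(n)]

    def add_edge(self, i, j):
        # keep the matrix symmetric: A[i][j] == A[j][i]
        self.a[i][j] = True
        self.a[j][i] = True

    def has_edge(self, i, j):
        return self.a[i][j]   # constant-time lookup

g = AdjacencyMatrixGraph(5)
g.add_edge(1, 2)
print(g.has_edge(2, 1))  # True
print(g.has_edge(1, 3))  # False
```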


&lt;p&gt;One advantage of this is that determining whether an edge &lt;strong&gt;(i, j)&lt;/strong&gt; is in the graph can be done in constant time. However, if we aim, for example, to store a graph with many nodes and a relatively small number of edges (such graphs are also called sparse graphs), this method might prove inefficient from a memory standpoint. That leads us to another popular method of representing graphs.&lt;/p&gt;

&lt;h3&gt;
  
  
  Adjacency lists
&lt;/h3&gt;

&lt;p&gt;This method aims to represent a graph by using &lt;em&gt;N&lt;/em&gt; lists, enumerating the neighbours of each node. &lt;/p&gt;


&lt;div class="ltag_gist-liquid-tag"&gt;
  
&lt;/div&gt;
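&lt;p&gt;A minimal Python sketch of the adjacency-list variant (again, the names are illustrative rather than the exact gist code):&lt;/p&gt;

```python
class AdjacencyListGraph:
    """Undirected graph stored as one list of neighbours per node."""

    def __init__(self, n):
        self.n = n
        self.a = [[] for _ in range(n)]

    def add_edge(self, i, j):
        # record the edge on both endpoints
        self.a[i].append(j)
        self.a[j].append(i)

    def neighbours(self, i):
        return self.a[i]       # iterating costs O(degree(i))

    def has_edge(self, i, j):
        return j in self.a[i]  # linear in the degree of i

g = AdjacencyListGraph(5)
g.add_edge(1, 2)
g.add_edge(1, 3)
print(g.neighbours(1))   # [2, 3]
print(g.has_edge(3, 1))  # True
```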


&lt;p&gt;While with this method checking whether a given edge exists is no longer constant time (it is linear in the degree of the node, so O(N) in the worst case), looping through all the neighbours of a node is much more time-efficient in sparse graphs, and there are other benefits which we will see in future articles.&lt;/p&gt;

&lt;h3&gt;
  
  
  Other methods
&lt;/h3&gt;

&lt;p&gt;Alternatively, graphs can also be represented by using a list of edges or a so-called incidence matrix.&lt;br&gt;
The incidence matrix is a boolean &lt;strong&gt;E&lt;/strong&gt; * &lt;strong&gt;N&lt;/strong&gt; matrix, with one row for each edge. On each row, the columns representing the two extremities of that edge are marked with 1.&lt;/p&gt;
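&lt;p&gt;As a short illustration, an incidence matrix can be built like this in Python (the three-node edge list is an arbitrary example):&lt;/p&gt;

```python
# An assumed undirected graph on nodes 0..2 with edges (0,1), (0,2), (1,2):
# one row per edge, one column per node.
n = 3
edges = [(0, 1), (0, 2), (1, 2)]

incidence = [[0] * n for _ in edges]
for row, (i, j) in enumerate(edges):
    incidence[row][i] = 1   # mark the two extremities of the edge
    incidence[row][j] = 1

for row in incidence:
    print(row)
```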

&lt;p&gt;All methods have their own advantages and disadvantages as we shall see in other articles.&lt;/p&gt;

&lt;p&gt;In the next article, we will delve deeper into graph theory, showing some interesting graph-related algorithms and their applications. Until then, be sure to check out other articles in &lt;em&gt;The Magic of Computing&lt;/em&gt;!&lt;/p&gt;

</description>
      <category>graphtheory</category>
      <category>computerscience</category>
      <category>algorithms</category>
      <category>python</category>
    </item>
    <item>
      <title>Prime numbers: Fast and Slow - part 3</title>
      <dc:creator>Andrei Visoiu</dc:creator>
      <pubDate>Tue, 19 Jan 2021 19:30:32 +0000</pubDate>
      <link>https://dev.to/kruzzy/prime-numbers-fast-and-slow-part-3-2dgm</link>
      <guid>https://dev.to/kruzzy/prime-numbers-fast-and-slow-part-3-2dgm</guid>
<description>&lt;p&gt;Hello everyone! I am back with another entry in my &lt;em&gt;Magic of Computing&lt;/em&gt; series. Today, we are going to have one last look at prime numbers. During &lt;a href="https://dev.to/kruzzy/prime-numbers-fast-and-slow-part-2-425o"&gt;the previous article&lt;/a&gt;, we learned about an influential figure in the world of science, Eratosthenes of Cyrene, and today we are going to see how a prime factorisation algorithm performs, use some of Eratosthenes' findings to improve it, and explain why we care about prime factorisation in the first place.&lt;/p&gt;

&lt;h1&gt;
  
  
  What is prime factorisation?
&lt;/h1&gt;

&lt;p&gt;We know that all non-prime (or composite) numbers are products of primes. The prime factorisation of a number &lt;em&gt;n&lt;/em&gt; is finding which primes multiply together to make up &lt;em&gt;n&lt;/em&gt;, and how many times each appears. &lt;br&gt;
For example:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--A-aEBv-V--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/93sxd6paziul1pwyyp76.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--A-aEBv-V--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/93sxd6paziul1pwyyp76.png" alt="pr_fact_180"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h1&gt;
  
  
  Computing a prime factorisation
&lt;/h1&gt;

&lt;p&gt;We can compute the prime factorisation of a number using a fairly straightforward approach. Let &lt;em&gt;n&lt;/em&gt; be the number of which we want to compute the prime factorisation. All we have to do is search all its possible divisors, much like the approach I presented during the &lt;a href="https://dev.to/kruzzy/prime-numbers-fast-and-slow-part-1-224f"&gt;first article about prime numbers&lt;/a&gt;, with a slight modification: when finding a divisor, we divide &lt;em&gt;n&lt;/em&gt; by it as many times as possible. This guarantees that we only ever divide &lt;em&gt;n&lt;/em&gt; by its prime factors: suppose the divisor &lt;em&gt;d&lt;/em&gt; we just found were composite; then &lt;em&gt;d&lt;/em&gt; would have a prime factor smaller than &lt;em&gt;d&lt;/em&gt; that also divides &lt;em&gt;n&lt;/em&gt; - but every such smaller factor was already divided out at an earlier step, so &lt;em&gt;d&lt;/em&gt; could not divide the current value of &lt;em&gt;n&lt;/em&gt;, a contradiction. Therefore &lt;em&gt;d&lt;/em&gt; must be prime.&lt;/p&gt;

&lt;p&gt;We can skip even numbers as they will all be multiples of 2, which we will test separately, resulting in the following implementation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;&lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;primeFactorisation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="c1"&gt;/// suppose n is &amp;gt;= 0&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;cout&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="sc"&gt;' '&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="c1"&gt;/// firstly, we will test n separately.&lt;/span&gt;
        &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;/// p will store the power of each number in the prime factorisation of n.&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;while&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="o"&gt;/=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
                &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="n"&gt;cout&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="s"&gt;"2^"&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="s"&gt;"; "&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="c1"&gt;/// now we will search the possible divisors of n for prime factors.&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
                &lt;span class="k"&gt;while&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="o"&gt;/=&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
                    &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
                &lt;span class="p"&gt;}&lt;/span&gt;
                &lt;span class="n"&gt;cout&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="s"&gt;"^"&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="s"&gt;"; "&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;It is obvious that, for large numbers, a vast amount of non-prime numbers would be tested as well. If we were to compute multiple queries asking for the prime factorisation of &lt;em&gt;n&lt;/em&gt;, we could use &lt;em&gt;The Sieve of Eratosthenes&lt;/em&gt;, described in &lt;a href="https://dev.to/kruzzy/prime-numbers-fast-and-slow-part-2-425o"&gt;the previous article&lt;/a&gt; to precompute the primes up to a given limit and then, instead of searching for the possible divisors of &lt;em&gt;n&lt;/em&gt; for the prime factors, we would only search for divisors of &lt;em&gt;n&lt;/em&gt; across the precomputed prime numbers. &lt;/p&gt;
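&lt;p&gt;The sieve-accelerated idea can be sketched as follows - the snippet is in Python rather than C++ to keep it short, and factorising 180 matches the example pictured earlier:&lt;/p&gt;

```python
def primes_up_to(limit):
    """Sieve of Eratosthenes: return the primes not exceeding limit."""
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    for i in range(2, limit + 1):
        if is_prime[i]:
            for j in range(2 * i, limit + 1, i):
                is_prime[j] = False   # multiples of a prime are composite
    return [i for i in range(2, limit + 1) if is_prime[i]]

def factorise(n, primes):
    """Factor n using a precomputed prime list; returns {prime: power}."""
    factors = {}
    for p in primes:
        if p * p > n:   # whatever remains of n is prime (or 1)
            break
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

primes = primes_up_to(1000)
print(factorise(180, primes))  # {2: 2, 3: 2, 5: 1}
```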
&lt;h1&gt;
  
  
  Why computing prime factorisations interests us
&lt;/h1&gt;

&lt;p&gt;In my &lt;a href="https://dev.to/kruzzy/prime-numbers-fast-and-slow-part-1-224f"&gt;first article about primes&lt;/a&gt; I mentioned that &lt;a href="https://en.wikipedia.org/wiki/RSA_(cryptosystem)"&gt;RSA&lt;/a&gt;, the most common encryption tool today, makes extensive use of prime factorisation: its key is built from the product of two big prime numbers. In theory, you can crack any RSA key by running a modified version of the algorithm I wrote above. In practice, you do not have the time to do that. The time taken to compute the prime factorisation of a sufficiently large number grows superpolynomially, and there is no known polynomial-time algorithm (running on a classical computer) to compute it. &lt;br&gt;
The largest RSA challenge number to be factored is &lt;a href="https://en.wikipedia.org/wiki/RSA_numbers#RSA-240"&gt;RSA-240, in November 2019&lt;/a&gt;. For reference, that computation amounted to roughly 900 core-years on 2.1 GHz Intel Xeon Gold 6130 CPUs.&lt;/p&gt;

&lt;p&gt;That's it for &lt;em&gt;Primes: Fast and Slow&lt;/em&gt;. I will be updating this series of articles (&lt;em&gt;and some others!&lt;/em&gt;) with various computer science stuff regularly, so feel free to follow me if you haven't already to keep in touch with it.&lt;br&gt;
If you liked this article, you may like other articles I have written:&lt;/p&gt;


&lt;div class="ltag__link"&gt;
  &lt;a href="/kruzzy" class="ltag__link__link"&gt;
    &lt;div class="ltag__link__pic"&gt;
      &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--6eISi-2V--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://res.cloudinary.com/practicaldev/image/fetch/s--JkU_qeG6--/c_fill%2Cf_auto%2Cfl_progressive%2Ch_150%2Cq_auto%2Cw_150/https://dev-to-uploads.s3.amazonaws.com/uploads/user/profile_image/265486/0f8109c0-8afd-49b6-9584-a3165804115d.jpeg" alt="kruzzy"&gt;
    &lt;/div&gt;
  &lt;/a&gt;
  &lt;a href="/kruzzy/divide-pizzas-with-a-greedy-approach-python-2cnh" class="ltag__link__link"&gt;
    &lt;div class="ltag__link__content"&gt;
      &lt;h2&gt;Divide pizzas with a Greedy approach &amp;amp; Python&lt;/h2&gt;
      &lt;h3&gt;Andrei Visoiu ・ Nov 5 '19 ・ 2 min read&lt;/h3&gt;
      &lt;div class="ltag__link__taglist"&gt;
        &lt;span class="ltag__link__tag"&gt;#computerscience&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#python&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#algorithms&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#greedy&lt;/span&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/a&gt;
&lt;/div&gt;




&lt;div class="ltag__link"&gt;
  &lt;a href="/kruzzy" class="ltag__link__link"&gt;
    &lt;div class="ltag__link__pic"&gt;
      &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--6eISi-2V--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://res.cloudinary.com/practicaldev/image/fetch/s--JkU_qeG6--/c_fill%2Cf_auto%2Cfl_progressive%2Ch_150%2Cq_auto%2Cw_150/https://dev-to-uploads.s3.amazonaws.com/uploads/user/profile_image/265486/0f8109c0-8afd-49b6-9584-a3165804115d.jpeg" alt="kruzzy"&gt;
    &lt;/div&gt;
  &lt;/a&gt;
  &lt;a href="/kruzzy/bookboom-1-ai-to-the-mainstream-machines-that-think-instant-expert-by-newscientist-4h79" class="ltag__link__link"&gt;
    &lt;div class="ltag__link__content"&gt;
      &lt;h2&gt;BookBoom 1: AI to the mainstream - "Machines that think" (Instant Expert by NewScientist)&lt;/h2&gt;
      &lt;h3&gt;Andrei Visoiu ・ Feb 8 '20 ・ 3 min read&lt;/h3&gt;
      &lt;div class="ltag__link__taglist"&gt;
        &lt;span class="ltag__link__tag"&gt;#artificialintelligence&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#ai&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#machinelearning&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#books&lt;/span&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/a&gt;
&lt;/div&gt;


</description>
      <category>computerscience</category>
      <category>cpp</category>
      <category>mathematics</category>
      <category>algorithms</category>
    </item>
    <item>
      <title>Prime Numbers: Fast and Slow - part 2 </title>
      <dc:creator>Andrei Visoiu</dc:creator>
      <pubDate>Sun, 10 Jan 2021 15:02:54 +0000</pubDate>
      <link>https://dev.to/kruzzy/prime-numbers-fast-and-slow-part-2-425o</link>
      <guid>https://dev.to/kruzzy/prime-numbers-fast-and-slow-part-2-425o</guid>
<description>&lt;p&gt;Hello everyone! It has been a while since I have written a blog post on dev.to - I lacked the motivation to do it for quite some time, but now I intend to pick up where I left off with my series, &lt;em&gt;The Magic of Computing&lt;/em&gt;, more specifically prime numbers. During &lt;a href="https://dev.to/kruzzy/prime-numbers-fast-and-slow-part-1-224f"&gt;the previous article&lt;/a&gt;, we explored what prime numbers are, some of their basic uses in computer science and some basic algorithms related to them. This article will focus on generating lists of prime numbers. &lt;/p&gt;

&lt;h1&gt;
  
  
  An influential figure - Eratosthenes of Cyrene
&lt;/h1&gt;

&lt;p&gt;Living for the most part of the 3rd century BC, Eratosthenes of Cyrene was a mathematician who is now considered the founder of geography. He was the chief librarian at the Library of Alexandria and the first person to produce an astonishingly accurate calculation of Earth's circumference, around 240 BC, by comparing the positions of the Sun's rays in two locations. By modern estimates, his calculation was only a few percent off. More information on this can be found &lt;a href="https://www.khanacademy.org/humanities/big-history-project/solar-system-and-earth/knowing-solar-system-earth/a/eratosthenes-of-cyrene"&gt;here&lt;/a&gt;. &lt;br&gt;
Apart from measuring the Earth, he also introduced an algorithm now considered ancient, but which is still used nowadays in optimised variations. It is called &lt;em&gt;The Sieve of Eratosthenes&lt;/em&gt; and is used for finding prime numbers up to any given limit.&lt;/p&gt;
&lt;h1&gt;
  
  
  The Sieve of Eratosthenes - a simple, yet ingenious approach to generate prime numbers
&lt;/h1&gt;

&lt;p&gt;The algorithm is based on a very simple idea - a number is prime exactly when it is not a multiple of any prime before it. Let the given limit for our algorithm be called &lt;em&gt;n&lt;/em&gt;. Firstly, it considers all numbers, starting from 2, to be prime. Then, incrementally, for every number &lt;em&gt;i&lt;/em&gt; from 2 up to &lt;em&gt;n&lt;/em&gt;, it checks whether &lt;em&gt;i&lt;/em&gt; is still marked as prime:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;if &lt;em&gt;i&lt;/em&gt; is prime, the algorithm marks all the multiples of &lt;em&gt;i&lt;/em&gt; lower than or equal to &lt;em&gt;n&lt;/em&gt; as non-prime, or composite.&lt;/li&gt;
&lt;li&gt;if &lt;em&gt;i&lt;/em&gt; is not prime, the algorithm continues.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The algorithm ends when &lt;em&gt;i&lt;/em&gt; has reached &lt;em&gt;n&lt;/em&gt;.&lt;/p&gt;
&lt;h1&gt;
  
  
  Straightforward implementation
&lt;/h1&gt;

&lt;p&gt;Below, you can find a basic C++ implementation of the algorithm, using it to print prime numbers up to a number &lt;em&gt;n&lt;/em&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;&lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;sieve&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kt"&gt;bool&lt;/span&gt; &lt;span class="n"&gt;prime&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt; &lt;span class="c1"&gt;/// we need the numbers between 1 and n&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;prime&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;/// we consider all number from 0 up to n to be prime&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prime&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
            &lt;span class="k"&gt;for&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;/// we take all the multiples of i up to n and mark them as non-rpime&lt;/span&gt;
                &lt;span class="n"&gt;prime&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="c1"&gt;/// while printing, we don't consider 0 and 1 as primes.&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prime&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
            &lt;span class="n"&gt;cout&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="sc"&gt;' '&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h1&gt;
  
  
  Optimisations
&lt;/h1&gt;

&lt;p&gt;An immediate optimisation comes from an observation also made in the previous article: the marking loop only needs to run while i*i &amp;lt;= &lt;em&gt;n&lt;/em&gt;, because every composite number up to &lt;em&gt;n&lt;/em&gt; has a prime factor no greater than its square root. When marking multiples, it is also worth observing that all the multiples of &lt;em&gt;i&lt;/em&gt; smaller than i*i have a prime factor smaller than &lt;em&gt;i&lt;/em&gt; and have therefore already been marked, so we can start marking from i*i.&lt;br&gt;
The implementation with the basic optimisations looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;&lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;sieve&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kt"&gt;bool&lt;/span&gt; &lt;span class="n"&gt;prime&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt; &lt;span class="c1"&gt;/// we need the numbers between 1 and n&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;prime&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;/// we consider all number from 0 up to n to be prime&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prime&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
            &lt;span class="k"&gt;for&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;/// we take all the multiples of i up to n and mark them as non-rpime&lt;/span&gt;
                &lt;span class="n"&gt;prime&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="c1"&gt;/// while printing, we don't consider 0 and 1 as primes.&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prime&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
            &lt;span class="n"&gt;cout&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="sc"&gt;' '&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When dealing with a big &lt;em&gt;n&lt;/em&gt;, the sieve can be improved further by only checking odd numbers and by storing each primality flag in a single bit (for example, by using &lt;em&gt;std::bitset&lt;/em&gt;). A great article on these optimisations can be found &lt;a href="https://cp-algorithms.com/algebra/sieve-of-eratosthenes.html"&gt;here&lt;/a&gt;.&lt;br&gt;
The usage of sieves such as the Sieve of Eratosthenes has grown into an entire branch of number theory called &lt;a href="https://en.wikipedia.org/wiki/Sieve_theory"&gt;sieve theory&lt;/a&gt;. &lt;/p&gt;
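&lt;p&gt;As a sketch of the odd-only idea (the function name &lt;em&gt;oddSieve&lt;/em&gt; is mine, not a library routine), storing one flag per odd number halves the memory and skips every even candidate:&lt;/p&gt;

```cpp
#include <cassert>
#include <vector>

// Memory-lean sieve sketch: only odd numbers get a flag, one bit each via
// std::vector<bool>. Index i of `composite` represents the odd number 2*i + 1.
std::vector<int> oddSieve(int n) {
    std::vector<int> primes;
    if (n >= 2) primes.push_back(2);          // 2 is the only even prime
    std::vector<bool> composite((n + 1) / 2, false);
    for (int i = 3; (long long)i * i <= n; i += 2)
        if (!composite[i / 2])                // i is prime
            for (long long j = (long long)i * i; j <= n; j += 2LL * i)
                composite[j / 2] = true;      // mark odd multiples of i
    for (int i = 3; i <= n; i += 2)
        if (!composite[i / 2])
            primes.push_back(i);
    return primes;
}
```

For example, oddSieve(30) returns the ten primes up to 30, ending in 29, while touching only 15 flags instead of 31.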

&lt;p&gt;The article has come to an end. The next one will be the last in our exploration of prime numbers and will be centred on prime factorisation, so stay tuned!&lt;/p&gt;

</description>
      <category>computerscience</category>
      <category>cpp</category>
      <category>mathematics</category>
      <category>algorithms</category>
    </item>
    <item>
      <title>Prime Numbers: Fast and Slow - part 1</title>
      <dc:creator>Andrei Visoiu</dc:creator>
      <pubDate>Sat, 22 Feb 2020 08:24:48 +0000</pubDate>
      <link>https://dev.to/kruzzy/prime-numbers-fast-and-slow-part-1-224f</link>
      <guid>https://dev.to/kruzzy/prime-numbers-fast-and-slow-part-1-224f</guid>
      <description>&lt;p&gt;A prime number (or commonly a prime) is a natural number greater than 1 which has exactly two natural divisors (1 and itself). They are widely studied by number theorists and a fairly in competitive programming problems and interview questions. But why? &lt;/p&gt;

&lt;p&gt;Well, over the next couple of articles, we will discuss some interesting uses, properties and means of computing primes that will help us understand why they are so popular.&lt;/p&gt;

&lt;h1&gt;
  
  
  They are used for calculating hash codes
&lt;/h1&gt;

&lt;p&gt;Firstly, a hash table is a data structure that maps keys to values. A hash code is a number, computed by a function called a hash function, assigned to a complex object (the key) in order to quickly locate it in the hash table. Ideally, a hash function should be &lt;a href="https://en.wikipedia.org/wiki/Injective_function"&gt;injective&lt;/a&gt; (no two objects should produce the same value). In practice, however, this is rarely achievable, and this is where primes come into play. For example, a widely used function to compute the hash of a string is the following: &lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--5Dq_oCZw--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i.imgur.com/qu1otEY.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--5Dq_oCZw--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i.imgur.com/qu1otEY.png" alt="fct"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Where &lt;em&gt;m&lt;/em&gt; and &lt;em&gt;p&lt;/em&gt; are some chosen positive integers. This is called a &lt;a href="https://en.wikipedia.org/wiki/Rolling_hash"&gt;polynomial rolling hash function&lt;/a&gt;. &lt;br&gt;
Reasonably, &lt;em&gt;p&lt;/em&gt; should be a prime number roughly the size of the alphabet, and &lt;em&gt;m&lt;/em&gt; should be a large number (in practice, it should also be prime). &lt;/p&gt;
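&lt;p&gt;A minimal sketch of such a function, assuming lowercase input and the conventional choices p = 31 and m = 10^9 + 9 (the function name &lt;em&gt;polyHash&lt;/em&gt; is mine):&lt;/p&gt;

```cpp
#include <cassert>
#include <string>

// Polynomial rolling hash: hash(s) = sum of s[i] * p^i, all taken mod m.
// p = 31 is a prime near the 26-letter alphabet size; m = 1e9 + 9 is a
// large prime, so the powers of p behave well modulo m.
long long polyHash(const std::string& s) {
    const long long p = 31, m = 1'000'000'009;
    long long hash = 0, pPow = 1;
    for (char c : s) {
        hash = (hash + (c - 'a' + 1) * pPow) % m;  // map 'a'..'z' to 1..26
        pPow = (pPow * p) % m;                     // next power of p, mod m
    }
    return hash;
}
```

Mapping 'a' to 1 rather than 0 matters: otherwise "a", "aa" and "aaa" would all hash to the same value.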
&lt;h1&gt;
  
  
  Modern cryptography requires extensive use of prime numbers
&lt;/h1&gt;

&lt;p&gt;Prime numbers are the fundamental tool behind &lt;a href="https://en.wikipedia.org/wiki/RSA_(cryptosystem)"&gt;RSA&lt;/a&gt;, the most common type of encryption used today.&lt;br&gt;
Its security rests on the fact that the prime factorization of large numbers takes a very long time: the public key is built from the product of two large prime numbers, and the secret key from those two primes themselves. Everyone can use the public key to encrypt messages to send you, but only you can decrypt them; anyone else would first have to find the prime factorization of the public modulus, which currently takes an unreasonable amount of time.&lt;/p&gt;
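&lt;p&gt;To make the idea concrete, here is a toy walk-through with the classic textbook numbers p = 61 and q = 53; real RSA keys use primes hundreds of digits long, so this illustrates the principle only, not a usable implementation:&lt;/p&gt;

```cpp
#include <cassert>
#include <cstdint>

// Square-and-multiply modular exponentiation; fine for small moduli like
// the toy numbers below (products stay well within 64 bits).
uint64_t modPow(uint64_t base, uint64_t exp, uint64_t mod) {
    uint64_t result = 1;
    base %= mod;
    while (exp > 0) {
        if (exp & 1) result = result * base % mod;
        base = base * base % mod;
        exp >>= 1;
    }
    return result;
}

// Toy key pair from p = 61, q = 53:
//   public:  n = 61 * 53 = 3233, e = 17
//   private: d = 2753 (the inverse of e modulo (61-1)*(53-1) = 3120)
// Encrypting a message m: c = m^e mod n. Decrypting: m = c^d mod n.
// Recovering d from n alone requires factoring n back into 61 and 53.
```

Encrypting m = 65 gives c = modPow(65, 17, 3233), and modPow(c, 2753, 3233) recovers 65.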
&lt;h1&gt;
  
  
  Algorithms related to prime numbers
&lt;/h1&gt;

&lt;p&gt;There are various algorithms related to prime numbers. We will study some approaches and analyse their performance in order to understand them. &lt;/p&gt;
&lt;h2&gt;
  
  
  Checking whether a number is prime or not
&lt;/h2&gt;

&lt;p&gt;The first algorithms we will look into are those that test for primality. Even though prime factorization is thought to be a computationally difficult problem, primality testing is not, as we will see below.&lt;/p&gt;
&lt;h3&gt;
  
  
  Naive method
&lt;/h3&gt;

&lt;p&gt;The most naive solution is to use the definition in order to get a correct algorithm:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;&lt;span class="kt"&gt;bool&lt;/span&gt; &lt;span class="nf"&gt;primeCheck&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 
        &lt;span class="k"&gt;if&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;/// if i is a divisor of n &lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;/// we haven't found any divisor&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Without much analysis, we can clearly see that the complexity of the algorithm is O(n), because it checks every number &lt;em&gt;i&lt;/em&gt; between 2 and &lt;em&gt;n-1&lt;/em&gt; to see if it divides &lt;em&gt;n&lt;/em&gt;. &lt;/p&gt;

&lt;p&gt;A rapid, small optimisation arises if we pay attention to a simple observation: apart from itself, a number &lt;em&gt;n&lt;/em&gt; has no divisors greater than &lt;em&gt;n/2&lt;/em&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;&lt;span class="kt"&gt;bool&lt;/span&gt; &lt;span class="nf"&gt;primeCheck&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 
        &lt;span class="k"&gt;if&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;/// if i is a divisor of n &lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;/// we haven't found any divisor&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The complexity is still O(n), but the runtime should be a little lower than that of the first version.&lt;/p&gt;

&lt;p&gt;O(n) is the worst time complexity we will see here, and efficient primality tests exist. That is why, as I said above, primality testing is not considered computationally difficult.&lt;/p&gt;

&lt;h3&gt;
  
  
  Further improvements over the naive method.
&lt;/h3&gt;

&lt;p&gt;A property of the divisibility relation is that if &lt;em&gt;i&lt;/em&gt; is a divisor of &lt;em&gt;n&lt;/em&gt;, then so is &lt;em&gt;n/i&lt;/em&gt;. Taking this into account, we can make another optimisation: we only need to check up to the square root of &lt;em&gt;n&lt;/em&gt;, because any divisor greater than that would be paired with a divisor smaller than the square root, which we would already have found.&lt;br&gt;
Another improvement we can make is by using another observation: if &lt;em&gt;n&lt;/em&gt; is not divisible by 2, then it won't be divisible by any even number. So, we can check if 2 is a divisor of &lt;em&gt;n&lt;/em&gt; separately and then only look at odd numbers. With the improvements, the algorithm should look like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;&lt;span class="kt"&gt;bool&lt;/span&gt; &lt;span class="nf"&gt;improvedPrimeCheck&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The time complexity of the algorithm above has now become O(sqrt(n)). &lt;/p&gt;

&lt;p&gt;That's it for this article. During the next article, we will be taking a look into generating prime numbers and prime factorisation, so stay tuned! &lt;/p&gt;

</description>
      <category>computerscience</category>
      <category>mathematics</category>
      <category>algorithms</category>
      <category>cpp</category>
    </item>
  </channel>
</rss>
