# Topological Sort with Kahn's Algorithm

On a recent project, we were working on a spreadsheet-like tool where users could cross-reference different rows in formula, just like in Excel. We ran into the question of how to order the evaluation of the rows such that the dependencies do not cause any conflict.

For example, imagine we have three rows:

Row id Formula Rendered values
`row1` `manual input` 1, 2, 3
`row2` `row3` * 2 6, 8, 6
`row3` `row1` + 1 2, 3, 4

`row1` can be rendered directly, as its formula does not depend on any other rows. `row2` however, references `row3`. `row3` references `row1`. Thus in order to render `row2`, we first need to render `row3`.

It turns out, this is quite a common problem, for example in resolving the import order of different modules in a project.

We can solve these types of problems using topological sorting, which takes a DAG (directed acyclical graph) and orders its nodes in a way that for every edge `x -> y`, `x` comes before `y` in the ordering.

Kahn's algorithm is one algorithm that can establish this topological order.

## Kahn's Algorithm: Theory

The idea behind Kahn's algorithm is simple:

1. Put all nodes that have no dependencies in your order
2. Remove all occurrences of those nodes from the dependencies of all remaining nodes. If there are no cycles, this will give you new nodes without dependencies.
3. Repeat until there are no more nodes left.

For example:

Node Depends on
`a` `b`, `c`
`b` `d`
`c` `d`
`d` -

Here's the dependency graph (where `x` -> `y` means that `x` depends on `y`.

Thus:

1. `d` has no dependencies. Put it in the order.
Order = [`d`]

2. Remove occurrences of `d` from the other dependencies. `d` occurs in the dependencies of `b`, `c`. Thus, after removing `d` from those dependencies, neither `b` nor `c` have any dependencies. 1. `b` and `c` have no dependencies. Put then in the order.
Order = [`d`, `b`, `c`]

2. Remove occurrences of `b` and `c` from the other dependencies. Both `b` and `c` occur in the dependencies of `a`. Thus, after removing `b` and `c` from those dependencies, `a` has no more dependencies.

1. `a` has no dependencies. Put it in the order. Order = [`d`, `b`, `c`, `a`] 1. No more nodes left. Done. Final order = [`d`, `b`, `c`, `a`]

## Kahn's Algorithm: Implementation

For our implementation we assume as input a dictionary where the keys are the node names and the values are a list of all nodes that the depend on the key.

For example, if nodes `B` and `C` depend on `A`, then on entry in the dictionary would be `A: [B, C]`.

Translating the theory from the previous section into Python code, a possible implementation looks like this:

``````def topological_sort(nodes: dict[str, list[str]]) -> list[str]:
"""
Topological sort for a network of nodes

nodes = {"A": ["B", "C"], "B": [], "C": ["B"]}
topological_sort(nodes)
# ["A", "C", "B"]

:param nodes: Nodes of the network
:return: nodes in topological order
"""

# Calculate the indegree for each node
indegrees = {k: 0 for k in nodes.keys()}
for name, dependencies in nodes.items():
for dependency in dependencies:
indegrees[dependency] += 1

# Place all elements with indegree 0 in queue
queue = [k for k in nodes.keys() if indegrees[k] == 0]

final_order = []

# Continue until all nodes have been dealt with
while len(queue) > 0:

# node of current iteration is the first one from the queue
curr = queue.pop(0)
final_order.append(curr)

# remove the current node from other dependencies
for dependency in nodes[curr]:
indegrees[dependency] -= 1

if indegrees[dependency] == 0:
queue.append(dependency)

# check for circular dependencies
if len(final_order) != len(nodes):
raise Exception("Circular dependency found.")

return final_order

``````

Let's try this with a slightly more evolved network: The corresponding input looks like this:

``````nodes = {
"A": ["B", "C"],
"B": ["C", "D"],
"C": ["F"],
"D": ["F"],
"E": ["F"],
"F": []
}
``````

If we run this we get the following order:

``````order = topological_sort(nodes)
print(order)

# ['A', 'E', 'B', 'C', 'D', 'F']
``````

Take another look at the graph and convince yourself that that's indeed a correct order that avoids dependency clashes!