DEV Community

Khushi
Khushi

Posted on

Topological Sort

What is Topological Sorting?

  • Topological Sorting is a method to organize the vertices of a Directed Acyclic Graph (DAG) in a sequence where, for every directed edge u → v, vertex u precedes vertex v in the sequence.

  • Topological sorting is the act of sequencing “items with prerequisites”
    in an order such that each thing comes only after everything it depends on.

Why only DAGs?
First let's understand ,

1) Why Directed?

  • Dependencies always exhibit a specific direction: If A → B, it means “A must come before B.” in an undirected graph the relationship can be unclear.
  • If A and B are connected, does A depend on B or B depend on A? There’s no way to tell. Therefore, directed edges are necessary to clearly indicate which task is dependent on which.

2)Why Acyclic?

  • A cycle refers to a situation where tasks are mutually dependent on one another in a circular manner.

For instance:
A → B (A precedes B)
B → C (B precedes C)
C → A (C precedes A)
At this point, we're in a stalemate:
A needs to occur before B,
B before C,
C before A.
This presents a paradox → no permissible sequence can be determined.
This is why cycles render topological sorting unachievable.

Real-life example:-

1) Software Development (Compilation Sequence)
During the compilation of extensive software projects, it is crucial for the compiler to adhere to file dependencies.

For instance:

  • Utility.h needs to be compiled prior to Main.cpp (since Main.cpp includes it).

  • Math.cpp relies on Math.h (which means Math.h must be compiled first).
    Neglecting these dependencies will result in compilation errors.

👉 The compiler uses topological sorting internally to determine the appropriate order for compiling files.

2) Project Management (Task Scheduling)
In project tasks, there are often dependencies, similar to nodes in a graph.

Example Project Flow:
Task A: Collect requirements.
Task B: Design system architecture (depends on A).
Task C: Develop code (depends on B).
Task D: Conduct software testing (depends on C).
Task E: Create documentation (depends on B, but is independent of C/D).
In this case, a suitable sequence could be:
A → B → C → D and B → E (allowing E to proceed in parallel after B).

Approaches for Topological Sorting

There are two primary methods:
1.The In-degree Method (BFS) or Kahn's Algorithm
2.The Depth-First Search (DFS) method

Let’s understand Kahn’s Algorithm in detail in this blog.
Core Concept:
Kahn’s Algorithm is a BFS-based technique that builds a topological order by repeatedly removing nodes with no incoming edges (in-degree = 0).

Intuition:

  • Nodes with in-degree = 0 represent tasks with no pending prerequisites → they can be executed immediately.
  • Once such a task is completed, we update the in-degree of its dependent tasks.
  • If any dependent task’s in-degree becomes 0, it is ready to execute next. This process continues until all nodes are processed.

Steps in Kahn’s Algorithm

Step 1: Compute In-Degree of Each Node

In-degree = number of incoming edges to a node.
For every edge A → B, increment in-degree of B by 1.
Store in-degree values in an array or hash map.

Step 2: Initialize a Queue with All In-degree 0 Nodes

These nodes have no dependencies → they can be done right away.
Use a queue (FIFO) to process them.
If multiple nodes have in-degree 0, any order is valid

Step 3: Process Queue Iteratively

While the queue is not empty:
1.Remove a node u from the queue.
2.Append u to the topological order result list.
3.For each outgoing edge (u → v):

  • Decrease the in-degree of v by 1 (since u is processed).
  • If v’s in-degree becomes 0, add it to the queue.

Step 4: Check for Cycles

Following processing, if there are less nodes in the result list than there are nodes in the graph overall, then:
In-degree = 0 was never reached by some nodes.
This indicates that a cycle (circular dependency) exists.Topological sorting is therefore impossible.

Let’s walk through a dry run to better understand how this algorithm works in practice

Let’s say we have 6 tasks (nodes):
0, 1, 2, 3, 4, 5
5 → 2
5 → 0
4 → 0
4 → 1
2 → 3
3 → 1

Step 1: Compute In-degree of Each Node

Count incoming edges for each node:
Node 0: in-degree = 2 (from 5, 4)
Node 1: in-degree = 2 (from 4, 3)
Node 2: in-degree = 1 (from 5)
Node 3: in-degree = 1 (from 2)
Node 4: in-degree = 0
Node 5: in-degree = 0
📌 In-degree array: [2, 2, 1, 1, 0, 0]

Step 2: Initialize Queue with In-degree = 0 nodes

  • Nodes 4 and 5 have in-degree = 0 → enqueue them.

Step 3: Process Queue Iteratively

  • Iteration 1
    • Dequeue 4 → add to result.
    • For edges from 4:
      • 4 → 0: reduce in-degree[0] → 1
      • 4 → 1: reduce in-degree[1] → 1

  • Iteration 2
    • Dequeue 5 → add to result.
    • For edges from 5:
    • 5 → 2: reduce in-degree[2] → 0 → enqueue 2
    • 5 → 0: reduce in-degree[0] → 0 → enqueue 0

  • Iteration 3
    • Dequeue 2 → add to result.
    • For edges from 2:
      • 2 → 3: reduce in-degree[3] → 0 → enqueue 3

  • Iteration 4
    • Dequeue 0 → add to result.
    • Node 0 has no outgoing edges.

  • Iteration 5
    • Dequeue 3 → add to result.
    • For edges from 3:
      • 3 → 1: reduce in-degree[1] → 0 → enqueue 1

  • Iteration 6
    • Dequeue 1 → add to result.
    • Node 1 has no outgoing edges.

Step 4:Check for Cycles
All 6 nodes are processed → no cycle.
✅ Valid topological order found.
[4, 5, 2, 0, 3, 1]
Also note that different valid orders are possible depending on queue processing order.

Code snippet

#include <bits/stdc++.h>
using namespace std;

vector<int> topologicalSortKahn(int V, vector<pair<int,int>> &edges) {
    // Step 1: Build graph and compute in-degree
    vector<vector<int>> graph(V);
    vector<int> in_degree(V, 0);

    for (auto &edge : edges) {
        int u = edge.first, v = edge.second;
        graph[u].push_back(v);
        in_degree[v]++;
    }

    // Step 2: Initialize queue with in-degree = 0 nodes
    queue<int> q;
    for (int i = 0; i < V; i++) {
        if (in_degree[i] == 0)
            q.push(i);
    }

    vector<int> topo_order;

    // Step 3: Process queue
    while (!q.empty()) {
        int node = q.front();
        q.pop();
        topo_order.push_back(node);

        // Reduce in-degree of neighbors
        for (int neighbor : graph[node]) {
            in_degree[neighbor]--;
            if (in_degree[neighbor] == 0)
                q.push(neighbor);
        }
    }

    // Step 4: Check for cycle
    if ((int)topo_order.size() != V) {
        cout << "Cycle detected! Topological sort not possible." << endl;
        return {};
    }

    return topo_order;
}

int main() {
    int V = 6;
    vector<pair<int,int>> edges = {
        {5, 2},
        {5, 0},
        {4, 0},
        {4, 1},
        {2, 3},
        {3, 1}
    };

    vector<int> order = topologicalSortKahn(V, edges);

    cout << "Topological Order: ";
    for (int node : order)
        cout << node << " ";
    cout << endl;

    return 0;
}

Enter fullscreen mode Exit fullscreen mode

Let's analyze the time complexity and space complexity
Each node is enqueued/dequeued exactly once.
Each edge is processed exactly once (when decreasing in-degree).
so time complexity is Time Complexity: O(V + E)
We use in-degree array and queue.
so Space Complexity :O(V)

Kahn’s Algorithm (BFS-based) is iterative, so it doesn’t face recursion depth issues that can occur in the DFS approach.

Top comments (0)