DEV Community

Antoine Coulon
Antoine Coulon

Posted on • Edited on

🔀 Master Directed Graphs by example with JavaScript (introduction)

Introduction to a 3 part series

In mathematics, and more specifically in graph theory, a directed graph is a graph that is made up of a set of vertices (often called nodes) connected by directed edges (often called arcs).

The directed nature of the graph is useful in many cases because it allows us to precisely describe relationships between any vertices of the graph.

You already manipulate Directed Graphs without knowing it

Did you know that you were creating directed graphs whenever an import mechanism is used behind the scenes?


dag

Take for instance the image above with four vertices, each representing a JavaScript file.

Now the question is: what are the relationships between these files? In all programming languages, one file might import one or multiple files. Whenever a file imports another one, an implicit relationship is created.

src/hello.js

  export function sayHello() { }
Enter fullscreen mode Exit fullscreen mode

src/main.js

  import { sayHello } from "hello.js";
Enter fullscreen mode Exit fullscreen mode

As you can see above, main.js imports hello.js to use the sayHello function. The static import creates an implicit relationship between both files.

In the field of graphs, this relationship can be modeled as a directed edge from main.js to hello.js (can be written as main.js ---> hello.js). We say that main.js is adjacent to hello.js but generally speaking, we can allow ourselves to say that main.js depends on hello.js.

Given two vertices A and B, if a directed Edge X starting from A to B we can then say that A is adjacent to B.

We can now update the graph with our edges represented:


dag

Basically this graph says that:

  • FileA directly depends on FileD and FileB
  • FileA indirectly depends on FileC (through both FileD and FileB)
  • FileD directly depends on FileC
  • FileB directly depends on FileC
  • FileC depends on nothing

This structure may seem simple but can in fact be used to model very complex schemas. Let's take a look at three examples of interesting use cases for directed graphs.

1. Static dependencies analysis

=> Cycle dependencies detection by ESLint for JavaScript ESLint no-cycle plugin

dag

Circular dependencies can make your program crash or introduce inconsistencies in many various ways in this is not something you should underestimate. Luckily in Node.js, most famous module systems can resolve dependencies cycles and avoid your program to crash (some module systems do better than others, though).

Nevertheless, circular dependencies are often an indicator that there is a more or less deep misconceptions in your project, so I always advise to resolve circular dependencies.

Further exploring of the circular dependencies detection:

See an implementation of a circular dependency detection using the digraph-js library I wrote

2. Incremental/Affected tasks

=> Bundlers/Monorepos tools make extensive use of it (e.g: NX's affected build/test/lint...)

A directed graph can also be used in order to establish affected patterns. The affected pattern consists in analyzing source code and figuring out what can be affected by every code change.

dag

In the image above, we can see a project using an affected strategy to build only what really needed to be rebuilt. When building the Main App after Library 2 changed, only Library 1 and Library 2 must be rebuilt. The other part of the graph (Library 3, Library 4, Library 5) remains unaffected hence the cached version of these libraries can be used in the final build (no need to rebuild them).

In a monorepo or in custom projects setup, this affected pattern can drastically reduce the time taken by trivial tasks such as build/test/lint.

Further exploring of the Affected pattern:

See an implementation of an affected pattern using the digraph-js library I wrote

3. Task orchestration

In the context of task orchestration/scheduling, directed graphs can also be very precious.

Take for instance a tool that everybody knows: Microsoft Excel. Have you ever wondered how changing the formula from one cell could directly affect other cells depending on this formula? I hope you are not disappointed to learn that it is not an infinite loop running under the hood.

In this context, our Directed Graph has a vertex for each cell to be updated and an edge in between whenever one of them needs to be updated earlier than the other.

Due to the fact that we need to consistently schedule the tasks involved, we can't have cycles in our Graph. Dependency Graphs without circular dependencies form Directed Acyclic Graphs (DAGs).

This acyclic constraint allows us to be consistent with the order of the various operations involved by updates.

In another context of Task Orchestration, we can briefly talk about Task Scheduling, which includes sequential vs parallel patterns.

Thanks to DAGs, we can easily determine what tasks can be run in parallel (no common dependencies in the Graph) and what tasks must be run sequentially (one after the other because one depends on the other).

Similar problems of Task Ordering arise in makefiles for program compilation, in YAML files for CI/CD and instruction scheduling for low-level computer program optimization.

Stay tuned

I'm planning to showcase some use cases introducing the use of graphs using the digraph-js library.

Some examples are already available on my GitHub

Thanks for reading :)

Top comments (0)