# A Gentle Introduction To Graph Theory

### Vaidehi Joshi Mar 21 '17

So many things in the world would have never come into existence if there hadnโt been a problem that needed solving. This truth applies to everything, but *boy, is it obvious* in the world of computer science.

Someone needed a way of keeping track of the order of things, so they played around with and created different data structures until they found the one that worked the best for the specific problem that they were trying to solve. Someone else needed a good way of storing data, so they played around with different number systems until they found one that worked best for the kind of information that they wanted to contain. People needed a good way of labeling and processing tasks, so they found a way to build upon the tools they had and created a way to juggle all the things that one single system needed to do, at any given time.

Of course, computer science isnโt the *only* field to innovate and build upon what came before it, but I do think that itโs unique in one way: *computer scienceโs innovations rely and build upon its own abstractions.*

Iโve talked about abstractions a whole lot in this series, because ultimately, thatโs what this series is *about*: finding the joy in the abstractions that lie beneath the things that all of us use, every single day. And, for what itโs worth, when I say โusโ, Iโm only partially talking about *us* as programmers, the producers of technologyโโโI also mean *us* as users, the consumers of technology.

So, which amazing abstraction shall we learn about next? Well, now that weโre experts in tree data structures, it only seems right to understand where trees came from. Trees are actually a subset of something you might have already heard about: **graphs**. But in order to truly know why we use graphs and what they are, weโll need to go deep down to the very roots of something that stems from discrete mathematics: **graph theory**.

If this is your very first foray into discrete math, fear notโโโitโs mine, too! Letโs tackle it togetherโโโand try not to lose our sanity in the process.

### Looseyโgoosey graphs

When we first started looking at non-linear structures, we learned about their most fundamental characteristic: that their data doesnโt follow an orderโโโat least, not an obvious numerical one, like we see in arrays or linked lists. Trees, as we learned, start with a root node, and might connect to other nodes, which means that could contain subtrees within them. Trees are defined by a certain set of rules: one root node may or may not connect to others, but ultimately, it all stems from one specific place. Some trees have even *more* specific rules, like binary search trees, which can only ever have *two* links to *two* nodes at any given time.

But what if we did something kind of crazy and justโฆthrew these rules out the window? Well, as it turns out, we totally *can* do that! Itโs just that we wouldnโt be dealing with trees anymoreโโโweโd be dealing with something called a **graph**.

Trees are nothing more than restricted types of graphs, just with many more rules to follow . A tree will always be a graph, but not all graphs will be trees.

So, what is it that makes a tree different from the large umbrella of graphs?

Well, for one thing, a tree can only flow in one direction, from the root node to either leaf nodes or child nodes. A tree can also only have one-way connectionsโโโa child node can only have one parent, and a tree canโt have any loops, or cyclical links.

With graphs, all of these restrictions go straight out the window. Graphs donโt have any concept of a โrootโ node. And why would they? Nodes can be connected in any way possible, really. One node might be connected to five others! Graphs also donโt have any notion of โone-directionalโ flowโโโinstead, they might have direction, or they might have no direction whatsoever. Or, to complicate matters further, they could have some links that have direction and others that donโt! But we wonโt get into that today.

Letโs stick with the simple stuff to start.

### Graphs with direction, and graphs without

Okay, so we know that graphs pretty much break all the rules that we know. However, there is one characteristic that every graph *must* have: every graph always needs to have, at the very least, one single node. Just as how trees need at least one root node in order to be considered a โtreeโ, similarly, a graph needs at least a single node in order to be considered a โgraphโ. A graph with just one node is usually referred to as a **singleton graph**, although we wonโt really be dealing with those.

Most of the graphs weโll be dealing with are a bit more complex. But, donโt be worriedโโโwe wonโt be diving into the *super complicated* graphs today. And trust me, some graphs really are complicated!

Instead, letโs look at the two types of graphs that are pretty easy to spot, and also pretty common in graph theory problems: directed graphs, and undirected graphs.

As we know, there are no real rules in the way that one node is connected to another node in a graph. Edges (sometimes referred to as *links*) can connect nodes in *any* way possible.

The different types of edges are pretty important when it comes to recognizing and defining graphs. In fact, thatโs one of the biggest and most obvious differentiators between one graph and another: the types of edges that it has. For the most part (aside from one exception, which we wonโt cover today), graphs can have two types of edges: a edge that has a direction or flow, and an edge that has no direction or flow. We refer to these as *directed* and *undirected* edges, respectfully.

In a **directed edge**, two nodes are connected in a very specific way. In the example below, node A connects to node B; there is only *one* way to travel between these two nodesโโโonly one *direction* that we can go. Itโs pretty common to refer to the node that weโre starting from as the *origin*, and the node that weโre traveling toas the *destination*. In a directed edge, we can **only travel** **from the origin to the destination**, and never the other way around.

However, itโs an entirely different story with undirected edges. In an **undirected edge**, the path that we can travel goes both ways. That is to say, the path between the two nodes is *bidirectional*, meaning that **the origin and destination nodes are not fixed**.

This differentiation is actually pretty important, because the edges in a graph determine what the graph is called. If all of the edges in a graph are *directed*, the graph is said to be a **directed graph**, also called **digraph***.* If all of the edges in a graph are *undirected*, the graph is said to beโโโyou guessed itโโโan **undirected graph**! Go figure, right?

This is all very cool, but at this point, I want to know two thingsโโโwhere did all of this graph stuff *come from*, exactly? Andโฆwhy should we care?

Letโs investigate.

### Tread lightly: weโre in graph country now

Computer science loves to borrow stuff. More specifically, it has borrowed a lot of concepts from logic and mathematics. As it turns out, this is the case with graphs.

Graph data structures as we know them to be computer science actually come from math, and the study of graphs, which is referred to as **graph theory**.

In mathematics, graphs are a way to formally represent a network, which is basically just a collection of objects that are all interconnected.

As it turns out, when computer scientists applied graph theory to code (and ultimately implemented graphs as data structures), they didnโt change a whole lot. So, a lot of the terms that we use to describe and implement graphs are the exact terms that weโll find in mathematical references to graph theory.

For example, in mathematical terms, we describe graphs as **ordered pairs**. Remember high school algebra, when we learned about *(x,y)* ordered pair coordinates? Similar deal here, with one difference: instead of *x* and *y*, the parts of a graph instead are: ** v**, for

*vertices*, and

**, for its**

*e**edges*.

The formal, mathematical definition for a graph is just this: **G = (V, E)**. Thatโs it! Really. I promise.

But hang on a secondโโโwhat if our graph has more than one node and more than one edge! In factโฆit will pretty much *always* have multiple edges if it has more than one node. How on earth does this definition work?

Well, it works because that ordered pairโโโ*(V, E)*โโโis actually made up of **two** **objects**: a *set* of vertices, and a *set* of edges.

Okay, that makes more sense to me now. But it would be a whole lot clearer if I had an example and actually wrote out the definition of a graph! So weโll do just that. In the example below, we have an undirected graph, with 8 vertices, and 11 edges.

So whatโs going on here?

Well, we wrote out our ordered pair *(V, E)*, but because each of *those* items is an object, we had to write those out as well. We defined *V* as an *unordered* set of references to our 8 vertices. The โunorderedโ part is really important here, because remember, unlike trees, there is **no hierarchy of nodes**! Which means that we donโt need to order them, since order doesnโt matter here.

We also had to define *E* as an object, which contains a bunch of edge objects within it. Notice yet again that our edge objects are also *unordered*. Why might that be? Well, what type of graph is this? Is there any direction or flow? Is there a fixed sense of โoriginโ and โdestinationโ?

Nope, thereโs not! This is an *undirected* graph, which means that the edges are bidirectional and the origin node and destination node are *not* fixed. So, each of our edge objects are also **unordered pairs**.

This particularity, of course, leads us to wonder: what if this were a **directed** graph? Time for another example! Hereโs a directed graph, with three vertices and three edges:

The way we define the vertices here doesnโt look any different, but letโs look more closely at our edge definition. Our edge objects in this case are *ordered* pairs, because direction actually matters in this case! Since we can only travel from the origin node to the destination node, our edges *must* be ordered, such that the origin node is the first of the two nodes in each of our edge definitions.

Cool, so thatโs how we define graphs. Butโฆwhen would we ever actually *use* graphs? Well, you probably used one today. You might just not know it yet! Time to change that.

### Super social graphs

Graphs are all around us, we just donโt always see them for what they are.

In fact, by the very act of reading this post, you are *literally on a graph right now.* The web is a massive graph structure! When we click between websites and navigate back and forth between URLs, weโre really just navigating through a graph. Sometimes those graphs have nodes with edges that are undirectedโโโI can go back and forth from one webpage to anotherโโโand others that are directedโโโI can only go from webpage A to webpage B, and never the other way around.

But thereโs an even better example that beautifully illustrates our daily interactions with graphs: social networks.

Facebook, a massive social network, is a type of graph. And if we think more about it actually functions, we start to better understand *how* we can define, and exactly what *type* of graph it is. On Facebook, if I add you as a friend, you must accept my request. Itโs not possible for me to be your friend on the network without you also being mine. The relationship between two users (read: nodes or vertices in graph terms!) is *bidirectional*. Thereโs no concept of an โoriginโ and a โdestinationโ nodeโโโinstead, youโre my friend and I am yours.

Can you guess what type of graph Facebook is implemented as?

If you guessed *undirected graph*, then youโre right! Well done. Relationships are two-way, so if we were to define Facebookโs friend network as a graph, its edges would all end up being unordered pairs when we wrote them out.

Twitter, on the other hand, works very differently from Facebook. I can follow you, but you might not follow me back. Case in point: I follow Beyonce, but she definitely does not follow me back (sadly).

We could represent Twitter as a *directed* graph. Each edge we create represents a *one-way* relationship. When you follow me on Twitter, you create an edge in the graph with your account as the origin node, and my account as the destination node.

So what happens when I follow you back? Do I change the edge you created when you followed me? Does it suddenly become bidirectional? Well, no, because I could unfollow you at any given point. When I follow you *back* on Twitter, I create a second edge, with *my* account as the origin node and yours as the destination.

The same model applies to dev.to, as well, which lets you follow and unfollow authors! In fact, this network model is *all over the place*. And all it is, once we abstract all the layers away, is a graph. And truly, what a powerful thing it is.

### Resources

Lots and lots of entire *books* have been written about graphs. I certainly didnโt cover enough information here to fill a book, but that doesnโt mean you canโt keep learning about graphs! Fill your mind with more graph theory awesomeness, starting with the great links below.

- Difference between trees and graphs, Poonam Dhanvani
- Whatโs the difference between the data structure tree and graph?, StackOverflow
- Applications of Graph Theory In Computer Science: An Overview, S.G.Shirinivas et. al.
- Graph Traversal, Professor Jonathan Cohen
- Data Structures: Introduction To Graphs, mycodeschool

๐๐

Graphs, Grammars, Lexing etc. are fundamental to computer science, nobody should go without at least knowing the basics about these, but still there are only few articles about theoretical computer science on dev related websites, but yours are just so nice ! Great job

This is wonderful! Thank you so much for putting this together - I wish I'd known about this resource when I first started trying to learn about graphs. Great work breaking it down and making it more accessible!!

Hey there, we see you aren't signed in. (Yes you, the reader. This is a fake comment.)

Please consider creating an account on dev.to. It literally takes a few seconds and

. โค๏ธwe'd appreciate the support so muchPlus, no fake comments when you're signed in. ๐

nice post, impressive. Thanks a lot.

It's nice to have mentioned the mathematics origin to explain graph. Many students get no explanation about why they have to "eat" a lot of mathematics when learning computer science. Unless the teachers are good, they won't describe the "from math to computer science" story making it less likely to trigger interest for the mathematics courses. It's common for students to drop university when they see they do math instead of computer science.

Nicely written.

I am new to the topic, introduced to me as a solution to the classic puzzle Instant Insanity. I have some problems with the terminology used, particularly the terms "vertices" and "edges" (I like much better your terms, nodes and links). A vertex by definition is a point where two or more lines meet to form an angle. However, some of the "vertices" in the graphs you've described (for example, Liz in the Facebook graph) are isolated, having only one line linked to them. How is this a vertex?

Also, I am confused about the term "edges". What makes it an edge (the outer or farthest point of something)?

I am sure there must be some reason why these terms were chosen. They have cropped up in other articles I've read too.

Would appreciate whatever insight you might be able to offer.

I loved this so much! Keep writing!

I'm glad you liked it!!

Excellent and very helpful but one thing I don't get, for instance in Python how would one define the order graph as distinct from the un-ordered notation, you seem to indicate the same notation? Surely there is some difference in code representation between order and unorder?

Maybe using lists and tuples? And then it's just a matter of interpretation, whether you treat your structures as those representing a directed or undirected graph.

Thanks for writing this, you've done a great job (re-)educating me about the fundamentals.

I'm interested in using (or perhaps even implementing) an in memory graph for a music selection program. I use C#, recommendations are welcome.

Oh and I have considered Neo4j, it's great but I'd like something more light weight.

Thank you. ๐น๐น๐น๐ผ๐ผ๐ท

Thanks for this! It was beautifully explained. Please write more. You have a knack for breaking down complexities and making them easy-to-digest. Thank you.

Great article! Thank you!

Magnifico!!

Very helpful and well explained!

Thanks for this article. I learned a lot :D

Really interesting post.

Really nice course. We want more !!

Comparing Facebook to an undirected graph and Twitter to a directional one sounds confusing. Nice article though!

Social networks are a pretty standard example of graphs (if you take a look at some CS curricula, you'll notice they come up again and again as widely-used examples).

Facebook or LinkedIn are both examples an undirected graph as "friendships" work two-ways. But on Twitter or Google+, the concept of "followers" rather than "friendships" make these graphs directed, and

notbidirectional.Hope this clears that up.