Graph databases
Let’s talk about an exciting development in the world of data management: graph databases. Before we delve into what graph databases are and why we should even think about them, we first need to understand what a graph is.
Graphs
A graph in computer science is very different from the x-y graphs you may have seen in maths. In computer science a graph does not plot how one variable changes with another; instead it focuses on the fact that there is some sort of relationship between objects.
A graph has some key components:
- A node (also called a vertex) is an object and is usually drawn as a circle
- An edge is a relationship between two objects and is drawn as a line between them
- An edge can have an associated value, called its weight, which could represent many things, such as the distances discussed below. A graph whose edges have weights is called a weighted graph.
- Edges can also have a direction, indicating a one-way relationship. A graph whose edges have directions is called a directed graph.
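To make these components concrete, here is a minimal sketch of a directed, weighted graph in plain Python. It is only an illustration: the node names, the weights and the `add_edge` helper are made up for this post and are not part of any graph library.

```python
# A directed, weighted graph stored as an adjacency mapping:
# node -> {neighbour: weight of the edge from node to neighbour}
# Node names and weights are invented for illustration.
graph = {
    "A": {"B": 4, "C": 1},
    "B": {"D": 2},
    "C": {"B": 2},
    "D": {},
}


def add_edge(graph, source, target, weight=1, directed=True):
    """Add an edge; an undirected edge is stored in both directions."""
    graph.setdefault(source, {})[target] = weight
    graph.setdefault(target, {})  # make sure the target exists as a node
    if not directed:
        graph[target][source] = weight


add_edge(graph, "D", "A", weight=7)

# List every node, and every edge as (source, target, weight)
nodes = sorted(graph)
edges = [(s, t, w) for s, neighbours in graph.items() for t, w in neighbours.items()]
```

Storing each node's outgoing edges next to the node makes "who is this node connected to?" a single dictionary lookup, which is the question graph algorithms ask most often.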
Right now this may look useless to you, but graph theory is actually a very important part of theoretical computer science and maths because of its real-life applications. For example, we can model cities as nodes and the roads between them as edges, with each edge’s weight being the length of that road. We can then use Dijkstra’s shortest path algorithm to find the shortest route between two cities, which is the same kind of problem that route planners such as Google Maps solve.
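Here is a small sketch of Dijkstra’s algorithm using Python’s standard `heapq` module. The city names and distances in `roads` are invented for the example, and this is a textbook version of the algorithm rather than what any real mapping service runs.

```python
import heapq


def dijkstra(graph, start):
    """Return the shortest known distance from start to every reachable node."""
    distances = {start: 0}
    queue = [(0, start)]  # priority queue of (distance so far, node)
    while queue:
        dist, node = heapq.heappop(queue)
        if dist > distances.get(node, float("inf")):
            continue  # stale entry: a shorter path to node was already found
        for neighbour, weight in graph.get(node, {}).items():
            new_dist = dist + weight
            if new_dist < distances.get(neighbour, float("inf")):
                distances[neighbour] = new_dist
                heapq.heappush(queue, (new_dist, neighbour))
    return distances


# Hypothetical cities; weights are road lengths. Roads are two-way,
# so each edge appears in both directions.
roads = {
    "Avon": {"Brook": 7, "Cliff": 3},
    "Brook": {"Avon": 7, "Cliff": 1, "Dale": 2},
    "Cliff": {"Avon": 3, "Brook": 1, "Dale": 8},
    "Dale": {"Brook": 2, "Cliff": 8},
}

shortest = dijkstra(roads, "Avon")
print(shortest["Dale"])  # 6, via Avon -> Cliff -> Brook -> Dale
```

Note that the direct-looking route Avon -> Cliff -> Dale costs 11, so the algorithm prefers the path with more hops but less total distance. Dijkstra’s algorithm assumes no edge has a negative weight, which is naturally true for road lengths.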
Graph Databases
Great, now that we know what a graph is, we can discuss graph databases. A graph database stores data the way we see it in real life: all the data is stored as a network of nodes (entities) and the relationships between them (credit to Neo4j):
In the diagram above we have two nodes in our graph database (Dan and Ann). Dan loves Ann, so there is a relationship between these two nodes and we connect them. An important thing to note is that the relationship has a direction, from Dan to Ann, so this is a directed graph: it records Dan’s love for Ann and says nothing about Ann’s feelings for Dan. Another important feature is that each node, with the label “Person”, has a property called “name”. A node can have multiple properties, so a person could also have a height or a birthday.
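The Dan and Ann example can be modelled in a few lines of Python. This is only a sketch of the idea of labelled nodes with properties and directed relationships; the `Node` and `Relationship` classes and the `outgoing` helper are hypothetical names made up here, not the API of Neo4j or any other graph database, and the height value is invented.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str                                       # e.g. "Person"
    properties: dict = field(default_factory=dict)   # e.g. {"name": "Dan"}


@dataclass
class Relationship:
    start: Node   # the relationship points from start...
    type: str     # e.g. "LOVES"
    end: Node     # ...to end


dan = Node("Person", {"name": "Dan"})
ann = Node("Person", {"name": "Ann"})
relationships = [Relationship(dan, "LOVES", ann)]

# Properties are flexible: we can add one to a single node at any time.
dan.properties["height_cm"] = 180  # made-up value


def outgoing(node, relationships):
    """Return the relationships that start at the given node."""
    return [r for r in relationships if r.start is node]


for rel in outgoing(dan, relationships):
    print(rel.start.properties["name"], rel.type, rel.end.properties["name"])
```

Because the relationship is directed, `outgoing(ann, relationships)` returns an empty list: nothing in the data says that Ann loves Dan.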
This is another beauty of graph databases: making use of objects. What is an object? We can view the world as a collection of objects, such as people, birds and cars, and each object has its own properties: a car has a horsepower property, but a person doesn’t. We use these objects as our nodes, and this is how we store the majority of our data.
This is an abstract way of thinking about data, and very different from the usual relational databases, where everything is stored in tables and there is a strict, rigid schema that must be adhered to. Graph databases, on the other hand, are a more flexible and adaptable tool.
NoSQL databases
We cannot discuss graph databases without discussing NoSQL databases, because graph databases are a type of NoSQL database. NoSQL stands for “not only SQL” or “non-SQL”. It emerged as a more flexible alternative to traditional SQL databases, because companies need to manage large volumes of data at high speed and scale up quickly to run modern web applications.
In this era of growth in cloud, big data, and mobile and web applications, NoSQL databases provide that speed and scalability. Many NoSQL databases are also designed for horizontal scaling, which basically means handling more data and traffic by adding more machines and distributing the workload across them, rather than by upgrading a single server.
Why even bother?
Graph databases have many real-life applications because of their ability to traverse hierarchies in data, find hidden connections, and uncover relationships between items. As our data continues to grow in both volume and the number of connections that may reveal new information, graph databases are designed to cope with these highly connected data sets, whereas relational databases tend to become slow and clunky for this kind of use case, because every extra hop between records typically requires another join.
Some real-life use cases are:
- Social media recommendations
- Fraud detection
- Network management
- Knowledge graphs for AI
- Supply chain efficiency
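To give a feel for the first use case, here is a toy "people you may know" recommender in Python. It is a sketch of the general friend-of-a-friend idea under simple assumptions, not how any particular social network works, and the names and connections in `friends` are invented.

```python
from collections import Counter

# Hypothetical social graph: person -> set of people they are connected to
friends = {
    "alice": {"bob", "carol"},
    "bob": {"alice", "dave"},
    "carol": {"alice", "dave", "erin"},
    "dave": {"bob", "carol"},
    "erin": {"carol"},
}


def recommend(graph, person):
    """Suggest friends-of-friends, ranked by number of mutual friends."""
    direct = graph.get(person, set())
    counts = Counter()
    for friend in direct:
        for candidate in graph.get(friend, set()):
            # Skip the person themselves and people they already know
            if candidate != person and candidate not in direct:
                counts[candidate] += 1
    # Most mutual friends first; ties broken alphabetically
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


suggestions = recommend(friends, "alice")
print(suggestions)  # [('dave', 2), ('erin', 1)]
```

The whole recommendation is a two-hop walk over the graph: follow alice's edges, then follow her friends' edges. A graph database is built to answer exactly this kind of "hop along the relationships" query directly.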