Introduction
In this article I will give an introduction to the internals of Apache AGE, specifically how the VLE (Variable Length Expressions) works.
Prerequisites
A basic knowledge of PostgreSQL or any other Relational Database Management System (RDBMS), including an understanding of tables, schemas, and commonly used SQL queries such as SELECT, INSERT, DELETE, and UPDATE, is essential. Additionally, proficiency in JSON (JavaScript Object Notation) is also necessary.
How Variable-Length Expressions (VLE) Works
VLE allows users to express variable-length patterns in a graph, which can be useful for a wide range of applications, such as recommendation systems, fraud detection, and social network analysis.
The VLE function in Apache AGE is responsible for loading the relevant parts of the graph into memory and executing the path-finding algorithm. This algorithm is a graph traversal algorithm that works on a graph, not a tree. It searches for all paths of any length from all vertices to all vertices, which can result in a high number of computations, especially for large graphs. The speed of the algorithm is directly related to the size and connectedness of the graph and what it is looking for.
For instance, look at the following queries:
MATCH p=(u)-[*]-(v) RETURN p
MATCH (u)-[*]-(v) RETURN u
MATCH ()-[*]-() RETURN COUNT(1);
All of these queries will basically do the same thing - search for all paths of any length from all vertices to all vertices. This means that for a graph of 1 million nodes, this algorithm will be invoked up to 1 trillion times (the number of VxV combinations). Now, add that you are asking for all of the paths between each combination. So, each additional path length can add to the work done.
To minimize the workload, it is best to limit the algorithm's search scope by restricting where and how deeply it needs to look. This means restricting where it needs to look and how deeply it needs to look. Just because you restrict the input vertices doesn't mean that you restrict the interior vertices. The algorithm doesn't care about them, it only cares about interior edges.
It is worth mentioning that the VLE function in Apache AGE generates all valid paths from a start vertex and leaves the match against the terminal vertex to Postgres, due to implementation factors.
Apache AGE repository: https://github.com/apache/age
Apache AGE website: https://age.apache.org
Top comments (0)