DEV Community

Ahmed Hisham
Ahmed Hisham

Posted on

Comprehensive Guide On Using (Count) Aggregation Function In Apache AGE

This blog assumes you already familiar with Apache AGE extension which is an extension for PostgreSQL that adds graph database functionalities to relational database. I chose Count function since it is a bit tricky to a lot of developers, so by the end of this blog you should feel comfortable using it. We will start out with a simple example, then moving to a hard one.

Setting up a database

First we need to create a test database test_db and create age extension for it:

postgres=# CREATE DATABASE test_db;
CREATE DATABASE
postgres=# \c test_db

test_db=# CREATE EXTENSION age;
CREATE EXTENSION
test_db=# LOAD 'age';
LOAD

Enter fullscreen mode Exit fullscreen mode

data setup from docs:

SET search_path TO ag_catalog;

SELECT create_graph('graph_name');

SELECT * FROM cypher('graph_name', $$
    CREATE (a:Person {name: 'A', age: 13}),
    (b:Person {name: 'B', age: 33, eyes: "blue"}),
    (c:Person {name: 'C', age: 44, eyes: "blue"}),
    (d1:Person {name: 'D', eyes: "brown"}),
    (d2:Person {name: 'D'}),
    (a)-[:KNOWS]->(b),
    (a)-[:KNOWS]->(c),
    (a)-[:KNOWS]->(d1),
    (b)-[:KNOWS]->(d2),
    (c)-[:KNOWS]->(d2)
$$) as (a agtype);
Enter fullscreen mode Exit fullscreen mode

Simple example

Now let's start with a simple example to show the basic use of Count function:

SELECT *
FROM cypher('graph_name', $$
    MATCH (n)
    RETURN count(*)
$$) as (total agtype);
Enter fullscreen mode Exit fullscreen mode

In this query MATCH will fetch all vertices in the graph, and then by the RETURN count(*) part we inform the query to return the count of all vertices (***** means all here). You should already be familiar with that, the result we will get in our example will be:

 total
-------
 5
(1 row)
Enter fullscreen mode Exit fullscreen mode

Harder example

Let's move to more serious example, don't be shocked with what you will see, we will explain everything, these are the required data setup:

SELECT * FROM cypher('graph_name', $$
CREATE (:L {a: 1, b: 2, c: 3}),
       (:L {a: 2, b: 3, c: 1}),
       (:L {a: 3, b: 1, c: 2})
$$) as (a agtype);
Enter fullscreen mode Exit fullscreen mode

Now let's run the following query (we will explain it down below):

SELECT * FROM cypher('graph_name', $$
    MATCH (x:L)
    RETURN (x.a + x.b + x.c) + count(*) + count(*), x.a + x.b + x.c
$$) as (count agtype, key agtype);
Enter fullscreen mode Exit fullscreen mode

you should get the following output:

 count | key
-------+-----
 12    | 6
(1 row)
Enter fullscreen mode Exit fullscreen mode

Let's explain What happened here, first you need to understand the concept of grouping key, grouping key is what count function will exactly count, x.a + x.b + x.c is the grouping key in this example. Let's see how see how this works in detail

RETURN (x.a + x.b + x.c) + count(*) + count(*), x.a + x.b + x.c
Enter fullscreen mode Exit fullscreen mode

as x.a + x.b + x.c is the grouping key, we need to find how many groups it makes, according to the vertices we created above
(:L {a: 1, b: 2, c: 3}),
(:L {a: 2, b: 3, c: 1}),
(:L {a: 3, b: 1, c: 2})

in all cases (1 + 2 + 3), (2 + 3 + 1) or (3 + 1 + 2) they all equal 6 so they fall into the same group, which is one group: 6, now count function will count how many 6 has showed up, it finds that 6 appeared 3 times, so count(*)=3. So if we computed left column we will get 12:
(x.a + x.b + x.c) + count(*) + count(*) =
(1 + 2 + 3) + 3 + 3 = 12

that's why we got the output:

 count | key
-------+-----
 12    | 6
(1 row)
Enter fullscreen mode Exit fullscreen mode

the right part represent the groups that can be formed which will be only one group (6), and the left part yields to 12 for that group!.

congratulations!, by now you should have much deeper understanding of how count function with grouping keys.

Top comments (0)