DEV Community

Franck Pachot for Yugabyte

Posted on

Learn how to write SQL recursive CTE in 5 steps

You don't need a Graph database to follow a hierarchy of nodes. SQL can do joins, self-join, and even joins to its previous result, iteratively. This WITH RECURSIVE clause is often called "Recursive CTE". This might require a bit of abstraction for the developers used to procedural languages. Here is how I explain it by joining "manually" the two first levels, without recursion, and then switching to "recursive" to go further. Starting with the first two levels helps to validate the logic easily.

(0) Example: create the EMP table

I'll create the oldest sample table: EMP for employees with a self-reference to the manager:

CREATE TABLE emp (empno, ename,    job,        mgr,   hiredate,     sal, comm, deptno, email, other_info) as values
(7369, 'SMITH',  'CLERK',     7902, '1980-12-17',  800, NULL,   20,'SMITH@acme.com', '{"skills":["accounting"]}'),
(7499, 'ALLEN',  'SALESMAN',  7698, '1981-02-20', 1600,  300,   30,'ALLEN@acme.com', null),
(7521, 'WARD',   'SALESMAN',  7698, '1981-02-22', 1250,  500,   30,'WARD@compuserve.com', null),
(7566, 'JONES',  'MANAGER',   7839, '1981-04-02', 2975, NULL,   20,'JONES@gmail.com', null),
(7654, 'MARTIN', 'SALESMAN',  7698, '1981-09-28', 1250, 1400,   30,'MARTIN@acme.com', null),
(7698, 'BLAKE',  'MANAGER',   7839, '1981-05-01', 2850, NULL,   30,'BLAKE@hotmail.com', null),
(7782, 'CLARK',  'MANAGER',   7839, '1981-06-09', 2450, NULL,   10,'CLARK@acme.com', '{"skills":["C","C++","SQL"]}'),
(7788, 'SCOTT',  'ANALYST',   7566, '1982-12-09', 3000, NULL,   20,'SCOTT@acme.com', '{"cat":"tiger"}'),
(7839, 'KING',   'PRESIDENT', NULL, '1981-11-17', 5000, NULL,   10,'KING@aol.com', null),
(7844, 'TURNER', 'SALESMAN',  7698, '1981-09-08', 1500,    0,   30,'TURNER@acme.com', null),
(7876, 'ADAMS',  'CLERK',     7788, '1983-01-12', 1100, NULL,   20,'ADAMS@acme.org', null),
(7900, 'JAMES',  'CLERK',     7698, '1981-12-03',  950, NULL,   30,'JAMES@acme.org', null),
(7902, 'FORD',   'ANALYST',   7566, '1981-12-03', 3000, NULL,   20,'FORD@acme.com', '{"skills":["SQL","CQL"]}'),
(7934, 'MILLER', 'CLERK',     7782, '1982-01-23', 1300, NULL,   10,'MILLER@acme.com', null)
;

Enter fullscreen mode Exit fullscreen mode

The self reference from mgr to empno can be declared as:

ALTER TABLE emp ADD PRIMARY KEY (empno)
;
ALTER TABLE emp ADD FOREIGN KEY (mgr) REFERENCES emp (empno)
;
Enter fullscreen mode Exit fullscreen mode

This is not needed but it ensures the correctness of that that you query so that you can focus on your business logic rather than consistency test.

(1) find the root (level 0)

The first step is to identify the root of the graph. Do you want to start by the employees and show their manager, or start by the mangager and show its employees? I'll do the latter. The root of the hierarchy is the one with no manager (where mgr is null)

-- (1) query to find the root
select emp.empno,emp.ename,emp.mgr
from emp where mgr is null
;
Enter fullscreen mode Exit fullscreen mode

The result is simple:

yugabyte-> ;

 empno | ename | mgr
-------+-------+-----
  7839 | KING  |
(1 row)

yugabyte=>
Enter fullscreen mode Exit fullscreen mode

(2) define this level 0 in the WITH clause

I'll put it in a CTE (Common Table Expression), labelled as level0 and I select from it:

-- (2) define the root as as level 0 in a CTE
with recursive
level0 as (
-- (1) query to find the root
select emp.empno,emp.ename,emp.mgr
from emp where mgr is null
) 
-- (2) query this level
select * from level0
;
Enter fullscreen mode Exit fullscreen mode

The result is the same. The top manager is KING.

yugabyte-> ;

 empno | ename | mgr
-------+-------+-----
  7839 | KING  |
(1 row)

yugabyte=>
Enter fullscreen mode Exit fullscreen mode

I'm just building the blocks one by one. And I've left the comments to make it clear what is taken from the previous step and what is added. I already add RECURSIVE in the WITH clause but for the moment, as there are no self-reference, there will be only one iteration.

(3) define the join to the next level

Now that my level 0 is defined, I query the emp table again, to get the next level, with a join to level 0 to get the manager name:

-- (2) define the root as as level 0 in a CTE
with recursive
level0 as (
-- (1) query to find the root
select emp.empno,emp.ename,emp.mgr
from emp where mgr is null
) 
-- (3) join employees with level 0
select emp.empno,emp.ename,emp.mgr
,mgr.ename mgr_name
from emp
join level0 mgr on mgr.empno=emp.mgr
;
Enter fullscreen mode Exit fullscreen mode

I have labelled the level 0 as mgr because it will be the manager of this new level of employee, for which I keep the table name emp. Then the join on mgr.empno=emp.mgr links the two levels. In the join result, I have added mgr.ename as mgr_name in order to see the name of the manager in the same line.

So here here are the employees referring to KING:

yugabyte-> ;

 empno | ename | mgr  | mgr_name
------------+-------+------+----------
  7698 | BLAKE | 7839 | KING
  7566 | JONES | 7839 | KING
  7782 | CLARK | 7839 | KING
(3 rows)

yugabyte=>
Enter fullscreen mode Exit fullscreen mode

(4) concatenate the two levels with UNION

Showing the manager name was one step, but to show the whole hierarchy, we need to concatenate the rows from all levels. I start with my two first levels. I need a UNION ALL. For this, move my join query in the WITH clause and label it as leveln. I could have named it level1 because it joins with level0 but the goal is to make it an abstraction for the next level. Because I'll UNION ALL the two, they need the same columns. leveln has the manager name as mgr_name so I add this, as an empty string, when querying level0

-- (2) define the root as as level 0 in a CTE
with recursive
level0 as (
-- (1) query to find the root
select emp.empno,emp.ename,emp.mgr
from emp where mgr is null
),
-- (4) put the second level in the WITH clause
leveln as (
-- (3) join employees with level 0
select emp.empno,emp.ename,emp.mgr
,mgr.ename mgr_name
from emp
join level0 mgr on mgr.empno=emp.mgr
)
-- (4) concatenate the two levels
select *,'' mgr_name from level0
union all
select *             from leveln
;
Enter fullscreen mode Exit fullscreen mode

This starts to look good, here are the two levels together:

yugabyte-> ;

 empno | ename | mgr  | mgr_name
------------+-------+------+----------
  7839 | KING  |      |
  7698 | BLAKE | 7839 | KING
  7566 | JONES | 7839 | KING
  7782 | CLARK | 7839 | KING
(4 rows)

yugabyte=>
Enter fullscreen mode Exit fullscreen mode

This is where you can validate your result. The next step will be automatic to add recursion.

(5) get it in one recursive WITH clause

Now that I have validated the result for the two first levels, it is time to get it recursive. Instead of joining to level0 I'll join to leveln which is the previous result of itself. And I move the UNION ALL into this leveln CTE.

-- (2) define the root as as level 0 in a CTE
with recursive
level0 as (
-- (1) query to find the root
select emp.empno,emp.ename,emp.mgr
from emp where mgr is null
),
-- (4) put the second level in the WITH clause
leveln as (
-- (5) put the UNION ALL in the recursive level
-- (4) concatenate the two levels
select *,'' mgr_name from level0
union all
-- (3) join employees with level 0
select emp.empno,emp.ename,emp.mgr
,mgr.ename mgr_name
from emp
-- (5) change level0 to leveln to join iteratively
join leveln mgr on mgr.empno=emp.mgr
)
-- (5) query this level n
select *             from leveln
;
Enter fullscreen mode Exit fullscreen mode

If you have validated the join with the first two level, this last transformation is automatic. At execution time, the first iteration will get the first level and the second part of the UNION ALL will have no rows because of the join with the previous result, which is empty. The next iteration will add the result of the join. And again until the join returns no rows because we are at a leaf, with the previous result, taken as a list of managers, has no employees.

yugabyte-> ;

 empno | ename  | mgr  | mgr_name
------------+--------+------+----------
  7839 | KING   |      |
  7698 | BLAKE  | 7839 | KING
  7566 | JONES  | 7839 | KING
  7782 | CLARK  | 7839 | KING
  7788 | SCOTT  | 7566 | JONES
  7902 | FORD   | 7566 | JONES
  7844 | TURNER | 7698 | BLAKE
  7499 | ALLEN  | 7698 | BLAKE
  7521 | WARD   | 7698 | BLAKE
  7654 | MARTIN | 7698 | BLAKE
  7900 | JAMES  | 7698 | BLAKE
  7934 | MILLER | 7782 | CLARK
  7876 | ADAMS  | 7788 | SCOTT
  7369 | SMITH  | 7902 | FORD
(14 rows)

yugabyte=>
Enter fullscreen mode Exit fullscreen mode

This is where the magic is. If you carefully test your first two levels, all others will come by transforming the query to be recursive, iterating by joining on its previous result. Note that I've left the first level as a level0 CTE. You can put it into the leveln but I think it is more readable like this.

In the execution plan, you see it as Recursive union

                                                              QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------
 CTE Scan on leveln  (cost=20.23..22.86 rows=131 width=72) (actual time=0.023..0.281 rows=14 loops=1)
   CTE leveln
     ->  Recursive Union  (cost=0.00..20.23 rows=131 width=46) (actual time=0.021..0.269 rows=14 loops=1)
           ->  Seq Scan on emp  (cost=0.00..1.14 rows=1 width=46) (actual time=0.019..0.021 rows=1 loops=1)
                 Filter: (mgr IS NULL)
           ->  Hash Join  (cost=0.33..1.65 rows=13 width=46) (actual time=0.054..0.058 rows=3 loops=4)
                 Hash Cond: (emp_1.mgr = mgr.empno)
                 ->  Seq Scan on emp emp_1  (cost=0.00..1.14 rows=14 width=14) (actual time=0.002..0.003 rows=14 loops=4)
                 ->  Hash  (cost=0.20..0.20 rows=10 width=36) (actual time=0.044..0.044 rows=4 loops=4)
                       ->  WorkTable Scan on leveln mgr  (cost=0.00..0.20 rows=10 width=36) (actual time=0.001..0.001 rows=4 loops=4)
Enter fullscreen mode Exit fullscreen mode

The number of loops tells you how many time the second part was executed. All this runs the same in PostgreSQL and YugabyteDB.

Discussion (8)

Collapse
geraldew profile image
geraldew

Always good to see articles describing how recursive queries work - disclaimer, I've also written one here on Dev (but it wasn't intended as an introduction to the concept).

With recursion, there's always a need to be sure that the looping will end. I'm not seeing anything here to be certain of that - other than it being known to be true for the example data. There are various possible solutions (I have a my own preferred method) so maybe that would make a good follow-up piece.

Collapse
alfiqmiq profile image
Marek • Edited on

Always good to include hints in query (example for MySQL):
WITH
...
SELECT /*+MAX_EXECUTION_TIME(1000) SET_VAR(CTE_MAX_RECURSION_DEPTH=100)*/
...

Collapse
detzam profile image
webstuff

this one is quite interesting. But its not PL SQL, is it?

Collapse
markusnetz profile image
Markus Netz

No, it is standard sql

Collapse
michaeldoshea profile image
/* Michael D O'Shea */

It is Postgres.

In the Oracle flavour of SQL (not PL/SQL as you wrote, that is something very different) I wrote this up a few years ago strychnine.co.uk/2017/08/postgresq... I rabbit on a bit in there but the difference between Oracle and Postgres at the time was far from dramatic, at least for this functionality.

Thread Thread
franckpachot profile image
Franck Pachot Author • Edited on

Michael, to be fair, Oracle had CONNECT BY from the initial version (seen in 2.3 User Guide from 1981 🤓), many years before WITH RECURSIVE came into the SQL standard. They were late to the ANSI standard but early on the feature

Thread Thread
michaeldoshea profile image
/* Michael D O'Shea */

Yeah I get it Frank, it's like DECODE (now implemented CASE WHEN ... END), NVL and its friends, the (+) for outer joins and so on, and all remain in the implementation as Oracle incorporated some features/functions in-part early on and the ANSI/ISO standards have since "caught up". Oracle hasn't for the most part had language feature breaking changes either, so all this junk DNA remains in their flagship product (well the removal of WM_CONCAT was a breaking change, but it was an undocumented feature as an interim approach to LISTAGG so doesn't really count).

Perhaps another way of looking at all of this is that Oracle were leading the way winning the awards at the beginning, but now a bit of a dinosaur and not winning the awards now. Younger spunky database implementations don't carry the baggage of these early RDBMS implementations. I include Postgres in this category, and its forks (eg. Netezza), that implement CTE's in a more standardised way.

Even under the bonnet Oracle can be a little sad too, as you know. I think it was Oracle 9 (maybe 9.2 before it worked) that Oracle implemented ANSI joins. When you look at a 10046 trace file however, you can see the ANSI SQL is rewritten back to the proprietary format, some sort of internal transpiler shoehorned in Oracle 9, and two semantically equivalent queries (one using ANSI joins and one using (+)), both hinted ..... well the ANSI code often loses the hints in the rewrite and the execution plan doesn't adhere to the "rule" (yeah yeah, hints are rules, we both understand this too).

On the subject of execution hints, and Postgres, well it's just not there and the (non) implementation defended wiki.postgresql.org/wiki/Optimizer... It will be interesting to see how this changes in the future, if it changes. If it changes, someone will be writing a nice blog article on it too. I will keep my eyes peeled :-)

Thread Thread
franckpachot profile image
Franck Pachot Author

Yes, ANSI joins introduced 20 years ago and still many bugs 🤷🏼‍♂️
About pg_hint_plan, at in YugabyteDB we have it by default. I see no reason not to allow easy troubleshooting / workaround.