DEV Community

John Wakaba
John Wakaba

Posted on • Edited on

Using CTEs for SQL Data Analysis

A CTE is a temporary result set in SQL that can be referred to within a SELECT, INSERT, UPDATE, DELETE query.

They improve readabaility and maintenability of queries and allow you to break down complex queries into modular components.

Defined using WITH keyword followed by the CTE name and the query that generates the result set.

WITH cte_name AS(
    SELECT column1,column2
    FROM table_name
    WHERE conditions
)
SELECT *
FROM cte_name


Enter fullscreen mode Exit fullscreen mode

Recursive CTEs

They are crucial for querying hierarchical data.

WITH RECURSIVE cte_name AS (
    SELECT column1,column2
    FROM table_name
    WHERE base_condition
    UNION ALL
    SELECT column1,column2
    FROM table_name
    JOIN cte_name ON condition_to_recurse
)
SELECT *
FROM cte_name;
Enter fullscreen mode Exit fullscreen mode

Total order quantity for each customer

WITH total_order_quantity AS (
 SELECT o.customer_id, SUM(o.quantity) AS total_quantity
 FROM orders o
 GROUP BY o.customer_id
)
SELECT c.first_name, c.second_name, tq.total_quantity
FROM customers c
JOIN total_order_quantity tq ON c.customer_id = tq.customer_id
ORDER BY tq.total_quantity DESC;

Enter fullscreen mode Exit fullscreen mode

Ranking customers based on total quantity

Using ROW_NUMBER()

WITH customer_rankings AS (
 SELECT o.customer_id, SUM(o.quantity) AS total_quantity,
 ROW_NUMBER() OVER (ORDER BY SUM(o.quantity) DESC) AS rank
 FROM orders o
 GROUP BY o.customer_id
)
SELECT c.first_name, c.second_name, cr.total_quantity, cr.rank
FROM customers c
JOIN customer_rankings cr ON c.customer_id = cr.customer_id
ORDER BY cr.rank;
Enter fullscreen mode Exit fullscreen mode

CTEs can be used for

  1. Aggregation (SUM, COUNT etc..)

  2. Ranking data (ROW_NUMBER, RANK)

  3. Recursive queries

  4. Filtering based on grouped conditions.

Final Thoughts

CTEs are an excellent tool for data analysts and database developers, offering a clean, reusable, and understandable way to manage complex SQL queries. They provide a balance between simplicity and functionality, making SQL-based data analysis more efficient.

Top comments (0)