
Brent Ochieng


Understanding Common Table Expressions (CTEs) in SQL — And Why They Matter in Data Analysis

If you’ve spent enough time writing SQL queries, you’ve probably found yourself staring at a long, repetitive, hard-to-read block of code. Maybe you copied the same subquery three times. Maybe you nested queries inside queries until your brain begged for mercy.

This is where Common Table Expressions, better known as CTEs, come in. They offer a simple way to write cleaner, more readable, and more maintainable SQL. And for data analysts, learning CTEs is one of those skills that instantly upgrades the quality of your work.

What Exactly Is a CTE?

A Common Table Expression (CTE) is a temporary, named result set that exists only for the duration of a single SQL statement.

Think of it like creating a mini-table, just for a moment, to help you break down a query into logical steps.

It always starts with the keyword:

WITH


A simple example:

WITH top_customers AS (
    SELECT customer_id, SUM(amount) AS total_spent
    FROM payments
    GROUP BY customer_id
)
SELECT *
FROM top_customers
WHERE total_spent > 1000;


Here, top_customers is not a permanent table. It’s just a temporary result that makes the final query easier to manage.
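If you want to see the "single statement" scope for yourself, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows are made up for illustration, and other databases word the error differently, but the behavior is the same: the CTE name is gone as soon as the statement finishes.

```python
import sqlite3

# Hypothetical in-memory data; table and column names follow the example above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payments (customer_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO payments VALUES (?, ?)",
    [(1, 800.0), (1, 400.0), (2, 300.0)],
)

# Inside the statement that defines it, the CTE behaves like a table.
big_spenders = conn.execute("""
    WITH top_customers AS (
        SELECT customer_id, SUM(amount) AS total_spent
        FROM payments
        GROUP BY customer_id
    )
    SELECT customer_id, total_spent
    FROM top_customers
    WHERE total_spent > 1000
""").fetchall()
print(big_spenders)

# A separate statement cannot see the CTE, because it was never stored.
try:
    conn.execute("SELECT * FROM top_customers")
    cte_survived = True
except sqlite3.OperationalError as exc:
    cte_survived = False
    print("second statement failed:", exc)
```

Only customer 1 crosses the 1000 threshold here, and the second query fails because top_customers no longer exists.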

Why Use a CTE?

CTEs shine in three major areas:

1. Readability

A long query becomes easier to understand when broken down into smaller steps.

2. Reusability

You can reference the CTE multiple times within the same query without repeating code.

3. Maintainability

If you need to modify something, you change it once inside the CTE instead of updating multiple subqueries.

For analysts who read and maintain SQL regularly, these benefits are huge.
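To make the reusability point concrete, here is a small sketch that references one CTE twice in the same query, once for the current month and once for the previous month. The sales table, its month column, and the numbers are assumptions invented for this example, and it runs on SQLite through Python's sqlite3 module.

```python
import sqlite3

# Hypothetical table: one row per sale, with the month stored as a number.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (month INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [(1, 100), (1, 50), (2, 200), (3, 120), (3, 60)],
)

month_over_month = conn.execute("""
    WITH monthly AS (
        SELECT month, SUM(amount) AS revenue
        FROM sales
        GROUP BY month
    )
    SELECT cur.month,
           cur.revenue,
           cur.revenue - prev.revenue AS change
    FROM monthly AS cur                    -- first reference to the CTE
    JOIN monthly AS prev                   -- second reference, same definition
      ON prev.month = cur.month - 1
    ORDER BY cur.month
""").fetchall()
print(month_over_month)
```

Without the CTE, the GROUP BY logic would have to be written out twice, and any later change to it would have to be made in both copies.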

How CTEs Are Used

You can use a CTE for almost anything, but here are the most common use cases:

1. Simplifying Complex Queries

Sometimes, your logic requires several transformations. With CTEs, you can approach SQL like a workflow: Step 1, Step 2, Step 3.

WITH cleaned_data AS (
    SELECT 
        order_id,
        customer_id,
        amount,
        DATE(order_date) AS order_date
    FROM orders
),
aggregated AS (
    SELECT 
        customer_id, 
        SUM(amount) AS total_spent
    FROM cleaned_data
    GROUP BY customer_id
)
SELECT *
FROM aggregated
ORDER BY total_spent DESC;


Each section becomes easier to explain and debug.
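One practical way to debug step by step is to keep the WITH clause as it is and swap only the final SELECT, so you can look at any intermediate step. The sketch below does that with the query above on SQLite via Python's sqlite3 module; the three order rows are made-up sample data.

```python
import sqlite3

# Hypothetical sample rows for the orders table used above.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER, customer_id INTEGER, amount INTEGER, order_date TEXT)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (1, 10, 250, "2024-01-05 09:30:00"),
        (2, 10, 100, "2024-01-06 14:00:00"),
        (3, 20, 400, "2024-01-06 18:45:00"),
    ],
)

# The shared WITH clause: every step of the workflow, defined once.
steps = """
    WITH cleaned_data AS (
        SELECT order_id, customer_id, amount, DATE(order_date) AS order_date
        FROM orders
    ),
    aggregated AS (
        SELECT customer_id, SUM(amount) AS total_spent
        FROM cleaned_data
        GROUP BY customer_id
    )
"""

# Debugging: point the final SELECT at step 1 to check the cleaned rows.
step1 = conn.execute(
    steps + "SELECT * FROM cleaned_data ORDER BY order_id"
).fetchall()

# Normal run: point the final SELECT at the last step.
final = conn.execute(
    steps + "SELECT * FROM aggregated ORDER BY total_spent DESC"
).fetchall()

print(step1)
print(final)
```

If the totals look wrong, checking step1 first tells you whether the problem is in the cleaning step or in the aggregation.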

2. Avoiding Repeated Subqueries

Without a CTE, you might write the same subquery multiple times. With a CTE, you define it once and refer to it by name:

WITH avg_sales AS (
    SELECT AVG(amount) AS avg_amount
    FROM sales
)
SELECT *
FROM sales
WHERE amount > (SELECT avg_amount FROM avg_sales);


Instead of writing the same calculation repeatedly, you define it once in the CTE. Whether the database also computes it only once depends on the engine and its optimizer.
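The difference is easier to see when the average is needed in more than one place. The sketch below, again on SQLite via Python's sqlite3 module with made-up sales amounts, writes the same report both ways and checks that the results match. The above_by column is an addition for this example only.

```python
import sqlite3

# Hypothetical sample data for the sales table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (amount INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?)", [(100,), (200,), (300,), (400,)])

# Without a CTE: the same subquery is written out three times.
repeated = conn.execute("""
    SELECT amount,
           (SELECT AVG(amount) FROM sales) AS avg_amount,
           amount - (SELECT AVG(amount) FROM sales) AS above_by
    FROM sales
    WHERE amount > (SELECT AVG(amount) FROM sales)
    ORDER BY amount
""").fetchall()

# With a CTE: the average is defined once and referred to by name.
with_cte = conn.execute("""
    WITH avg_sales AS (
        SELECT AVG(amount) AS avg_amount
        FROM sales
    )
    SELECT s.amount,
           a.avg_amount,
           s.amount - a.avg_amount AS above_by
    FROM sales AS s
    CROSS JOIN avg_sales AS a      -- avg_sales has exactly one row
    WHERE s.amount > a.avg_amount
    ORDER BY s.amount
""").fetchall()

print(with_cte)
```

Both versions return the same rows, but only the second one has a single place to edit if the definition of "average" changes, for example to exclude refunds.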

3. Creating Recursive Queries (Hierarchies)

This is one of the most powerful SQL features: recursive CTEs.

They're used for:

  • Employee-management hierarchies
  • Folder/subfolder structures
  • Generating sequences or running totals

Example:

WITH RECURSIVE hierarchy AS (
    SELECT 
        employee_id,
        manager_id,
        1 AS level
    FROM employees
    WHERE manager_id IS NULL

    UNION ALL

    SELECT 
        e.employee_id,
        e.manager_id,
        h.level + 1
    FROM employees e
    JOIN hierarchy h ON e.manager_id = h.employee_id
)
SELECT *
FROM hierarchy
ORDER BY level;


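This walks through an entire organizational structure using recursion. The first SELECT (the anchor) picks the employees with no manager. The SELECT after UNION ALL then repeatedly joins in the direct reports of the rows found so far, adding 1 to level each time, until no new rows appear. Note that PostgreSQL and MySQL require the RECURSIVE keyword, while SQL Server uses plain WITH for recursive CTEs.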
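To watch the recursion run, here is a small sketch on SQLite via Python's sqlite3 module. It first generates a short number sequence, then runs the hierarchy query above against four made-up employees. The seq name and the sample rows are assumptions for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# 1) Generating a sequence: the simplest recursive CTE.
numbers = [n for (n,) in conn.execute("""
    WITH RECURSIVE seq(n) AS (
        SELECT 1                 -- anchor: the starting row
        UNION ALL
        SELECT n + 1 FROM seq    -- recursive step: build on the previous row
        WHERE n < 5              -- stopping condition, so the recursion ends
    )
    SELECT n FROM seq ORDER BY n
""")]
print(numbers)

# 2) The employee hierarchy, with hypothetical sample rows:
#    employee 1 is the top, 2 and 3 report to 1, and 4 reports to 2.
conn.execute("CREATE TABLE employees (employee_id INTEGER, manager_id INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [(1, None), (2, 1), (3, 1), (4, 2)],
)

levels = conn.execute("""
    WITH RECURSIVE hierarchy AS (
        SELECT employee_id, manager_id, 1 AS level
        FROM employees
        WHERE manager_id IS NULL

        UNION ALL

        SELECT e.employee_id, e.manager_id, h.level + 1
        FROM employees e
        JOIN hierarchy h ON e.manager_id = h.employee_id
    )
    SELECT employee_id, level
    FROM hierarchy
    ORDER BY level, employee_id
""").fetchall()
print(levels)
```

The WHERE n < 5 line matters: a recursive step with no stopping condition keeps producing rows until the database gives up or hits a limit.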

4. Preparing Data for Visualization or Reporting

Most dashboards and reports need data that has been:

  • Cleaned
  • Aggregated
  • Filtered
  • Joined

CTEs help create step-by-step logic, which makes transformations clear and transparent.
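Here is one way those four steps could look as a single query, with one CTE per step. This is only a sketch: the customers and orders tables, the region column, and the revenue threshold of 100 are all invented for the example, and it runs on SQLite via Python's sqlite3 module.

```python
import sqlite3

# Hypothetical source tables for a small revenue-by-region report.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id INTEGER, region TEXT)")
conn.execute("CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "East"), (2, "West"), (3, "East")],
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 1, 100), (2, 1, None), (3, 2, 50), (4, 3, 200), (5, 2, 30)],
)

report = conn.execute("""
    WITH cleaned AS (              -- Cleaned: drop orders with no amount
        SELECT order_id, customer_id, amount
        FROM orders
        WHERE amount IS NOT NULL
    ),
    joined AS (                    -- Joined: attach each customer's region
        SELECT c.region, cl.amount
        FROM cleaned AS cl
        JOIN customers AS c ON c.customer_id = cl.customer_id
    ),
    aggregated AS (                -- Aggregated: one row per region
        SELECT region, SUM(amount) AS revenue, COUNT(*) AS order_count
        FROM joined
        GROUP BY region
    )
    SELECT region, revenue, order_count
    FROM aggregated
    WHERE revenue >= 100           -- Filtered: keep only the larger regions
    ORDER BY revenue DESC
""").fetchall()
print(report)
```

Because each step has a name, a teammate reading the dashboard query can see which rows were dropped and where the region came from without untangling nested subqueries.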

When Should You Use a CTE?

CTEs are ideal when:

  • Your SQL query is becoming long or unreadable
  • You need to reuse the same result in multiple places
  • You want to logically break the query into stages
  • You’re calculating something step-by-step
  • You need recursion
  • You want to make your SQL explainable to teammates, managers, or future you

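They may not be the best choice when performance is extremely critical, since some databases materialize a CTE (compute it in full as a temporary result before the rest of the query runs) while others inline it like an ordinary subquery. PostgreSQL, for example, always materialized CTEs before version 12. If a CTE-based query is slow, check the execution plan. But for most daily analytical tasks, CTEs are perfectly efficient.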

Why CTEs Are Important in Data Analysis

In data analysis, clarity is everything. It’s not enough to get the right answer—your SQL needs to be readable, reproducible, and trustworthy.

CTEs help analysts:

✔ Build cleaner data pipelines
✔ Avoid errors from duplicated logic
✔ Explain logic clearly to stakeholders
✔ Work iteratively and transform data step by step
✔ Write analysis queries that can easily be revisited later

SQL can be intimidating when queries become too complex. CTEs allow you to think in layers, much like chaining DataFrame transformations in Python or stacking applied steps in Power BI.

This makes your analysis process more structured and your results more reliable.

Final Thoughts

Learning CTEs is one of the fastest ways to become a better SQL writer. They make your queries more organized, more readable, and far easier to debug.

Whether you’re analyzing sales performance, building dashboards, or preparing datasets for machine learning, CTEs give you the clarity and structure you need to work efficiently.

If you’re moving into professional data work, mastering CTEs isn’t optional—it’s essential.
