DEV Community

Cover image for Understanding Subqueries and CTEs in SQL
Tom Chege
Tom Chege

Posted on

Understanding Subqueries and CTEs in SQL

After getting comfortable with joins and window functions, I’m now diving into more advanced query structuring techniques.

Subqueries felt like a natural extension of filtering and aggregation. CTEs, on the other hand, introduced a cleaner way of breaking down complex logic into readable, step-by-step components.

This guide provides a practical comparison of both concepts using a simple student exam results dataset, highlighting when and why to choose one over the other.


What is a Subquery?

A subquery is a query nested inside another query. It runs first, and its result is used by the outer (main) query.

Subqueries can be placed in:

  • SELECT clause
  • FROM clause
  • WHERE or HAVING clause

Why Use Subqueries?

Subqueries are useful when you need to:

  • Perform intermediate calculations
  • Apply dynamic filtering
  • Break down complex logic without creating temporary tables
  • Keep queries self-contained

1. Subqueries in the WHERE Clause

This is the most common use case.

Example: Students scoring above the overall average

SELECT 
    student_id,
    marks
FROM exam_results
WHERE marks > (
    SELECT AVG(marks)
    FROM exam_results
);
Enter fullscreen mode Exit fullscreen mode

What’s happening:

  • The inner query calculates the average marks across all students.
  • The outer query filters for students who scored above that average.

Subqueries in the WHERE clause with IN and EXISTS

Using IN (for value comparison)

SELECT student_id
FROM exam_results
WHERE student_id IN (
    SELECT student_id
    FROM exam_results
    WHERE marks > 80
);
Enter fullscreen mode Exit fullscreen mode

Using EXISTS (for row existence check)

SELECT DISTINCT e1.student_id
FROM exam_results e1
WHERE EXISTS (
    SELECT 1
    FROM exam_results e2
    WHERE e2.student_id = e1.student_id
    AND e2.marks > 80
);
Enter fullscreen mode Exit fullscreen mode

Key Difference:

  • IN → Compares values against a list
  • EXISTS → Checks whether at least one matching row exists (usually more efficient with large datasets)

2. Subqueries in the SELECT Clause

Used to compute additional values for each row.

Example: Show each student’s marks alongside their personal average

SELECT 
    e1.student_id,
    e1.marks,
    (
        SELECT AVG(e2.marks)
        FROM exam_results e2
        WHERE e2.student_id = e1.student_id
    ) AS student_avg
FROM exam_results e1;
Enter fullscreen mode Exit fullscreen mode

Note: This is a correlated subquery — it runs once for every row in the outer query, which can impact performance on large tables.


3. Subqueries in the FROM Clause (Derived Tables)

Creates a temporary result set that can be treated like a table.

Example: Average marks per student

SELECT *
FROM (
    SELECT 
        student_id,
        AVG(marks) AS avg_marks
    FROM exam_results
    GROUP BY student_id
) AS student_avg;
Enter fullscreen mode Exit fullscreen mode

Important: Derived tables must have an alias.


Types of Subqueries

  • Scalar Subquery: Returns a single value
  • Column Subquery: Returns multiple rows (usually one column)
  • Correlated Subquery: References columns from the outer query and runs for each row

What is a CTE (Common Table Expression)?

A Common Table Expression (CTE) is a temporary named result set defined using the WITH clause. It makes queries more readable by breaking them into logical steps.

Think of it as creating a named mini-table that you can reference later in the same query.

Why Use CTEs?

  • Improved Readability — Break complex logic into named, logical steps instead of deeply nested subqueries
  • Reusability — Define a result set once and reference it multiple times in the same query
  • Easier Debugging — You can SELECT * from any CTE individually to test and verify intermediate results
  • Better Organization — Complex queries become a series of clear building blocks rather than one giant nested statement
  • Logical Flow — You can build your query step by step, making it easier to understand the thought process
  • Reduced Repetition — Avoid repeating the same subquery logic multiple times

CTEs are especially powerful when you need to perform multiple transformations or aggregations on the same dataset before joining everything together.

Basic CTE Example: Student averages

WITH student_avg AS (
    SELECT 
        student_id,
        AVG(marks) AS avg_marks
    FROM exam_results
    GROUP BY student_id
)
SELECT *
FROM student_avg;
Enter fullscreen mode Exit fullscreen mode

Types and Use Cases of CTEs

1. Simple CTE

WITH high_scores AS (
    SELECT *
    FROM exam_results
    WHERE marks > 80
)
SELECT * FROM high_scores;
Enter fullscreen mode Exit fullscreen mode

2. Multiple CTEs

WITH student_avg AS (
    SELECT 
        student_id,
        AVG(marks) AS avg_marks
    FROM exam_results
    GROUP BY student_id
),
top_students AS (
    SELECT *
    FROM student_avg
    WHERE avg_marks > 85
)
SELECT * FROM top_students;
Enter fullscreen mode Exit fullscreen mode

3. Recursive CTE

Used for hierarchical or recursive data (e.g., organizational charts, bill of materials). Advanced topic — not ccovered in this article.


Subqueries vs CTEs

Aspect Subqueries CTEs (Common Table Expressions)
Readability Can become hard to read when deeply nested Much cleaner and more structured
Reusability Logic must be repeated if used multiple times Defined once and can be reused multiple times
Performance Simple subqueries are fast; correlated ones can be slow Often better with modern optimizers; depends on usage
Debugging More difficult Easier (you can SELECT from each CTE separately)
Scope Limited to the immediate query part Available throughout the entire query

When to Use Each

Use Case Recommended Choice Reason
Simple filtering or single value Subquery Quick and concise
Complex, multi-step logic CTE Better organization and readability
Reusable intermediate results CTE Defined once, used multiple times
Row-by-row comparisons Correlated Subquery Runs for each row and can reference outer query values
Readability & maintainability CTE Easier to read, debug, and maintain

Conclusion

Both subqueries and CTEs help break down complex SQL problems, but they do so in different ways:

  • Subqueries are great for quick, inline logic that fits naturally inside another clause.
  • CTEs shine when your query becomes more complex and you want maximum readability and maintainability.

As you progress, the goal shifts from “writing queries that work” to “writing queries that are clear, efficient, and easy to maintain.”

Happy querying!

Top comments (0)