Introduction
SQL is one of the most important tools in data analysis because it allows analysts to retrieve, transform and analyze data directly from relational databases. Often, we will use basic queries but then as the datasets we are using grow, we will require advanced querying techniques like subqueries and Common Table Expressions (CTEs).
In this article, we are going to learn:
- What subqueries are and how they work
- Types of subqueries
- When to use subqueries
- What CTEs are
- Comparison between subqueries and CTEs
Subquery
A subquery is a query that is written inside another SQL query. It is also called an inner query or a nested query.
How it works
First, the inner query executes, then the result is passed to the outer query and finally the outer query uses that result to filter rows, apply aggregate functions like SUM, update data or delete rows based on conditions.
For example
This query will:
- Group customers by city
- Count how many customers each city has
- Select the city with the highest count
SELECT city
FROM customers
GROUP BY city
ORDER BY COUNT(*) DESC
LIMIT 1;
Result
Types of subqueries
To understand the different types of subqueries, we are going to use two tables: customers and orders table:
1. Single-Row Subquery
This subquery returns only one value
For example
To display customers who spent more than the average order amount
SELECT name
FROM customers
WHERE customer_id IN (
SELECT customer_id
FROM orders
GROUP BY customer_id
HAVING SUM(amount) > (
SELECT AVG(total_spent)
FROM (
SELECT SUM(amount) AS total_spent
FROM orders
GROUP BY customer_id
) AS customer_totals
)
);
Result:
2. Multi-Row Subquery
This subquery returns more than one value
For example
To display customers who have placed orders
SELECT name
FROM customers
WHERE customer_id IN (
SELECT customer_id
FROM orders
);
Result
3. Correlated Subquery
This is a subquery that depends on the outer query for its values
For example:
To display customers whose order is above their own average order value:
SELECT c.name, o.amount
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id
WHERE o.amount > (
SELECT AVG(o2.amount)
FROM orders o2
WHERE o2.customer_id = o.customer_id
);
When to use Subqueries
- When filtering based on aggregated data
- When breaking down complex logic
- When performing dynamic comparisons
Common Table Expression (CTE)
A Common Table Expressions (CTE) is a temporary result set in SQL that you can reference within a single query.
It only exists during query execution.
For example
To calculate the total spending per customer:
WITH customer_spending AS (
SELECT
customer_id,
SUM(amount) AS total_spent
FROM orders
GROUP BY customer_id
)
SELECT
c.name,
cs.total_spent
FROM customers c
JOIN customer_spending cs
ON c.customer_id = cs.customer_id;
Uses of CTEs
CTEs are widely used because they help simplify and organize complex SQL queries. They:
- Improve readability as each step is clearly separated
- Handle hierarchical data
- Enable step by step transformations
Limitations of CTEs
- Temporary, a CTE only works while the query is running.
- They can be slower on very large datasets.
Subqueries vs CTEs
Here is a comparison of CTEs and Subqueries:
| Feature | Subqueries | *CTEs * |
|---|---|---|
| Readability | Low when nested | High |
| Reusability | Limited | High |
| Performance | Can be slower | Often optimized better |
| Best for | Simple filtering | Complex logic |
Conclusion
Subqueries and CTEs are powerful SQL tools that a good analyst should master. Subqueries are useful for quick embedded logic while CTEs provide clarity and scalability.







Top comments (0)