DEV Community

Cover image for Sub-queries vs Window Functions vs Common Table Expressions: Which Should You Use in SQL?
Braeson Nyahera
Braeson Nyahera

Posted on

Sub-queries vs Window Functions vs Common Table Expressions: Which Should You Use in SQL?

SQL is the most used tool for data manipulation but what happens when there are some concepts that seem to work towards the same ultimate outputs. As queries become more complex, developers often encounter different techniques that appear to produce similar results—namely sub-queries, common table expressions (CTEs), and window functions.
At first glance, these concepts can seem interchangeable. You might solve the same problem using any of them, which raises a natural question: which one is better?
The goal is not to choose one, but to understand when each is the right tool.
In this article I will discuss about sub-queries, window functions and common table expressions;

Sub-Queries
These are queries inside other queries and are executed either before or alongside the outer query. They can return scalar value, row or table.

SELECT *
FROM employees e
WHERE salary = (
    SELECT MAX(salary)
    FROM employees
    WHERE department = e.department
);
Enter fullscreen mode Exit fullscreen mode

Window functions
These are used when you are trying to preserve your rows. Used together with components such as - OVER(),PARTITION BY,ORDER BY and common functions such as - SUM(),AVG(),RANK(),LAG(),LEAD()

SELECT employee_id,first_name, last_name,salary, job_title, department
FROM (
  SELECT *,
    RANK() OVER(PARTITION BY department ORDER BY salary DESC) AS rnk
  FROM employees
) t
WHERE rnk = 1;
Enter fullscreen mode Exit fullscreen mode

Common Table Expressions (CTEs)
These are temporary results set that are initialized using -WITH. They are often used to improve readability and modularity of scripts.

WITH max_salary AS (
    SELECT department, MAX(salary) AS max_sal
    FROM employees
    GROUP BY department
)
SELECT e.*
FROM employees e
JOIN max_salary m
    ON e.department = m.department
WHERE e.salary = m.max_sal;
Enter fullscreen mode Exit fullscreen mode

All these features of SQL are used to perform advanced data querying and transformations. They can all be tweaked to perform almost similar queries but it all depends on what the final output is expected to be.

When to use what
Each of the features has an instance when it is the most convenient over the others. 

  • For instance when it is a simple filtering such as retaining all those above average marks then a sub-query is the best option.
  • When we are more focused on row level analytics then the window functions come in handy for that task.
  • In a situation where it is a multi-step logic and re-usability is a core factor the common table expressions are best suited for use.

Using in collaboration
The best thing about this features is that they can be used together for the best solution. This helps in the data transformation where there are steps within it in which either is considered to be superior over the other.
Here is an example where all of them are used together:

with ranked AS (
    SELECT 
        *,
        AVG(salary) OVER (PARTITION BY department) AS avg_salary
    FROM employees
)
SELECT *
FROM ranked
WHERE 
    salary > avg_salary
    AND salary > (
        SELECT AVG(salary) FROM employees
    );
Enter fullscreen mode Exit fullscreen mode

Best Practices
These are some of the recommended practices when working with these SQL features:

  • It is best to avoid deep nesting of queries by using CTEs which help in the structuring of the script for easy readability.
  • When using CTEs it is necessary to name them well so that their purpose can be identified easily without having to read all the code snippets manually.
  • Window functions are often more efficient than sub queries in large datasets, so where possible it is best to use window functions over sub queries but in instances of small datasets any can be used depending on preference.

Conclusion
These are to be considered more as complementary tools than competing tools. In most advanced queries you will have to use them together or interchangeably to get to the results. 
What makes a major difference is how you use them as they can heavily impact the performance. It is best to master all of them and understand the strength and weaknesses of each so that when it get to a point of picking one or more you do it from a point of information.

Top comments (0)