DEV Community

Cover image for Learning SQL JOIN and Window Functions
Benta Okoth
Benta Okoth

Posted on

Learning SQL JOIN and Window Functions

SQL JOINS

  • JOIN clause is used to combine rows from two or more tables, based on a related column between them
  • To use joins we need to have a primary key and a foreign key
  • A Primary Key - uniquely identifies each row in a table. It must contain unique values. Cannot contain NULL. Only one primary key per table (but it can be made of multiple columns – composite key)

  • A Foreign Key is a column that references the Primary Key of another table. It creates a relationship between tables. Can contain duplicate values. Can sometimes contain NULL. Ensures referential integrity (you cannot reference a non-existing record)

  • We are going to use the two below tables to explain the concept of join.

select * from employees e;

select * from departments d ;

There are different types of JOINs in SQL:

(i). (INNER) JOIN: Returns only rows that have matching values in both tables

select e.name, d.department_name
from employees e
inner join departments d on e.department_id = d.department_id;

(ii). LEFT (OUTER) JOIN: Returns all rows from the left table, and only the matched rows from the right table

  • Example: all departments and only the employees with department assigned to them.

select e.name, d.department_name
from departments d
left join employees e on d.department_id = e.department_id;

(iii). RIGHT (OUTER) JOIN: Returns all rows from the right table, and only the matched rows from the left table

  • Example: employees and all departments(Empty departments) The table that comes after the word right join

select e.name, d.department_name
from employees e
right join departments d on e.department_id = d.department_id;

(iv). FULL (OUTER) JOIN: Returns all rows when there is a match in either the left or right table

select e.name, d.department_name
from employees e
full outer join departments d on e.department_id=d.department_id;

(v). CROSS JOIN: Returns every combination of rows from both tables
-Example - Every employee with every project ie if one employee has 3 project we shall have the name repeated thrice and the projects will differ
select e.name, p.project_name
from employees e
cross join projects p;

(vi). SELF JOIN :- Is a regular join, but the table is joined with itself.

  • Example -Show employees and their managers select e.name as employee, m.name as manager from employees e join employees m on e.manager_id = m.employee_id;

(vii). NATURAL JOIN - it automatically joins all tables using columns that have the same names

  • department_id will be the unifying factor(all the rows and all the columns from both tables)

select *
from employees e
natural join departments d ;

## WINDOW FUNCTIONS

  • A window function in SQL is a function that performs a calculation across a set of rows that are related to the current row, without collapsing those rows into a single result.

  • This is the key difference from GROUP BY.
    GROUP BY → combines rows and returns one row per group.
    Window functions → keep every row but add a calculated value based on other rows in the "window".

#### Ranking window functions

a. The RANK() function - The RANK() function assigns a rank to each row based on the order specified within the window. Rows with the same values receive the same rank, and the next rank is skipped.

select title, price, rank() over (order by price desc) as price_rank
from books;

b.The DENSE_RANK() function - The DENSE_RANK() function operates similarly to the RANK() function except it does not skip any ranks even if rows have the same values.

select title, price, dense_rank() over (order by price desc) as price_rank
from books;

c.The ROW_NUMBER() function - The ROW_NUMBER() function assigns a unique sequential number to each row within a partition, regardless of the column values. It makes sure that no two rows can have the same row number within a division.

d.The NTILE() function - divides sorted partitions into n-number of equal groups. Each row in a partition is assigned a group number.
select c.first_name, c.last_name,
sum(o.quantity) as total_quantity,
ntile(3) over (order by sum(o.quantity) desc) as quantity_tile
from orders o
join customer c on o.customer_id = c.customer_id
group by c.first_name, c.last_name;

## The Analytical Window Functions

The LEAD() function - The LEAD(column, n) function allows access of a value within a column from the
following nth-row relative to the current row. It is the counterpart of the LAG() function. The syntax for lead and lag are the same

select o.order_id, o.customer_id, o.quantity,
lag(o.quantity) over ( order by o.order_id ) as next_quantity
from orders o;

The LAST_VALUE() function- The LAST_VALUE() function allows the retrieval of the value of a column from the last row
within a window frame. Operates similar to FIRST_VALUE()

SELECT
Start_date,
Department,
First_name,
LAST_VALUE(First_name) OVER (
ORDER BY Start_date) AS Last_employee
FROM
Employee
ORDER BY
Department;
Enter fullscreen mode Exit fullscreen mode

The Aggregate Window Function

SELECT
    c.first_name,
    c.last_name,
    o.quantity,
    SUM(o.quantity) OVER (PARTITION BY c.customer_id) AS totalquant_
FROM
    orders o
INNER JOIN
    customer c ON o.customer_id = c.customer_id;
Enter fullscreen mode Exit fullscreen mode


select o.customer_id, c.first_name, c.last_name, sum(o.quantity) as total_quantity,
avg(b.price) as avg_price,
sum(o.quantity) over (partition by o.customer_id) as total_order_quantity
from customer c 
join orders o on c.customer_id = o.customer_id
join books b on o.book_id = b.book_id
group by c.first_name, c.last_name, o.quantity, o.customer_id;
Enter fullscreen mode Exit fullscreen mode
SELECT
Department,
First_name,
Salary,
MIN(Salary) OVER (
PARTITION BY Department) AS Min_salary
FROM
Employee
ORDER BY
Department;
Enter fullscreen mode Exit fullscreen mode

Top comments (0)