DEV Community

Stacy Omwoyo
Stacy Omwoyo

Posted on

JOINS AND WINDOW FUNCTIONS ON POSTGRESQL.

INRODUCTION

In relational databases such as PostgreSQL, data is often stored across multiple related tables. To extract meaningful insights that will be key in decision-making, we must combine and analyze this data efficiently. The two most powerful ways to go about it are;

  1. Using Joins- combines data from multiple tables
  2. Using Window Functions- performs calculations across rows related to the current row without collapsing the result set.

Understanding how these two concepts work is fundamental for data analysis, backend developpers and database engineers.

PART 1: JOINS IN PostgreSQL

What is a Join?

A join combines rows from two or more tables based on related columns between them. One table will contain a primary key while the other contains a foreign key. The values in said keys are identical, which creates the correlation.

There are various types of joins:

  1. Inner Join
  2. Right Join
  3. Left Join
  4. Natural Join
  5. Cross Join
  6. Self Join

1.INNER JOIN
Returns data from matching rows in both tables. Used when you only want records that exist in both tables.

select c.first_name, c.last_name, o.quantity 
from clients c
INNER JOIN orders o
on c.customer_id = o.customer_id; 
Enter fullscreen mode Exit fullscreen mode

2.RIGHT JOIN
Returns all the rows from the right table and matching rows from the left table. If no match exists, NULL is returned.

select c.first_name, c.last_name, o.quantity
from clients c 
RIGHT JOIN orders o 
on c.customer_id= o.customer_id ;
Enter fullscreen mode Exit fullscreen mode

3.LEFT JOIN
Opposite of right join; returns all rows from the left table.

select c.first_name, c.last_name, o.quantity
from clients c 
LEFT JOIN orders o 
on c.customer_id= o.customer_id ;
Enter fullscreen mode Exit fullscreen mode

4.FULL OUTER JOIN
Returns all rows when there is a match on either left or right table.

select c.first_name, c.last_name, o.quantity
from clients c 
FULL OUTER JOIN orders o 
on c.customer_id= o.customer_id ;
Enter fullscreen mode Exit fullscreen mode

5.SELF JOIN
Joins a table to itself.

select e.name as Employee, m.name as manager
from employees e
inner join employees m on e.manager_id =m.employee_id;
Enter fullscreen mode Exit fullscreen mode

6.CROSS JOIN
Can join more than two tables and return data from all the selected tables. This is very common in real-world business settings.

select b.title, o.quantity 
from books b 
INNER join orders o 
on b.book_id= o.book_id ;
Enter fullscreen mode Exit fullscreen mode

7.NATURAL JOIN
Joins data from columns with the same name.

select * from books 
natural join orders;
Enter fullscreen mode Exit fullscreen mode

PART 2: WINDOW FUNCTIONS IN POSTGRESQL.

What are Window Functions in PostgreSQL?

Window Functions perform calculations across a set of table rows related to the current row. Unlike the GROUP BY, they don't collapse rows. This is achieved by the use of the OVER() clause.

Window functions include the following:
1.ROW_NUMBER()
Assigns a unique number to each row. It is very useful in ranking and pagination.

SELECT name,
       ROW_NUMBER() OVER (ORDER BY amount DESC) AS row_num
FROM orders;
Enter fullscreen mode Exit fullscreen mode

2.RANK()
Assigns numbers to values in a group of values with a similar feature. It skips numbers when duplicates appear.

SELECT name,
       RANK() OVER (ORDER BY amount DESC) AS rank
FROM orders;
Enter fullscreen mode Exit fullscreen mode

3.DENSE_RANK()
Similar to the Rank function, only it does not skip duplicates when assigning numbers.

SELECT name,
       DENSE_RANK() OVER (ORDER BY amount DESC) AS rank
FROM orders;
Enter fullscreen mode Exit fullscreen mode

4.PARTITION BY()
It divides rows into groups before applying the function.

SELECT customer_id,
       amount,
       RANK() OVER (PARTITION BY customer_id ORDER BY amount DESC) AS rank_per_customer
FROM orders;
Enter fullscreen mode Exit fullscreen mode

5.RUNNING TOTAL (Cumulative Sum)
Calculates the cumulative total per row as desired.

SELECT customer_id,
       order_date,
       SUM(amount) OVER (
           PARTITION BY customer_id
           ORDER BY order_date
       ) AS running_total
FROM orders;
Enter fullscreen mode Exit fullscreen mode

6.LAG() and LEAD()
Used to compare rows. They are very useful in trend analysis as they work hand in hand. LAG looks backward, while LEAD looks forward.

SELECT customer_id,
       amount,
       LAG(amount) OVER (PARTITION BY customer_id ORDER BY order_date) AS previous_order
FROM orders;
Enter fullscreen mode Exit fullscreen mode

IT IS IMPORTANT TO NOTE THE FOLLOWING WHILE USING WINDOW FUNCTIONS
1.Ensure you use the PARTITION BY function carefully, failure to do so results in the whole table being treated as one group.
2.Check the ORDER BY function, as it controls the order of calculations in the window function.
3.Optimize performance; window functions can be slow on large datasets; use indexes if needed.

COMBINING JOINS AND WINDOW FUNCTIONS

In real-world analysis, both are often used together.
We can come up with a query that joins tables, aggregates totals and applies ranking in one go.

SELECT name,
       SUM(amount) AS total_spent,
       RANK() OVER (ORDER BY SUM(amount) DESC) AS spending_rank
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id
GROUP BY name;
Enter fullscreen mode Exit fullscreen mode

CONCLUSION

Joins and Window Functions are foundational tools in PostgreSQL. Where Joins allow you to merge related data across tables and Window Functions allow advanced analytical calculations while preserving row-level detail.

A mastery in both gurantees expertise in
-Sales Analysis
-Customer Ranking
-Time-series Analysis
-Business Intelligence reporting
-Data Engineering Workflows

Together they transform raw relational data into actionable insights that will come in handy when it comes to decision-making.

Top comments (0)