DEV Community

Wangeci Ndovu
Wangeci Ndovu

Posted on

SQL Joins and Window Functions: The Difference Between Combining Data and Analyzing It

Let us talk about joins and windows functions in sql

Joins combine tables

Window Functions analyze data without collapsing it

Many beginners confuse the two. Let’s break them down properly step y step with real examples, clear explanations, and practical queries

Part 1- SQL Joins — Combining Data Across Tables

Imagine you have two tables:

customers

customer_id | first_name|second_name
1           | Alice     |Johnson
2           | Bob       |Njagi
Enter fullscreen mode Exit fullscreen mode

orders

order_id| customer_id |Order_amount
1       | 1           | 250
2       | 1           | 300
3       | 2           | 150
Enter fullscreen mode Exit fullscreen mode

If you want to see who made which order, you need a ##JOIN##.

What is a JOIN?

A JOIN allows you to combine rows from two or more tables based on a related column.

In simple terms:

A JOIN connects tables using a common key.

INNER JOIN

Returns only matching rows.

SELECT customer_id, order_id,
order_amount,
FROM customers,
INNER JOIN orders,
ON customer_id = customer_id;
Enter fullscreen mode Exit fullscreen mode

Result:

Alice | 1 | 250
Alice | 2 | 300
Bob   | 3 | 150
Enter fullscreen mode Exit fullscreen mode

If there’s no match, the row is excluded

LEFT JOIN

Returns all rows from the left table, even if there’s no match.

SELECT customer_name,
order_id,orders_amount,
FROM customers,
LEFT JOIN orders,
ON customer_id = customer_id;
Enter fullscreen mode Exit fullscreen mode

If a customer has no orders, they still appear with NULL values for order columns.

RIGHT JOIN

Opposite of LEFT JOIN returns all rows from the right table.

FULL OUTER JOIN

Returns all rows from both tables matched where possible.

Key Insight About Joins

Joins increase columns.

They bring data from multiple tables into a single result set.

They do NOT calculate ranking, running totals, or row by row analytics.

That’s where Window Functions come in.

Part 2- Window Functions Analyzing Without Collapsing Data

Window functions are different.

They:

  • Do NOT reduce rows (unlike GROUP BY)

  • Perform calculations across related rows

  • Allow row-level analytics

This is extremely important.

Example question

What if we want:

Total spending per customer, but still show each individual order?

If you use GROUP BY:

SELECT customer_id,
SUM(amount) AS total_spent,
FROM orders
GROUP BY customer_id;
Enter fullscreen mode Exit fullscreen mode

You get

1 | 550
2 | 150
Enter fullscreen mode Exit fullscreen mode

But you lose individual orders.

Enter Window Functions

SELECT order_id,customer_id,amount,
SUM(amount) OVER (PARTITION BY customer_id) 
AS total_spent
FROM orders;
Enter fullscreen mode Exit fullscreen mode

Result:

1 | 1 | 250 | 550
2 | 1 | 300 | 550
3 | 2 | 150 | 150
Enter fullscreen mode Exit fullscreen mode

Now you have

  • Each order

  • AND total per customer

  • Without collapsing rows

That’s a better way to do it.

Understanding OVER()

The magic happens inside the OVER() clause.

PARTITION BY

Groups rows logically (like GROUP BY), but does not collapse them.

ORDER BY

Defines order within each partition(basically how you want them show).

Example: Ranking orders by amount.

SELECT order_id, customer_id,amount,
RANK() OVER (PARTITION BY customer_id ORDER BY amount DESC) 
AS customer_rank,
FROM orders;
Enter fullscreen mode Exit fullscreen mode

This ranks each customer’s orders separately.

Common Window Functions You Should Know

  • ROW_NUMBER()

Gives unique row numbers.

ROW_NUMBER() OVER (ORDER BY amount DESC)
Enter fullscreen mode Exit fullscreen mode
  • RANK()

Gives same rank for ties, skips numbers.

  • DENSE_RANK()

Gives same rank for ties, does NOT skip numbers.

  • SUM() OVER()

Running totals

SELECT order_id, amount,
SUM(amount) OVER (ORDER BY order_id) 
AS running_total
FROM orders;

Enter fullscreen mode Exit fullscreen mode
  • LAG() and LEAD()

Compare rows to the ones before or the ones after.

SELECT order_id, amount,
LAG(amount) OVER (ORDER BY order_id) 
AS previous_amount
FROM orders;
Enter fullscreen mode Exit fullscreen mode

Very useful for time-series analysis

Joins vs Window Functions

The Real Difference

Here’s the more clearer distinction

Joins

  • Combine tables
  • Increase columns
  • Used to bring related data
  • Based on keys

Window Functions

  • Analyze rows within a table
  • Add calculated insights
  • collapsing rows
  • Based on partitions and order

When Should You Use Each?

Use JOIN when-

  • You need data from multiple tables

  • You’re connecting facts and dimensions

  • You’re building analytical datasets

Use Window Functions when-

  • You need ranking

  • You need running totals

  • You need comparisons between rows

-You want aggregates without GROUP BY

In everyday analytics and data engineering, you often use BOTH together.

Example

SELECT customer_name, order_id,order_amount,
SUM(order_amount) OVER (PARTITION BY customer_id) AS total_spent
FROM customers
JOIN orders
ON customer_id = customer_id;
Enter fullscreen mode Exit fullscreen mode

This combines tables AND applies analytics

That’s production-level SQL.

In conclusion

Joins connect tables using keys.

Window functions perform analytics without collapsing rows.

GROUP BY reduces rows, window functions do not.

PARTITION BY is like GROUP BY, but keeps detail rows.

Modern data work heavily relies on window functions.

If you’re serious about becoming strong in SQL especially as a Data Engineer mastering both concepts is non-negotiable.

you can check more of my articles on https://www.linkedin.com/in/thomas-wangeci-065469194/

Top comments (0)