DEV Community

Mohamed Idris
Mohamed Idris

Posted on

Part 06: JOINs, Connecting Tables

This post is part of the SQL: Zero to Ninja series.

You run a report and it shows user_id: 7 next to an order. Useful, but your boss wants the user's name, not a number. The name lives in a different table. So how do you pull two tables together into one answer? That is what a JOIN does, and once it clicks, half of SQL opens up.

The idea in one line

A JOIN matches rows from two tables using a shared id, so you can read facts from both in a single result.

Why are tables split up at all?

Look at our schema. A user has a name, email, and country. An order has a total and a status. Why not just stuff the user's name into every order row?

Because you would repeat it. If Sara places 50 orders, you would write "Sara" 50 times. Change her name once and you would have to fix 50 rows. Messy and easy to break.

So instead, the user lives once in users, and each order just holds a tiny pointer back to her:

users                      orders
-----                      ------
id | name                  id | user_id | total
 1 | Sara                   1 |    1     |  50
 2 | Omar                   2 |    1     |  30   <-- user_id 1 = Sara again
                            3 |    2     |  90   <-- user_id 2 = Omar
Enter fullscreen mode Exit fullscreen mode

That user_id column is the string that ties them together. orders.user_id points at users.id. A JOIN is just you following that string.

The metaphor: matching name tags between two guest lists

Imagine a party with two clipboards. List A is "people who RSVP'd." List B is "people who paid." Each person wears a name tag with an id.

To find who both RSVP'd and paid, you walk down List A, and for each person you flip to List B looking for the same id. When the tags match, you staple their two rows together. That stapled row is one row in your result.

The big question (same as the laundry and notebook posts): what do you do with people who appear on only one list? Your answer is the whole difference between join types.

INNER JOIN: only the matches

INNER JOIN keeps only rows that have a match on both sides. No match, no row.

SELECT users.name, orders.total
FROM users
INNER JOIN orders
  ON users.id = orders.user_id;
Enter fullscreen mode Exit fullscreen mode

The ON part is the matching rule: "an order belongs to a user when orders.user_id equals users.id."

users.id = orders.user_id
   |               |
   1  --- matches --- 1   keep
   1  --- matches --- 1   keep
   2  --- matches --- 2   keep
   3  --- no order ---     dropped (no match)
Enter fullscreen mode Exit fullscreen mode

So a user with no orders simply does not show up. Great for "show me users with orders," bad for "show me users without orders."

LEFT JOIN: keep everyone on the left

LEFT JOIN keeps all rows from the left table, even when the right side has no match. The empty spots get filled with NULL.

SELECT users.name, orders.total
FROM users
LEFT JOIN orders
  ON users.id = orders.user_id;
Enter fullscreen mode Exit fullscreen mode
name | total
-----+------
Sara |  50
Sara |  30
Omar |  90
Lina | NULL   <-- kept, even with no orders
Enter fullscreen mode Exit fullscreen mode

This is your tool for "find what is missing." Want users who never ordered? LEFT JOIN, then keep the NULL rows. We go deep on this in the bonus post understanding-sql-left-join.md, so check that out if you want the full story.

RIGHT JOIN and FULL OUTER JOIN (the short version)

RIGHT JOIN is just LEFT JOIN with the tables flipped. It keeps everything on the right. Honestly, most people pick LEFT and reorder their tables instead of thinking backwards. Keep your brain calm and stick with LEFT.

FULL OUTER JOIN keeps both sides, matched where it can, NULL where it cannot. Heads up: MySQL does not have it (Postgres does). In MySQL you fake it by combining a LEFT and a RIGHT with UNION.

Forget the ON and boom: a cartesian explosion

If you join two tables but forget the ON rule, the database does not error. It does something scary instead: it pairs every left row with every right row.

-- DANGER: no ON
SELECT users.name, orders.total
FROM users
JOIN orders;
Enter fullscreen mode Exit fullscreen mode

10 users and 1,000 orders gives you 10,000 rows of nonsense. That is a cartesian product, and it is almost never what you want. Always give your join an ON.

Joining three tables: what did each user buy?

Real questions often cross more than two tables. "What products did each user buy?" needs four: users -> orders -> order_items -> products. You just chain the joins, following the ids like stepping stones.

SELECT users.name, products.name AS product
FROM users
JOIN orders       ON users.id = orders.user_id
JOIN order_items  ON orders.id = order_items.order_id
JOIN products     ON order_items.product_id = products.id;
Enter fullscreen mode Exit fullscreen mode
users --(user_id)--> orders --(order_id)--> order_items --(product_id)--> products
Enter fullscreen mode Exit fullscreen mode

Each ON is one hop. Follow the trail and you can answer almost anything about the whole schema.

Gotchas

  1. Forgetting ON. No ON means a cartesian explosion. If your result has way too many rows, this is suspect number one.

  2. Ambiguous column names. Both users and products have a name column. If you just write SELECT name, the database does not know which one. Use users.name or give it an alias:

SELECT u.name AS user_name, p.name AS product_name
FROM users u
JOIN orders o      ON u.id = o.user_id
JOIN order_items i ON o.id = i.order_id
JOIN products p    ON i.product_id = p.id;
Enter fullscreen mode Exit fullscreen mode

Short aliases (u, o, p) keep long queries readable.

  1. Surprise duplicate rows. A one-to-many join repeats the left row once per match. If Sara has 3 orders, her name shows up 3 times. That is correct, not a bug, but it surprises people. If you only want one row per user, combine joins with GROUP BY from Part 05.

Recap

  • Tables are split so data is not repeated. Ids tie them back together.
  • INNER JOIN keeps only matching rows on both sides.
  • LEFT JOIN keeps every left row, filling missing matches with NULL.
  • RIGHT JOIN is LEFT flipped. FULL OUTER JOIN keeps both sides (no MySQL).
  • The ON rule is the match condition. Forget it and you get a cartesian explosion.
  • Chain joins to walk across many tables, one ON hop at a time.

Your turn

Using the schema, write a query that lists every user's name next to the names of the products they bought. Then ask yourself: which join keeps users who bought nothing, INNER or LEFT? If you can explain it to a friend, you have got it.

Next up: Part 07: INSERT, UPDATE, DELETE, where we stop just reading data and start changing it.

Top comments (1)

Collapse
 
edriso profile image
Mohamed Idris

Part 06 Practice: JOINs

Let's connect some tables. These use the shared schema (users, orders, products, order_items). Try each one before looking, and remember the key question for every join: who gets to go missing?


1. Show each order's total next to the buyer's name. (INNER JOIN)

Only orders that actually have a matching user.

Solution

SELECT users.name, orders.total
FROM users
INNER JOIN orders
  ON users.id = orders.user_id;
Enter fullscreen mode Exit fullscreen mode

Why: INNER JOIN keeps only rows that match on both sides. The ON rule links orders.user_id to users.id.


2. List every user and how much each order was, keeping users who never ordered. (LEFT JOIN)

Users with no orders should still appear.

Solution

SELECT users.name, orders.total
FROM users
LEFT JOIN orders
  ON users.id = orders.user_id;
Enter fullscreen mode Exit fullscreen mode

Why: LEFT JOIN keeps all rows from the left table (users). Users with no orders show up with NULL in the total column.


3. Find users who have never placed an order. (LEFT JOIN + IS NULL)

Show only the names of users with zero orders.

Solution

SELECT users.name
FROM users
LEFT JOIN orders
  ON users.id = orders.user_id
WHERE orders.id IS NULL;
Enter fullscreen mode Exit fullscreen mode

Why: LEFT JOIN keeps everyone, then WHERE orders.id IS NULL keeps only the rows where no order matched. This is the classic "find what is missing" trick.


4. Show each order's total with the buyer's name, using table aliases. (aliases)

Same as question 1, but use short aliases to keep it clean.

Solution

SELECT u.name, o.total
FROM users u
INNER JOIN orders o
  ON u.id = o.user_id;
Enter fullscreen mode Exit fullscreen mode

Why: Aliases (u, o) save typing and make long join queries easier to read. They matter even more once you join three or four tables.


5. What products did each user buy? (3-table join)

List each user's name next to the product names they purchased. Chain users to orders to order_items to products.

Solution

SELECT u.name AS user_name, p.name AS product_name
FROM users u
JOIN orders o      ON u.id = o.user_id
JOIN order_items i ON o.id = i.order_id
JOIN products p    ON i.product_id = p.id;
Enter fullscreen mode Exit fullscreen mode

Why: Each ON is one hop along the ids. Aliasing the two name columns avoids the ambiguous column name error, since both users and products have a name.


6. How many distinct products has each user bought? (join + GROUP BY)

Show each user's name and the count of different products they purchased, most varied buyers first.

Solution

SELECT u.name, COUNT(DISTINCT i.product_id) AS products_bought
FROM users u
JOIN orders o      ON u.id = o.user_id
JOIN order_items i ON o.id = i.order_id
GROUP BY u.id, u.name
ORDER BY products_bought DESC;
Enter fullscreen mode Exit fullscreen mode

Why: One-to-many joins create repeated rows, so you bring in GROUP BY from Part 05 to fold them back into one row per user. COUNT(DISTINCT ...) makes sure buying the same product twice only counts once.


Great job. JOINs are the moment SQL goes from "one table" to "the whole database." Keep practicing the "who can go missing" question and you will pick the right join every time. See you in Part 07.