DEV Community

Cover image for A (very) Short Look at How MySQL Joins Work
Seena Sabti
Seena Sabti

Posted on

A (very) Short Look at How MySQL Joins Work

MySQL uses Nested-Loop Join (NLJ) algorithms to evaluate joins. Simply put, it takes one row from the first table, then scans the next for matches. Add a third table and it loops again. In the worst case, you face O(n²) time complexity 😳. There is good news though, built in optimizations keep real-world performance much lower.

So How Does It Actually Work?

Imagine you have 3 tables that you are joining: t1, t

SELECT *
FROM t1
JOIN t2 ON t1.t2_id = t2.id
JOIN t3 ON t2.t3_id = t3.id;
Enter fullscreen mode Exit fullscreen mode

The pseudocode for the NLJ algorithm is roughly the following:

for each row in t1
    for each t2 matching t1.t2_id
        for each t3 matching t2.t3_id
            return row if join condition is met
Enter fullscreen mode Exit fullscreen mode

Each inner loop runs for every row of the outer loop. You can see how this could get out of hand fast if your tables are big, No WHERE clause, no indexes? Good luck.

How Does MySQL Optimize the NLJ Algorithm?

MySQL uses what the docs call "condition pushdown": the relevant parts of the where clause will be evaluated at each loop level. Meaning, the where condition of t2 will be evaluated on the t2 loop alongside the join condition and so on:

SELECT *
FROM t1
JOIN t2 ON t1.t2_id = t2.id
JOIN t3 ON t2.t3_id = t3.id
WHERE t2.status = 'active';
Enter fullscreen mode Exit fullscreen mode

Here, MySQL won’t join everything and then filter by t2.status. It will filter t2.status = 'active' while it’s looping through t2 and will only pass those resulting rows into the t3 check.

Indexes Save the Day (Most of the Time)

This is why adding selective indexes is so important for optimization, especially with joins. Indexes help MySQL skip over the rows that are irrelevant to the results. Without them, It’s scanning everything over and over.

But even with the best indexes, joins on massive tables are still going to be inefficient. So when you're designing your database, think about the future. Are these tables going to grow? Can you avoid complex joins later? Do you need to denormalize some of your tables? A little planning now saves a lot of pain down the road.

Top comments (0)