So, yes. Today's goal is to get the 30hr SQL Bootcamp completed (or at least as much as I can), I am following Data with Baara. Who gets a 5 on 5 for clearing my doubts before I even know I have them (what kind of superpower is that?).
context: So I already knew the bare basics of SQL as it was a course I had to take in my pre-final year, but it was not on par with what is required of a DE. So I revised the things that I already knew and speed ran through it.
Joins (the ones they don't tell you about)
If you are also a beginner like me who had not dwelled too deep, you may think there exists only four types of JOINs (the big four: LEFT, RIGHT, INNER, FULL) but no, because when have things ever been this simple. We have moreeee. And they are:
Right Anti, Left Anti, Full Anti and Cross Join.
Now, Anti Joins are interesting. The "anti" here is not anti-the-named-side — it is anti-the-join-itself. You are not excluding the left table, you are keeping only the rows from the left that found no match on the right. The lonely ones. The ones that got ghosted.
- Left Anti → rows in the left table with no match in the right
- Right Anti → rows in the right table with no match in the left
- Full Anti → the unmatched ones from both sides
And the fun part? They are not even a separate keyword in most SQL dialects. You get there by combining a JOIN with a WHERE clause filtering for NULLs. LEFT JOIN + WHERE right_table.id IS NULL = Left Anti. The pattern is the thing.
Cross Join on the other hand is a greedy little beast — it wants a pair of everything. Every row from table A matched with every row from table B. No conditions, no chill. 100 rows × 50 rows = 5,000 rows, just like that. Powerful when you need it, catastrophic when you forget it is running on a large table.
Set Operators (the rules nobody writes down in one place)
After joins, we moved to Set Operators — UNION, UNION ALL, INTERSECT and EXCEPT.
The concept is simple enough. But Baara being Baara, he made sure we actually understood the rules before touching the syntax. And there are more rules than you'd expect:
- Columns must match — you cannot combine apples and oranges
- Data types must match — apples and slightly-different-apples also don't work
- Same number of columns across both queries
- Order of columns matters — col1 of query one maps to col1 of query two, so if you mix up the order you mix up your data silently, which is the worst kind of wrong
- Column names come from the first query — so whatever you alias in query one is what shows up in your result
- ORDER BY only at the end, on the last query — you are sorting the final combined result, not each individual one
The difference between UNION and UNION ALL trips a lot of people — UNION deduplicates, UNION ALL keeps everything including duplicates. For DE work UNION ALL is usually what you actually want because deduplication is expensive and you often want to handle it explicitly downstream anyway.
Plan for tomorrow: Date & Time functions + keep pushing through the bootcamp.
See you on Day 02. 👋
Top comments (0)