Ever tried retrieving the data you need from just one table but suddenly realised you need more detail or information about these data which you must get from another table? Joins to the rescue!!!
You can get that additional detail you need using the power of Joins. With Joins in SQL, you can retrieve or access the information you need from two or more tables.
Assuming this class maintains a database consisting of 3 tables such as person, grade, and activity. We would most likely have to get the information we need by looking up data in the person and grade tables using joins.
- The inner joins and
- The outer joins
This class has a database with the person and grade tables as below:
THE INNER JOIN (which is the most common type of join), also referred to as JOIN, is a type of JOIN that returns all rows from both participating tables where the key record of one table is equal to the key records of another table. Using an inner join on the person and grade table will return only students with grades like below:
In 2.md above, you can see that the inner join has combined both tables ON key columns (which has values) that are common to both tables. It then returns records that contain the selected columns in the SELECT clause - You can see in 1.md, that the id field for person table and person_id field for grade table matches for values of 33CC and 44DD only.
The code for completing an INNER JOIN from the person table to the grade table based on the common values of the key field is shown below:
In 3.sql and 2.md, please note the following:
- In the SELECT clause, the fields or columns you want to be returned from the tables are listed.
- The person table is conventionally called the left table because it is the first table in the SELECT clause.
- The grade table is conventionally called the right table because it is the table you are joining on.
- The INNER JOIN keyword can be replaced with the JOIN keyword - INNER JOIN is the default if you don't specify the type when you use the word JOIN.
- The ON clause introduces the key fields from each table we would be joining on. The ON clause is used to specify a join condition or a join-predicate. The name of a key field or key column can be the same or vary from one table to another - the key field name in the grade table, person_id, can also be named id like in the person table as long as it does not conflict with other column names in the grade table.
- Regardless of whether the value of the key field appears multiple times in a table, as far as it appears in both tables, the record would be included in the result e.g 33CC is returned twice for the year 2019 and 2020 respectively.
So, we can gladly use the inner join above and give the new teacher what he asked for.
An ad-hoc request comes in! The new teacher wants to see students with grades along with their yearly activities. Herein lies the awesomeness of SQL- the ability to combine multiple joins in a single query - Multiple Joins. Ring a bell? Let us see an example below:
In 4.md above, we have 3 tables - person, grade and activity - an additional table, activity, which shows the hobbies a student partake in yearly. So we came up with the query below:
In 5.sql above, activity.year i.e table_name.column_name is used because the field year is common to the grade and activity tables. If the table_name is not specified, an error would be thrown saying it is ambiguous. So, 5.sql returns:
Take a second look at this result! Do you observe something wrong with it? Some score field values are wrongly paired with activity and year values - Zamani for example, the 3rd and 4th rows are incorrect. This is so because we did not join on an extra key field, year, which is common to the activity and grade tables. So we modify the query to be:
which returns the desired result below:
The thought of how multiple inner joins work can be confusing right? Well, what actually happens like in 7.sql above is that every single join produces a single derived table which is then joined to the next table and on and on. Using the 7.sql example above:
JOIN 1: This is the inner join between the person table and the grade tables. Let’s call this result derived table one (DT1)
JOIN 2: This is another inner join between DT1 and the activity table. The result gotten here is the final result returned by this query
So having delivered what is needed in time, let us refactor what we have written for a healthy codebase. There are several ways the query in 7.sql could have been written. For example:
The AS keyword is used for creating an alias - an alias is a temporary name that only exists for the duration of the query. Good use cases for aliases are when:
- You want to make column names or table names more readable
- You want to write less because the names are long. - Example: there is more than one table in your query - so you write p.id = a.person_id instead of person.id = activity.person_id.
If the key field in grade and activity tables is also id (and not person_id), the SQL code would rather be:
You have seen the ON clause so far but if the key field you are joining on has the same name in both tables, use the USING keyword instead.
Stay tuned to this series for my next article on OUTER JOINS! Have an amazing and fulfilled week ahead!