Introduction:
For those who are new to MySQL, including myself, you might encounter unexpected results when you query from a table that has some null values. In MySQL, null is neither true nor false but unknown, and you can't compare null with null unless you use the 'IS NULL' or 'IS NOT NULL' operator. I found that the way MySQL treats null is not only interesting but also important to understand. Here, I'm demonstrating how null behaves using various MySQL functions!
Joining Tables with Columns That Contain Null Values
NULL with INNER JOIN vs. NULL with LEFT JOIN
Let's say there are two tables, as described below:
Table name : teachers
id | dept_id | name |
---|---|---|
101 | 1 | Shrivell |
102 | 1 | Throd |
103 | 1 | Splint |
104 | NULL | Spiregrain |
105 | 2 | Cutflower |
106 | NULL | Deadyawn |
Table name : departments
id | name |
---|---|
1 | Computing |
2 | Design |
3 | Engineering |
NULL | NULL |
Notice that there are null values in the teachers table. What would happen if I use an INNER JOIN to combine these two tables?
SELECT teachers.name, departments.name AS department
FROM teachers INNER JOIN departments
ON (teachers.dept_id = departments.id)
Here is the result
+-----------+------------+
| Name | Department |
+-----------+------------+
| Shrivell | Computing |
| Throd | Computing |
| Splint | Computing |
| Cutflower | Design |
+-----------+------------+
Notice that in the results, you don't see the names of two teachers whose department_ids are null.
But what about using a LEFT JOIN? Let's take a look.
Here is the query where I simply swap 'INNER' for 'LEFT'.
SELECT teachers.name, departments.name AS department
FROM teachers LEFT JOIN departments
ON (teachers.dept_id = departments.id)
Here is the result
+------------+------------+
| Name | Department |
+------------+------------+
| Shrivell | Computing |
| Throd | Computing |
| Splint | Computing |
| Spiregrain | NULL |
| Cutflower | Design |
| Deadyawn | NULL |
+------------+------------+
You'll notice that all the teachers are listed. INNER JOIN does not return records with null values, but LEFT JOIN does.
WHY?
Keep in mind that when using the equality operator (=) with NULL to NULL, the result is neither true nor false—it's unknown. Therefore, records with NULL values on both sides of the comparison are not included in the result set when using INNER JOIN. However, by the nature of LEFT JOIN, it selects ALL records from the left side table (which is the teachers table in this case), even if the dept_id is null. Unlike INNER JOIN, which would remove such records from the list, LEFT JOIN retains them.
EXISTS & NULL and IN & NULL
Both EXISTS and IN do not return records with null values.
Let's consider the example where I have these tables.
Table name: customers
customer_id | first_name | last_name | phone_number |
---|---|---|---|
1 | John | Doe | 123-456-7890 |
2 | Jane | Smith | 987-654-3210 |
3 | Alice | Johnson | NULL |
4 | Bob | Brown | 555-555-5555 |
Table name: customers_2
customer_id | first_name | last_name | phone_number |
---|---|---|---|
1 | John | Doe | 123-456-7890 |
2 | Jane | Smith | NULL |
3 | Alice | Johnson | 111-222-3333 |
4 | Bob | Brown | NULL |
Here is a query:
Using EXISTS to check if a name in b_table exists in a_table.
SELECT * FROM customers AS c1
WHERE EXISTS(
SELECT * FROM customers_2 AS c2
WHERE c1.first_name = c2.first_name
AND c1.last_name = c2.last_name
AND c1.phone_number = c2.phone_number);
Using IN to see if name in b_table is in a_table
SELECT *
FROM customers AS c1
WHERE (c1.first_name, c1.last_name, c1.phone_number) IN (
SELECT c2.first_name, c2.last_name, c2.phone_number
FROM customers_2 AS c2
);
Both returns the same result
+-------------+------------+-----------+--------------+
| customer_id | first_name | last_name | phone_number |
+-------------+------------+-----------+--------------+
| 1 | John | Doe | 123-456-7890 |
+-------------+------------+-----------+--------------+
Notice that records with null values are not returned.
WHY?
Because with EXISTS, you can't link columns that are null. For IN, it ignores null values as if they don't exist. Thus, both treat nulls in the same way: neither returns nulls.
** HOWEVER, it's not the same story with NOT EXISTS and NOT IN when dealing with NULL!**
Let's take a look.
I simply swapped EXISTS for NOT EXISTS.
SELECT * FROM customers AS c1
WHERE NOT EXISTS(
SELECT * FROM customers_2 AS c2
WHERE c1.first_name = c2.first_name
AND c1.last_name = c2.last_name
AND c1.phone_number = c2.phone_number);
Here is the result
| customer_id | first_name | last_name | phone_number |
|-------------|------------|-----------|---------------|
| 2 | Jane | Smith | 987-654-3210 |
| 3 | Alice | Johnson | [NULL] |
| 4 | Bob | Brown | 555-555-5555 |
Notice that it is returning records with null values. Both Jane and Bob's phone numbers in the customers_2 table are null, but since both phone numbers have values in the customers table, the returned values are not null.
How about using NOT IN?
SELECT *
FROM customers AS c1
WHERE (c1.first_name, c1.last_name, c1.phone_number) NOT IN (
SELECT c2.first_name, c2.last_name, c2.phone_number
FROM customers_2 AS c2
);
Here is the result
+-------------+------------+-----------+--------------+
| customer_id | first_name | last_name | phone_number |
+-------------+------------+-----------+--------------+
Nothing is returned!
Why does NOT EXISTS return null values but NOT IN does not? It's because they both treat null differently. NOT EXISTS is simply the opposite of EXISTS. It returns EVERYTHING that is not returned by EXISTS, including nulls.
However, NOT IN returns records that evaluate to TRUE, but in MySQL, null is neither TRUE nor FALSE—it is unknown.
Let's take a look at the record for Jane Smith with Id2.
When NOT IN is used, it compares:
Jane != Jane OR Smith != Smith OR 987-654-3210 != null.
Jane != Jane -> False
Smith != Smith -> False
987-654-3210 != null -> NULL
False or False or Null evaluates to NULL, so the record doesn't get returned!
Here's a little twist:
What if the first name in the customers table does not match the first name in the customers_2 table?
Let’s compare:
Jane Smith 987-654-3210 (from the customers table) and Jack Smith null (from the customers_2 table).
Here is the result
+-------------+------------+-----------+--------------+
| customer_id | first_name | last_name | phone_number |
+-------------+------------+-----------+--------------+
| 2 | Jack | Smith | 987-654-3210 |
What? You see Jack! Doesn't NOT IN fail to evaluate null as either false or true, so the record shouldn't be returned?
Let's analyze it:
Jane != Jack -> True
Smith != Smith -> False
987-654-3210 != null -> NULL
True or False or Null actually returns:
True || False -> True
True || NULL -> True
Therefore, the record gets returned!
If you want to understand how true, false, and null are evaluated, here is the link:
https://dev.mysql.com/doc/refman/8.0/en/logical-operators.html#operator_and
As you can see, it can get quite confusing when it comes to comparing null values.
At the end of the day, I think it's best to explicitly compare null values using IS NULL or IS NOT NULL, which returns true or false as shown below.
SELECT * FROM customers AS c1
WHERE EXISTS(
SELECT * FROM customers_2 AS c2
WHERE c1.first_name = c2.first_name
AND c1.last_name = c2.last_name
AND c1.phone_number = c2.phone_number OR (c1.phone_number IS NULL AND c2.phone_number IS NULL));
Conclusion
Handling null values in MySQL can be quite challenging, requiring careful attention when they are present in your database. It's crucial to conduct explicit null checking to clearly define how null values should be treated. By implementing the strategies I mentioned earlier, such as using IS NULL or IS NOT NULL, you can ensure more predictable and accurate query results.
Top comments (0)