Recca Tsai

Posted on Apr 21 • Originally published at recca0120.github.io

Two SQL NULL Traps: Missing JOIN Rows and UNIQUE Constraints That Do Nothing

#sql #null #mysql #postgres

Originally published at recca0120.github.io

JOIN two tables on a nullable column and the rows disappear.
Insert two NULLs into a UNIQUE column and neither insert fails.
Both problems share the same root: SQL defines NULL as unknown, and unknown is never equal to anything — including another unknown.

What NULL Actually Means

In SQL, NULL means "unknown value" — not zero, not empty string, not false.

SELECT NULL = NULL;   -- Result: NULL, not TRUE
SELECT NULL IS NULL;  -- Result: TRUE

NULL = NULL returns NULL, not TRUE. This single rule is why JOINs and UNIQUE constraints behave unexpectedly with NULL.

Trap 1: JOIN on NULL Columns Returns No Rows

The Scenario

Two tables where orders.customer_id can be NULL. You want to JOIN them to find the customer for each order:

-- customers table
| id   | name  |
|------|-------|
| NULL | Alice |
| 1    | Bob   |

-- orders table
| id | customer_id | amount |
|----|-------------|--------|
| 1  | NULL        | 100    |
| 2  | 1           | 200    |

SELECT o.id, c.name, o.amount
FROM orders o
JOIN customers c ON o.customer_id = c.id;

Result:

| id | name | amount |
|----|------|--------|
| 2  | Bob  | 200    |

orders.id = 1 vanished, even though both orders.customer_id and customers.id are NULL.

Why

The INNER JOIN condition is o.customer_id = c.id. When both values are NULL, NULL = NULL evaluates to NULL, not TRUE. The row fails the join condition and gets filtered out.

Solutions

Option 1: IS NOT DISTINCT FROM (SQL standard, treats NULL = NULL as true)

-- PostgreSQL, SQL Server 2022+
SELECT o.id, c.name, o.amount
FROM orders o
JOIN customers c ON o.customer_id IS NOT DISTINCT FROM c.id;

NULL IS NOT DISTINCT FROM NULL evaluates to TRUE.

Option 2: COALESCE to replace NULL

-- Replace NULL with a sentinel value that will never appear as a real ID
SELECT o.id, c.name, o.amount
FROM orders o
JOIN customers c ON COALESCE(o.customer_id, -1) = COALESCE(c.id, -1);

Substitutes NULL with -1 so the = comparison succeeds. Choose a sentinel that can't be a real ID.

Option 3: LEFT JOIN with explicit NULL handling

SELECT o.id, c.name, o.amount
FROM orders o
LEFT JOIN customers c ON o.customer_id = c.id
WHERE o.customer_id IS NOT NULL
   OR (o.customer_id IS NULL AND c.id IS NULL);

Database support

Syntax	MySQL	PostgreSQL	SQLite	SQL Server
`IS NOT DISTINCT FROM`	✓ (8.0.17+)	✓	✓	✓ (2022+)
`<=>` NULL-safe equality	✓	✗	✗	✗
`COALESCE`	✓	✓	✓	✓

MySQL has its own NULL-safe equality operator:

-- MySQL only
SELECT o.id, c.name, o.amount
FROM orders o
JOIN customers c ON o.customer_id <=> c.id;

Trap 2: UNIQUE Constraint Allows Multiple NULLs

The Scenario

You design a users table with a nullable email column. Users don't have to provide an email, but if they do, it should be unique:

CREATE TABLE users (
  id    INT PRIMARY KEY,
  name  VARCHAR(100),
  email VARCHAR(255) UNIQUE  -- unique if provided, NULL if not
);

Insert two rows with email = NULL:

INSERT INTO users (id, name, email) VALUES (1, 'Alice', NULL);
INSERT INTO users (id, name, email) VALUES (2, 'Bob', NULL);

SQLite, PostgreSQL, SQL Server: both inserts succeed. No error.
MySQL (InnoDB): also succeeds.

The UNIQUE constraint does nothing for NULL values.

Why

A UNIQUE constraint works by checking whether the new value equals any existing value. Since NULL != NULL, every NULL is treated as a distinct unknown — multiple NULLs all pass the uniqueness check.

This is specified behavior, not a bug. SQLite, PostgreSQL, SQL Server, MySQL InnoDB, and Oracle all follow it.

Solutions

Option 1: Partial Index (PostgreSQL, SQLite)

Build a UNIQUE index only over non-NULL values:

-- PostgreSQL
CREATE UNIQUE INDEX users_email_unique
ON users (email)
WHERE email IS NOT NULL;

-- SQLite (3.8.9+)
CREATE UNIQUE INDEX users_email_unique
ON users (email)
WHERE email IS NOT NULL;

NULL values aren't indexed at all, so they bypass the uniqueness check. Non-NULL emails are still enforced as unique.

Option 2: MySQL Filtered Index

-- MySQL 8.0.13+
CREATE UNIQUE INDEX users_email_unique
ON users (email)
WHERE email IS NOT NULL;

Option 3: SQL Server Filtered Index

CREATE UNIQUE INDEX users_email_unique
ON users (email)
WHERE email IS NOT NULL;

Option 4: NOT NULL + empty string (not recommended)

-- Forces every row to have an email; use empty string for "not provided"
-- Problem: semantically ambiguous — empty string vs not provided
ALTER TABLE users MODIFY email VARCHAR(255) NOT NULL DEFAULT '';

Both Traps, One Root Cause

Both problems come from the same design principle: NULL means unknown, not empty.

Operation	NULL behavior	Outcome
`NULL = NULL`	Unknown whether equal	`NULL` (not TRUE)
`NULL IS NULL`	Checks for NULL	`TRUE`
`NULL IS NOT DISTINCT FROM NULL`	NULL-safe comparison	`TRUE`
JOIN `NULL = NULL`	Fails the join condition	Row filtered out
UNIQUE with multiple NULLs	Each NULL is a different unknown	All pass

Summary

When a JOIN is dropping rows, check whether the ON condition columns can be NULL. Use IS NOT DISTINCT FROM for a NULL-safe comparison, or MySQL's <=>.

For UNIQUE on a nullable column, a plain UNIQUE constraint isn't enough. Use a Partial Index (WHERE column IS NOT NULL) to enforce uniqueness only for non-NULL values.

When designing a schema, decide explicitly whether a column should be nullable. Don't make it nullable by default — NULL interacts with constraints and JOINs in ways that aren't obvious until something silently breaks.

DEV Community

Two SQL NULL Traps: Missing JOIN Rows and UNIQUE Constraints That Do Nothing

What NULL Actually Means

Trap 1: JOIN on NULL Columns Returns No Rows

The Scenario

Why

Solutions

Trap 2: UNIQUE Constraint Allows Multiple NULLs

The Scenario

Why

Solutions

Both Traps, One Root Cause

Summary

References

Top comments (0)