SQL JOIN NULL Handling — Avoiding Unexpected Results
Understand how NULL values affect SQL JOINs. Learn why NULLs never match, how they propagate through outer joins, and techniques for NULL-safe comparisons.
Detailed Explanation
NULLs and JOINs: The Hidden Trap
NULL in SQL represents an unknown value. It does not equal anything — not even another NULL. This has critical implications for join behavior.
NULL Never Matches in a Join Condition
-- These NULLs will NOT match each other
Table A: id = NULL
Table B: id = NULL
-- a.id = b.id evaluates to NULL (not TRUE), so no join occurs
This means:
- INNER JOIN: Rows with NULL join keys are excluded from results.
- LEFT/RIGHT JOIN: NULL-key rows appear but never find a match on the other side.
- NOT IN: A single NULL in the subquery can make the entire result empty.
The NOT IN NULL Trap
-- If orders.customer_id contains any NULL:
SELECT * FROM customers
WHERE id NOT IN (SELECT customer_id FROM orders);
-- Returns ZERO rows! Not the expected anti-join result.
This happens because 5 NOT IN (1, 2, NULL) evaluates to UNKNOWN, which SQL treats as FALSE. Always use NOT EXISTS or add WHERE customer_id IS NOT NULL.
NULL-Safe Comparison Operators
Different databases offer NULL-safe equality operators:
| Database | Operator | Example |
|---|---|---|
| PostgreSQL | IS NOT DISTINCT FROM |
a.id IS NOT DISTINCT FROM b.id |
| MySQL | <=> |
a.id <=> b.id |
| SQLite | IS |
a.id IS b.id |
COALESCE Workaround
When NULL-safe operators are not available or not readable, use COALESCE to convert NULLs to a sentinel value:
ON COALESCE(a.region, '__NONE__') = COALESCE(b.region, '__NONE__')
Caution: This can prevent index usage and produce false matches if the sentinel value exists in your data.
Best Practice
- Declare join columns as
NOT NULLin the schema whenever possible. - Use
NOT EXISTSinstead ofNOT INfor anti-joins. - Explicitly handle NULLs in
WHEREclauses after outer joins. - Test your queries with NULL values in the join columns to catch surprises.
Use Case
Review NULL handling whenever a JOIN returns fewer or more rows than expected. NULLs in join columns are the most common cause of mysterious query results, especially with NOT IN subqueries.