Working with NULL Values
NULL is not zero, not empty, not false — it is the absence of data. Master three-valued logic, IS NULL, COALESCE, NULLIF, and every NULL trap in SQL
// Part 01
What NULL Actually Is — The Concept That Breaks Beginners
NULL is the single most misunderstood concept in SQL. Most beginners assume NULL means zero, or empty string, or false. It means none of those things. NULL means the absence of a value — the information is unknown, missing, or not applicable.
Think of it this way. You are filling out a form and a field asks for your middle name. If you have no middle name, you write "N/A" or leave it blank. If you do not know whether you have a middle name (unlikely, but humour the example), the field is genuinely unknown. In a database, both situations — not applicable and unknown — are represented by NULL. The field contains no value. Not an empty value. No value at all.
In FreshCart's orders table, the delivery_date column is NULL for orders that have not been delivered yet. It is not that the delivery date is unknown — it is that the delivery has not happened, so the date does not exist yet. The employees table has manager_id as NULL for store managers who report to nobody in this dataset. Not an unknown manager — no manager relationship applies.
// Part 02
Three-Valued Logic — Why NULL Breaks Normal Comparisons
In everyday life, every statement is either true or false. SQL uses three-valued logic: TRUE, FALSE, and NULL (unknown). This is the root cause of every NULL surprise you will ever encounter.
When you compare any value with NULL using a standard comparison operator, the result is always NULL — never TRUE, never FALSE. This is because comparing something known to something unknown cannot produce a definitive answer. "Is Seattle equal to unknown?" — we cannot know. The answer is unknown (NULL), not false.
The WHERE clause discards rows where the condition evaluates to NULL — only TRUE rows are kept. This means any WHERE condition involving NULL silently discards the row. This is the source of the most common SQL data quality bugs: rows disappearing from results because a NULL value caused a condition to return NULL instead of TRUE.
Demonstrating the NULL trap
// Part 03
NULL in Arithmetic — How NULL Propagates Through Calculations
NULL propagates through arithmetic. Any calculation that involves NULL produces NULL as the result. This is mathematically consistent — if one input is unknown, the output must also be unknown.
This propagation means that aggregate functions like SUM, AVG, MIN, and MAX all ignore NULL values — they only operate on non-NULL values. This is usually the correct behaviour, but it can produce surprising results if you are not aware of it.
// Part 04
NULL in Logical Operations — AND, OR, NOT with NULL
NULL in logical operations produces results that follow three-valued logic. You saw the truth tables for AND, OR, and NOT in Module 07. Here is a focused review of how NULL interacts with each:
AND with NULL
| A | B | A AND B |
|---|---|---|
| TRUE | NULL | NULL |
| FALSE | NULL | FALSE |
| NULL | NULL | NULL |
Key insight: FALSE AND NULL = FALSE. If one side is definitely false, the AND result is false regardless of the unknown. But TRUE AND NULL = NULL — a known true combined with an unknown gives an unknown result.
OR with NULL
| A | B | A OR B |
|---|---|---|
| TRUE | NULL | TRUE |
| FALSE | NULL | NULL |
| NULL | NULL | NULL |
Key insight: TRUE OR NULL = TRUE. If one side is definitely true, the OR result is true regardless of the unknown. But FALSE OR NULL = NULL — a known false combined with an unknown gives an unknown result.
NOT with NULL
| A | NOT A |
|---|---|
| TRUE | FALSE |
| FALSE | TRUE |
| NULL | NULL |
NOT NULL = NULL. Reversing an unknown still gives an unknown. This is why WHERE NOT city = 'Seattle' does not include rows where city is NULL.
// Part 05
IS NULL and IS NOT NULL — The Correct Way to Filter NULL
IS NULL and IS NOT NULL are the only operators that correctly handle NULL. They are specifically designed for three-valued logic — they return TRUE or FALSE, never NULL.
Finding rows with NULL in specific columns
Combining IS NULL with other conditions
// Part 06
COALESCE — Replace NULL with a Default Value
COALESCE is one of the most useful functions in SQL. It takes two or more arguments and returns the first non-NULL value from the list. If all arguments are NULL, it returns NULL.
COALESCE for displaying NULL as a readable value
COALESCE for default values in calculations
COALESCE for fallback column chains
COALESCE shines when you have a priority chain: use the first value if it exists, fall back to the second, then the third. This is common when merging data from multiple sources.
// Part 07
NULLIF — Turn a Value Into NULL
NULLIF is the inverse of COALESCE. It takes two arguments and returns NULL if they are equal, otherwise returns the first argument. This is most useful for preventing division by zero and for treating sentinel values (like empty string or zero) as NULL.
NULLIF to prevent division by zero
NULLIF to treat empty strings as NULL
In many systems, empty string and NULL are used interchangeably for "no value." NULLIF lets you normalise them.
// Part 08
NULL in ORDER BY, GROUP BY, and DISTINCT
NULL in ORDER BY
You learned this in Module 08: in PostgreSQL and DuckDB, NULLs sort last in ASC and first in DESC. Use NULLS FIRST or NULLS LAST to control this. NULL values are treated as a special sort value — not greater than or less than any real value, but consistently placed by the database.
NULL in GROUP BY
NULL is treated as a distinct grouping value in GROUP BY — all NULL rows are grouped together into a single NULL group. This is one of the few places where NULL = NULL is TRUE in SQL's practical behaviour.
NULL in DISTINCT
DISTINCT also treats NULL as a single distinct value — multiple NULL rows collapse into one NULL in the DISTINCT result. Even though NULL ≠ NULL in comparisons, DISTINCT deduplication treats all NULLs as equivalent.
NULL in UNIQUE constraints
UNIQUE constraints have a special NULL behaviour: multiple rows can have NULL in a UNIQUE column. Since NULL ≠ NULL (comparisons always return NULL), no two NULLs are considered equal — so they do not violate uniqueness. This is consistent with three-valued logic but surprises many people.
// Part 09
NOT IN and NULL — The Dangerous Combination
This is one of the most dangerous NULL traps in SQL. If you use NOT IN with a subquery or list that contains even one NULL value, the entire query returns zero rows — silently, with no error.
Why NOT IN fails with NULL
NOT IN (val1, val2, NULL) is internally expanded to: NOT (col = val1 OR col = val2 OR col = NULL). The col = NULL part always evaluates to NULL. Since any OR involving NULL (when the other conditions are FALSE) evaluates to NULL, and NOT NULL = NULL, every row in the table evaluates to NULL and is discarded by WHERE.
// Part 10
Complete NULL Handling Toolkit
Here is every NULL-related function and operator you need, in one reference section.
| Function / Operator | What it does | Example | Result |
|---|---|---|---|
| IS NULL | Check if a value is NULL | WHERE delivery_date IS NULL | TRUE for NULL rows |
| IS NOT NULL | Check if a value is not NULL | WHERE delivery_date IS NOT NULL | TRUE for non-NULL rows |
| COALESCE(a, b, c) | Return first non-NULL value | COALESCE(discount_pct, 0) | 0 if discount_pct is NULL |
| NULLIF(a, b) | Return NULL if a equals b | NULLIF(unit_price, 0) | NULL if price is 0 |
| IS DISTINCT FROM | NULL-safe equality check | a IS DISTINCT FROM b | FALSE if both NULL |
| IS NOT DISTINCT FROM | NULL-safe inequality check | a IS NOT DISTINCT FROM b | TRUE if both NULL |
IS DISTINCT FROM — NULL-safe comparison
Standard equality (=) returns NULL when either side is NULL. IS DISTINCT FROM is a NULL-safe comparison that treats NULL as a known value — NULL IS DISTINCT FROM NULL returns FALSE (they are the same), and value IS DISTINCT FROM NULL returns TRUE (they are different).
// Part 11
What This Looks Like at Work
You are a data engineer at HDFC Bank. The compliance team runs monthly reports on loan applications. They discover that some loan application records have NULL values in fields that should always have data — specifically the annual_income field. A NULL income in a loan application is a critical data quality issue that could cause incorrect risk scoring.
🎯 Pro Tip
In any system that handles financial, medical, or legal data — always make critical fields NOT NULL in the database schema. Application-level validation can be bypassed. A database NOT NULL constraint is the last line of defence and cannot be bypassed by any application, script, or bulk import. Design your schema with the assumption that application code will sometimes have bugs.
// Part 12
Interview Prep — 5 Questions With Complete Answers
NULL in SQL represents the complete absence of a value — it means the information is unknown, missing, or not applicable for that row. It is fundamentally different from zero, empty string, or false. Zero is a known numeric value. Empty string is a known string of zero length. False is a known boolean value. NULL is the absence of any known value.
The distinction matters enormously in queries. WHERE salary = 0 finds employees with a salary explicitly set to zero — perhaps interns or volunteers. WHERE salary IS NULL finds employees whose salary has not been recorded — perhaps newly added rows or data entry errors. These are completely different sets of rows. Similarly, WHERE name = '' finds rows where name is an empty string — someone entered nothing but submitted the form. WHERE name IS NULL finds rows where the name field was never touched at all.
In FreshCart's orders table, delivery_date IS NULL means the delivery has not happened yet — not that the delivery date is unknown, but that it does not exist because the event has not occurred. This is the "not applicable" use of NULL, distinct from the "unknown" use. Both are represented identically in SQL, which is why understanding what NULL means in context is important before writing queries that filter on nullable columns.
Three-valued logic extends standard two-valued (TRUE/FALSE) boolean logic by adding a third value: NULL (unknown). SQL uses three-valued logic because relational databases store real-world information that is sometimes incomplete or unknown. Forcing every value into TRUE or FALSE when the information is genuinely unknown would produce incorrect results — treating "unknown" as either "yes" or "no" introduces false precision.
The rules of three-valued logic follow from the meaning of NULL as "unknown." Any comparison with NULL returns NULL, because comparing a known value to an unknown value cannot produce a definitive TRUE or FALSE answer. TRUE AND NULL = NULL (true combined with unknown is still uncertain). FALSE AND NULL = FALSE (false combined with anything is false). TRUE OR NULL = TRUE (true combined with anything is true). FALSE OR NULL = NULL (false combined with unknown is uncertain). NOT NULL = NULL (reversing unknown is still unknown).
The practical consequence: the WHERE clause discards rows where the condition evaluates to NULL — only rows where the condition is definitely TRUE are included. This is the correct behaviour logically, but it means developers must be explicit about NULL handling. A condition intended to match "anything that is not Seattle" written as WHERE city <> 'Seattle' silently excludes rows where city is NULL, because NULL <> 'Seattle' evaluates to NULL, not TRUE. The correct version that includes NULL cities is WHERE city <> 'Seattle' OR city IS NULL.
COALESCE(value1, value2, ..., valueN) evaluates its arguments left to right and returns the first non-NULL value. If all arguments are NULL, it returns NULL. It is the standard SQL function for providing default values when a column might be NULL and for implementing fallback logic when multiple columns might supply a value.
Real-world example 1 — default values in calculations: when calculating the discounted price of a product, if discount_pct is NULL (no discount applies), multiplying by NULL produces NULL for the entire calculation. COALESCE(discount_pct, 0) replaces NULL with 0 so the calculation works correctly: unit_price * (1 - COALESCE(discount_pct, 0) / 100). The result is the original unit_price when no discount applies, rather than NULL.
Real-world example 2 — contact information fallback: a CRM might have preferred_email, work_email, and personal_email as separate columns. To get the best available contact: COALESCE(preferred_email, work_email, personal_email, 'No email on file'). This returns the preferred email if set, falls back to work email, then personal email, and finally a literal string if none are available. This fallback chain pattern is extremely common in data integration and reporting where data comes from multiple sources with varying completeness.
NOT IN with a subquery that contains NULL values causes the entire WHERE clause to return NULL for every row — resulting in zero rows in the output. This happens silently with no error message, making it one of the most dangerous NULL traps in SQL.
The mechanism: NOT IN (val1, val2, NULL) is internally equivalent to NOT (col = val1 OR col = val2 OR col = NULL). For any row in the outer query, col = NULL always evaluates to NULL (from three-valued logic). When the final OR combines a NULL result with the other conditions, and those conditions are also NULL or FALSE, the OR produces NULL. NOT NULL = NULL. The WHERE clause sees NULL and discards the row. This happens for every single row in the outer table — returning zero results.
The safe alternatives are: NOT EXISTS (SELECT 1 FROM subquery WHERE subquery.col = outer.col) — this correctly handles NULL because EXISTS checks for the presence of rows, not value equality. Or, filter NULLs from the subquery explicitly: NOT IN (SELECT col FROM table WHERE col IS NOT NULL). In practice, NOT EXISTS is almost always preferable to NOT IN for correlated lookups because it handles NULL correctly, often performs better on large datasets, and its intent is clearer. This is a rule worth memorising: when you mean "no matching row exists," write NOT EXISTS, not NOT IN.
NULLIF(value1, value2) returns NULL if value1 equals value2, otherwise returns value1. It is the inverse of COALESCE — where COALESCE replaces NULL with a value, NULLIF converts a specific value into NULL. The most common use case is preventing division-by-zero errors by converting a zero denominator into NULL before division.
Division by zero example: calculating profit margin as (unit_price - cost_price) / unit_price would error if unit_price is zero. With NULLIF: (unit_price - cost_price) / NULLIF(unit_price, 0). If unit_price is zero, NULLIF returns NULL, and dividing by NULL returns NULL instead of raising an error. The result row shows NULL for the margin, which correctly communicates "this calculation is not applicable" rather than crashing the query.
Second use case — treating sentinel values as NULL: some legacy systems use 0 or empty string as a "no value" indicator instead of NULL. COALESCE(NULLIF(phone, ''), 'No phone') handles both cases — if phone is '' (empty string), NULLIF converts it to NULL, then COALESCE replaces NULL with 'No phone'. If phone is already NULL, NULLIF leaves it NULL and COALESCE replaces it. If phone is '+91-9876543210', NULLIF leaves it as-is and COALESCE returns it. The combination of NULLIF and COALESCE is a standard pattern for normalising data from sources that mix NULL and empty string to represent "no value."
// Part 13
Errors You Will Hit — And Exactly Why They Happen
🎯 Key Takeaways
- ✓NULL means the absence of a value — unknown, missing, or not applicable. It is not zero, not empty string, not false. It is the complete absence of any known value.
- ✓SQL uses three-valued logic: TRUE, FALSE, and NULL. Any comparison with NULL using standard operators (=, <>, >, <) returns NULL — never TRUE or FALSE.
- ✓The WHERE clause discards rows where the condition evaluates to NULL. This means NULL conditions silently exclude rows — the most common source of invisible data quality bugs in SQL.
- ✓The only correct operators for checking NULL are IS NULL and IS NOT NULL. Never use = NULL or <> NULL — they return zero rows, always.
- ✓NULL propagates through arithmetic: any calculation involving NULL returns NULL. SUM, AVG, MIN, MAX all ignore NULL values. COUNT(*) counts all rows; COUNT(column) counts only non-NULL values.
- ✓COALESCE(a, b, c) returns the first non-NULL value. Use it to replace NULL with default values in calculations and display — COALESCE(discount_pct, 0), COALESCE(delivery_date, 'Not yet delivered').
- ✓NULLIF(a, b) returns NULL if a equals b, otherwise returns a. Use it to prevent division by zero: value / NULLIF(denominator, 0).
- ✓NOT IN with a subquery that returns any NULL value produces zero rows silently. Always use NOT EXISTS instead, or add WHERE col IS NOT NULL to the subquery.
- ✓UNIQUE constraints allow multiple NULL values — each NULL is considered distinct from every other NULL in standard SQL.
- ✓GROUP BY and DISTINCT both treat NULL as a single distinct value — all NULL rows form one NULL group, and multiple NULLs collapse into one in DISTINCT results.
What comes next
In Module 12, you learn column calculations and arithmetic expressions — doing math directly inside SQL, working with operator precedence, and building computed columns that power real analytics.
Module 12 → Column Calculations — ArithmeticDiscussion
0Have a better approach? Found something outdated? Share it — your knowledge helps everyone learning here.