Python · SQL · Web Dev · Java · AI/ML tracks launching soon — your one platform for all of IT
Intermediate

LEFT and RIGHT JOIN

Keep unmatched rows — every LEFT JOIN pattern from basic optional enrichment to anti-join gap analysis, plus when RIGHT JOIN makes sense and the ON vs WHERE trap

14–20 min April 2026
Section 7 · Joins
Joins · 6 modulesModule 32

// Part 01

The Core Difference — Inclusion of Unmatched Rows

INNER JOIN keeps only rows that match on both sides. But many analytical questions specifically need rows that have no match on one side — customers who have never ordered, products that have never been sold, stores with no employees. These questions cannot be answered with INNER JOIN because unmatched rows are excluded by definition.

LEFT JOIN solves this. It keeps every row from the left table — whether or not it has a match in the right table. For rows that do have a match, right-side columns are populated normally. For rows with no match, right-side columns are filled with NULL. The left table is never filtered — every one of its rows appears in the result exactly once per matching right-side row (or once with NULLs if no match exists).

INNER JOIN

Returns rows where BOTH sides match. Unmatched rows from either side are silently dropped.

Example: Customers who have placed at least one order

LEFT JOIN

Returns ALL left-table rows. Right-side columns are NULL when no match exists. Left table is never filtered.

Example: ALL customers — including those with no orders

RIGHT JOIN

Returns ALL right-table rows. Left-side columns are NULL when no match. Mirror of LEFT JOIN.

Example: ALL orders — even those referencing deleted customers

// Part 02

LEFT JOIN — Basic Syntax and Behaviour

LEFT JOIN syntax
-- LEFT JOIN (same as LEFT OUTER JOIN — OUTER is optional)
SELECT columns
FROM   left_table  AS l
LEFT JOIN right_table AS r ON l.key = r.key;

-- For matched rows: both sides populated
-- For unmatched left rows: right-side columns = NULL
-- The left table is NEVER filtered — every row appears at least once
Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…
Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…
Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…

// Part 03

The Anti-Join Pattern — LEFT JOIN + IS NULL

The most powerful LEFT JOIN pattern is the anti-join — finding rows in the left table that have no corresponding row in the right table. The structure is always the same: LEFT JOIN, then WHERE right_table.primary_key IS NULL. The IS NULL filter keeps only the rows where no match was found — the NULLs that LEFT JOIN added.

Anti-join pattern — always the same structure
SELECT l.*
FROM   left_table  AS l
LEFT JOIN right_table AS r ON l.key = r.key
WHERE  r.key IS NULL;    -- keep only unmatched left rows

-- r.key IS NULL means no row in right_table matched this left row
-- These are the rows INNER JOIN would have excluded

Customers who have never placed an order

Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…

Products that have never been sold

Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…

Employees with no store assignment

Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…

🎯 Pro Tip

The anti-join (LEFT JOIN + IS NULL) is one of the five patterns every SQL analyst must have memorised. It answers "which X has no Y?" — and that question appears constantly in production: churned users, unsold inventory, incomplete data pipelines, orphaned records. Whenever a business question contains the words "never", "no", "missing", or "without", reach for LEFT JOIN + IS NULL.

// Part 04

LEFT JOIN for Optional Enrichment — Adding Data Where Available

Not all LEFT JOINs are anti-joins. A common pattern is optional enrichment — adding data from a right table where it exists, while keeping all left rows regardless. The right-side columns may be NULL for some rows and populated for others, and both are valid in the result.

Orders with optional delivery details

Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…

Products with sales metrics — zero for unsold

Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…

Customers with loyalty tier benefits (optional reference table)

Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…

// Part 05

The ON vs WHERE Trap with LEFT JOIN

This is the most important technical subtlety about LEFT JOIN — placing a filter in WHERE instead of ON accidentally converts the LEFT JOIN to an INNER JOIN. Understanding why prevents one of the most common silent query bugs in production SQL.

How the trap works

ON vs WHERE — the critical difference
-- GOAL: all customers, with only their DELIVERED orders
-- (unordered customers should still appear with NULL order columns)

-- WRONG: filter in WHERE — converts LEFT JOIN to INNER JOIN
SELECT c.customer_id, c.first_name, o.order_id, o.order_status
FROM customers AS c
LEFT JOIN orders AS o ON c.customer_id = o.customer_id
WHERE o.order_status = 'Delivered';
-- What happens:
-- 1. LEFT JOIN runs: all customers appear, NULLs for unordered
-- 2. WHERE filters: order_status = 'Delivered' on NULL rows = NULL = not TRUE
-- 3. NULL rows are discarded — unordered customers disappear
-- Result: same as INNER JOIN — unordered customers are EXCLUDED

-- RIGHT: filter in ON — applied before NULLs are added
SELECT c.customer_id, c.first_name, o.order_id, o.order_status
FROM customers AS c
LEFT JOIN orders AS o
  ON c.customer_id = o.customer_id
  AND o.order_status = 'Delivered';  -- filter within the join
-- What happens:
-- 1. JOIN only matches delivered orders (non-delivered excluded from matching)
-- 2. LEFT JOIN adds NULLs for customers with no delivered orders
-- 3. No WHERE to filter out the NULL rows
-- Result: all customers appear; order columns NULL if no delivered orders
Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…
Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…

Run both queries and compare the row counts. The WRONG version returns fewer rows — it excluded customers with no delivered orders. The RIGHT version returns all customers with 0 for those who have no delivered orders.

⚠️ Important
The ON vs WHERE trap is one of the top three SQL bugs that senior engineers look for in code review. The symptom: you use LEFT JOIN intending to keep all left rows, but the result has fewer rows than the left table. Diagnosis: check every WHERE condition for references to right-table columns — move them to the ON clause if you want to preserve all left rows.

The decision rule

QuestionCondition goes inWhy
Should unmatched left rows appear with NULLs when this condition fails?ONFilter is part of the join relationship — unmatched rows are still kept
Should unmatched left rows be excluded when this condition is NULL?WHEREIf exclusion is desired, WHERE is correct — but this converts LEFT to INNER
Is this condition on a left-table column only?WHERE (safe)Left-table conditions in WHERE never affect NULL row preservation
Is this condition on a right-table column?ON (for LEFT JOIN)Right-table conditions in WHERE kill NULL rows — move to ON to preserve them

// Part 06

Multiple LEFT JOINs — Chaining Optional Relationships

Multiple LEFT JOINs chain together just like INNER JOINs. Each LEFT JOIN independently preserves all rows from the preceding result — adding NULLs for each table that has no match.

Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…
Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…

Mixing INNER and LEFT JOINs

You can mix INNER JOIN and LEFT JOIN in the same query. INNER JOIN for required relationships (the match is guaranteed), LEFT JOIN for optional ones (the match may not exist).

Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…

// Part 07

COALESCE with LEFT JOIN — Replacing NULLs

LEFT JOIN produces NULLs for unmatched rows. COALESCE replaces those NULLs with meaningful defaults — turning NULL into 0 for counts and sums, or into a label like 'No orders' for display.

Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…
Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…

// Part 08

RIGHT JOIN — The Mirror of LEFT JOIN

RIGHT JOIN keeps every row from the right table — whether or not it matches the left table. Left-side columns are NULL for unmatched right rows. RIGHT JOIN is the exact mirror of LEFT JOIN with the tables swapped.

In practice, RIGHT JOIN is almost never used in production code. Any RIGHT JOIN can be rewritten as a LEFT JOIN by simply swapping the table positions — and the LEFT JOIN version is universally considered more readable because most developers read queries left-to-right and expect the "anchor" table to appear first.

RIGHT JOIN — always rewritable as LEFT JOIN
-- RIGHT JOIN: keep all orders, left-side NULL if no matching customer
SELECT c.first_name, o.order_id, o.total_amount
FROM customers AS c
RIGHT JOIN orders AS o ON c.customer_id = o.customer_id;

-- Identical result as LEFT JOIN with tables swapped:
SELECT c.first_name, o.order_id, o.total_amount
FROM orders AS o
LEFT JOIN customers AS c ON o.customer_id = c.customer_id;

-- Both return: all orders, with customer name NULL if no customer found
-- The LEFT JOIN version is preferred — the anchor table (orders) is on the left

When RIGHT JOIN has legitimate use

RIGHT JOIN can be useful when refactoring a complex query where swapping tables is difficult — for example, when a CTE or subquery is already in the left position and adding a RIGHT JOIN to an existing chain avoids restructuring the whole query. It is also occasionally used in data pipeline code where the right table is always the "source of truth" that must be fully preserved.

Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…

// Part 09

LEFT JOIN for Data Quality Checks

LEFT JOIN with IS NULL is the standard technique for finding data quality issues — orphaned records, missing references, incomplete data pipelines, and referential integrity violations that should have been caught by FK constraints.

Find orders referencing non-existent customers

Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…

Find order items with no matching order

Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…

Find stores with no employees assigned

Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…

🎯 Pro Tip

Run LEFT JOIN + IS NULL checks on every table after a bulk data import or migration. These queries catch referential integrity violations that FK constraints would prevent in normal operation — but bulk imports sometimes bypass constraints. A 5-minute data quality check after any migration saves hours of debugging downstream when reports start showing wrong numbers.

// Part 10

LEFT JOIN vs NOT EXISTS vs NOT IN — Three Ways to Anti-Join

The anti-join (find rows with no match) can be written three ways. Each produces the same result for non-NULL values but differs in performance and NULL handling.

Method 1 — LEFT JOIN + IS NULL

Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…

Method 2 — NOT EXISTS

Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…

Method 3 — NOT IN

Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…
MethodNULL safe?PerformanceReadabilityPreferred when
LEFT JOIN + IS NULLYesGood — uses index on join keyModerateDefault choice for anti-join
NOT EXISTSYesGood — short-circuits on first matchHighReadable intent, correlated check
NOT INNo — NULLs break itWorse on large listsHighOnly when subquery is guaranteed NULL-free
⚠️ Important
NOT IN is dangerous when the subquery can return NULL values. NOT IN (1, 2, NULL) is equivalent to NOT (col = 1 OR col = 2 OR col = NULL). Since col = NULL is always NULL (never TRUE), the entire expression evaluates to NULL for every row — returning zero rows. Always use LEFT JOIN + IS NULL or NOT EXISTS when the subquery column might contain NULLs.

// Part 11

What This Looks Like at Work

You are a data analyst at Sephora, India's beauty and personal care platform. The retention team runs a monthly lapsed-customer campaign. They need three lists: (1) customers who registered but never purchased, (2) customers who purchased at least once but not in the last 90 days, and (3) products that have been added to carts but never purchased. All three are anti-join or conditional LEFT JOIN queries — adapted here for FreshCart.

10:00 AM
List 1 — registered but never purchased
Classic anti-join: customers in the customer table with no row in orders.
Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…
10:20 AM
List 2 — purchased before but lapsed (90+ days)
LEFT JOIN to get last order date, then filter for customers whose last order was 90+ days ago.
Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…
10:45 AM
List 3 — products never sold (unsold inventory)
Anti-join on products vs order_items — products with no sales record.
Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…
11:05 AM
All three lists delivered in 65 minutes
Three LEFT JOIN queries — two anti-joins and one HAVING filter — built and verified in 65 minutes. The retention team has three actionable segments: never-purchased for onboarding campaigns, lapsed for win-back campaigns, and unsold products for promotion planning.

🎯 Pro Tip

Retention analytics is almost entirely built on LEFT JOIN patterns. The questions "who hasn't done X?", "who did X but not recently?", and "what hasn't been Y?" all require keeping rows that have no match in the activity table. Build these three query patterns into your personal SQL library — you will write variations of them every week in product analytics.

// Part 12

Interview Prep — 5 Questions With Complete Answers

Q: What is a LEFT JOIN and how does it differ from INNER JOIN?

LEFT JOIN returns every row from the left table whether or not it has a matching row in the right table. For rows that have a match, right-side columns are populated with the matched values. For rows with no match, right-side columns are filled with NULL. INNER JOIN returns only rows where the join condition is satisfied in both tables — unmatched rows from either side are excluded.

The key behavioural difference: INNER JOIN filters the left table — only rows with a match survive. LEFT JOIN never filters the left table — every row appears at least once. With a customers-to-orders LEFT JOIN, every customer appears in the result. With an INNER JOIN, only customers who have placed at least one order appear.

When to choose which: use INNER JOIN when you only care about rows that have related data on both sides — orders with known customers, order items with known products. The FK constraints in a well-designed schema mean INNER JOIN and LEFT JOIN produce the same result for required relationships. Use LEFT JOIN when you need all rows from the left table regardless of whether a match exists — all products including unsold, all customers including those who have never ordered, all stores including those with no employees. The presence of NULL values in the right-side columns after LEFT JOIN carries meaning: it signals the absence of a relationship that the query was looking for.

Q: Explain the LEFT JOIN + WHERE IS NULL pattern. What does it find?

The LEFT JOIN + WHERE IS NULL pattern is called an anti-join. It finds rows in the left table that have no corresponding row in the right table. The structure is: SELECT left.* FROM left_table LEFT JOIN right_table ON left.key = right.key WHERE right.key IS NULL.

Why it works: LEFT JOIN produces NULLs in all right-side columns for every left row that found no match. The WHERE right.key IS NULL filter then keeps only those NULL rows — the unmatched ones. Rows that did match have a real key value (not NULL) and are excluded by the IS NULL filter. The result is exclusively the left rows with no right-side match.

This pattern answers "which X has no Y?" — customers who have never ordered, products that have never been sold, employees without store assignments, stores with no employees, orders with no line items. It is one of the five SQL patterns every analyst must know. The anti-join is functionally equivalent to NOT EXISTS and NOT IN (for non-NULL cases), but LEFT JOIN + IS NULL is often the most readable and performs well with proper indexing on the join column. It appears constantly in retention analysis (lapsed users), inventory management (unsold stock), and data quality checks (orphaned records).

Q: Why does putting a WHERE condition on a right-table column convert a LEFT JOIN into an INNER JOIN?

LEFT JOIN adds NULL values to right-side columns for unmatched left rows. When a WHERE condition references a right-table column, it evaluates that condition on the NULL values. In SQL's three-valued logic, any comparison with NULL returns NULL — not TRUE and not FALSE. WHERE discards all rows where the condition does not evaluate to TRUE — which includes NULL. So rows with NULL right-side columns (the unmatched rows LEFT JOIN was supposed to preserve) are discarded by the WHERE filter.

Example: customers LEFT JOIN orders — all customers appear, with NULL order columns for unordered customers. Adding WHERE o.order_status = 'Delivered' evaluates 'Delivered' = NULL for unordered customers, which returns NULL, and WHERE discards those rows. The final result contains only customers with at least one delivered order — identical to INNER JOIN.

The fix: move the right-table filter to the ON clause. ON o.order_status = 'Delivered' is applied during the join itself, before NULL rows are added. It restricts which right-table rows can match, but does not filter the left table. Customers with no delivered orders still appear — they just have NULL order columns because no delivered order matched. The decision rule: for LEFT JOIN, any condition that references right-table columns and should preserve unmatched left rows belongs in ON. Only left-table conditions and aggregate-level conditions belong in WHERE.

Q: How do you count records including groups with zero counts using LEFT JOIN?

To count records per group including groups with a count of zero, use LEFT JOIN from the group dimension table to the fact table, then COUNT the fact table's primary key (not COUNT(*)), combined with GROUP BY on the dimension table columns.

The key is COUNT(fact_table.pk) — not COUNT(*). COUNT(*) counts all rows including the NULL rows that LEFT JOIN adds for unmatched groups — it would return 1 for groups with no facts, not 0. COUNT(fact_table.pk) counts non-NULL values of the fact table's primary key. For unmatched groups, the pk is NULL, so COUNT returns 0. For matched groups, COUNT returns the number of matched rows.

Example: to count orders per store including stores with no orders — SELECT s.store_id, s.city, COUNT(o.order_id) AS order_count FROM stores AS s LEFT JOIN orders AS o ON s.store_id = o.store_id GROUP BY s.store_id, s.city. Stores with no orders have o.order_id = NULL, so COUNT(o.order_id) = 0. If you use COUNT(*) instead, stores with no orders would incorrectly show 1 (the NULL row is still a row). The same principle applies to SUM — COALESCE(SUM(o.total_amount), 0) returns 0 instead of NULL for groups with no matching rows, since SUM of an empty set returns NULL.

Q: What is the difference between NOT IN, NOT EXISTS, and LEFT JOIN IS NULL for finding non-matching rows?

All three find rows in the left set that have no match in the right set, but they differ in NULL handling, performance, and readability. NOT IN is the most dangerous: NOT IN (subquery) returns zero rows if the subquery returns any NULL value — because NOT IN (1, 2, NULL) evaluates as NOT (col = 1 OR col = 2 OR col = NULL), and since col = NULL is always NULL, the entire expression is NULL for every row. This is a silent correctness bug. Only use NOT IN when you are absolutely certain the subquery column has no NULLs and will remain NULL-free.

NOT EXISTS is NULL-safe and usually the most semantically clear — "return rows where no matching row exists." The correlated subquery checks each left row individually. Most query optimisers convert NOT EXISTS to an efficient anti-join plan internally. NOT EXISTS short-circuits — it stops looking as soon as one match is found, making it efficient when matches are common.

LEFT JOIN + IS NULL is also NULL-safe and usually the most familiar pattern for analysts who think in terms of table joins. It produces an explicit NULL for unmatched rows that the IS NULL filter then selects. Performance is generally equivalent to NOT EXISTS on modern optimisers — both are translated to anti-join plans internally. LEFT JOIN + IS NULL has one practical advantage: it allows you to also SELECT the right-table columns (to see the NULLs themselves) which is useful for debugging. NOT EXISTS cannot return right-table columns at all since the subquery's SELECT is not referenced. Preference in production code: NOT EXISTS for maximum semantic clarity, LEFT JOIN + IS NULL for analytical queries where seeing the NULL columns is useful for verification.

// Part 13

Errors You Will Hit — And Exactly Why They Happen

LEFT JOIN returns same row count as INNER JOIN — unmatched rows are missing

Cause: A WHERE condition on a right-table column is filtering out the NULL rows that LEFT JOIN preserves for unmatched left rows. WHERE o.order_status = 'Delivered' on NULL order_status evaluates to NULL, which WHERE discards. The result excludes all customers with no delivered orders — identical to INNER JOIN behaviour.

Fix: Move right-table filter conditions from WHERE to the ON clause: LEFT JOIN orders AS o ON c.customer_id = o.customer_id AND o.order_status = 'Delivered'. The AND condition in ON restricts which right rows participate in matching, but does not filter the left table. Unmatched left rows (customers with no delivered orders) appear with NULL order columns. To diagnose: compare SELECT COUNT(*) FROM customers to the count from your LEFT JOIN query — if they differ, a WHERE condition is filtering out left rows.

Anti-join (LEFT JOIN IS NULL) returns zero rows — expected some unmatched rows

Cause: The LEFT JOIN is being accidentally converted to an INNER JOIN by a WHERE condition on a right-table column (see above), so no NULL rows survive to be caught by IS NULL. Alternatively, every left-table row genuinely does have a match in the right table — the anti-join correctly returns zero rows because there are no unmatched rows.

Fix: Verify by running the LEFT JOIN without the WHERE IS NULL filter: SELECT COUNT(*) FROM left AS l LEFT JOIN right AS r ON l.key = r.key WHERE r.key IS NULL. If this returns 0 but SELECT COUNT(*) FROM left returns rows, every left row has a match — the anti-join correctly found nothing. If the counts differ but your WHERE IS NULL query still returns 0, check for a secondary WHERE condition on right-table columns that is eliminating the NULLs before IS NULL can filter them.

COUNT(*) returns 1 for groups with no matches — expected 0

Cause: You used COUNT(*) after a LEFT JOIN. COUNT(*) counts all rows, including the NULL rows LEFT JOIN adds for unmatched groups. A store with no orders gets one NULL row in the joined result — COUNT(*) counts that row and returns 1, not 0.

Fix: Use COUNT(right_table.primary_key) instead of COUNT(*): COUNT(o.order_id) returns 0 for unmatched rows because order_id is NULL and COUNT(column) ignores NULLs. COALESCE is not needed for COUNT — it already returns 0 for all-NULL groups. For SUM and AVG on unmatched groups, wrap with COALESCE: COALESCE(SUM(o.total_amount), 0) returns 0 instead of NULL.

NOT IN subquery returns zero rows — even though unmatched rows clearly exist

Cause: The NOT IN subquery returns at least one NULL value. NOT IN (1, 2, NULL) evaluates as NOT (col = 1 OR col = 2 OR col = NULL). Since col = NULL is always NULL (not TRUE), the entire OR expression can never be definitively TRUE for any col value — so NOT (NULL) = NULL for every row. WHERE discards all NULL rows, returning zero results.

Fix: Replace NOT IN with LEFT JOIN + IS NULL or NOT EXISTS — both handle NULLs correctly. If you must use NOT IN, add a WHERE filter to the subquery: WHERE subquery_column IS NOT NULL — ensuring the subquery never returns NULLs. Verify: SELECT customer_id FROM orders WHERE customer_id IS NULL — if this returns any rows, your NOT IN subquery is returning NULLs and will silently return zero rows.

LEFT JOIN with GROUP BY produces duplicate rows — customer appears multiple times

Cause: A customer has multiple orders, so LEFT JOIN creates one row per (customer, order) pair. Without GROUP BY, each order for the same customer is a separate row — this is expected behaviour. If you then GROUP BY without listing all non-aggregate SELECT columns, the GROUP BY collapses inconsistently and produces unexpected duplicates.

Fix: Add GROUP BY for all non-aggregate columns in SELECT: GROUP BY c.customer_id, c.first_name, c.last_name, c.city, c.loyalty_tier. Use aggregate functions for order-level data: COUNT(o.order_id) for order count, SUM(o.total_amount) for total spend, MAX(o.order_date) for last order date. If you genuinely want one row per customer with aggregated order data, GROUP BY is required. If you want one row per order with customer details, do not GROUP BY — the multiple rows per customer are correct.

Try It Yourself

Write three queries using LEFT JOIN: (1) A query showing all product categories with their total delivered revenue and order count — including categories with zero sales. Show category, order_count, total_revenue (0 if none), and a 'sales_status' column: 'Active' if revenue > 0, 'No Sales' if zero. (2) An anti-join finding all customers who joined in 2023 or later but have never placed a delivered order. Show customer_id, full name, email, joined_date. (3) A query showing each store's employee count — including stores with no employees (show 0). Show store_id, city, employee_count, and avg_salary (NULL is fine for empty stores).

🎯 Key Takeaways

  • LEFT JOIN keeps every row from the left table. Right-side columns are NULL for rows with no match. The left table is never filtered — every row appears at least once in the result.
  • Anti-join pattern: LEFT JOIN + WHERE right_table.pk IS NULL finds left rows with no match. This is the standard SQL idiom for "which X has no Y?" — customers with no orders, products never sold.
  • The ON vs WHERE trap: a WHERE condition on a right-table column converts LEFT JOIN to INNER JOIN by filtering out NULL rows. Move right-table filters to the ON clause to preserve unmatched left rows.
  • Left-table conditions in WHERE are always safe — they never affect NULL row preservation. Only right-table conditions in WHERE cause the trap.
  • COUNT(right_table.pk) returns 0 for unmatched groups. COUNT(*) returns 1 — it counts the NULL row. Always use COUNT(column) not COUNT(*) when counting after a LEFT JOIN with GROUP BY.
  • COALESCE wraps NULL aggregate results from LEFT JOIN: COALESCE(SUM(col), 0) and COALESCE(AVG(col), 0) replace NULL with zero for cleaner display.
  • RIGHT JOIN is LEFT JOIN with tables swapped — always rewritable as LEFT JOIN with the anchor table on the left. Prefer LEFT JOIN for readability.
  • NOT IN is dangerous when the subquery can return NULLs — it silently returns zero rows. Use LEFT JOIN + IS NULL or NOT EXISTS instead.
  • Multiple LEFT JOINs chain independently — each preserves all rows from the preceding result. INNER JOIN and LEFT JOIN can be mixed in the same query.
  • LEFT JOIN + IS NULL is the essential tool for data quality checks: orphaned records, missing references, incomplete pipeline outputs — all are found with this pattern.

What comes next

In Module 33, you learn FULL OUTER JOIN — combining both LEFT and RIGHT behaviour to keep all rows from both tables, with real use cases for reconciliation, gap analysis, and symmetric difference queries.

Module 33 → FULL OUTER JOIN
Share

Discussion

0

Have a better approach? Found something outdated? Share it — your knowledge helps everyone learning here.

Continue with GitHub
Loading...