SQL — Module 06Beginner

Filtering Rows — WHERE Clause

Get only the rows you actually need — every comparison operator, every data type, and how the database evaluates filters internally

12–16 min April 2026

Section 2 · Reading Data — SELECT

Reading Data — SELECT · 10 modulesModule 6

Your Filtering Multiple Sorting Limiting Removing Working Column Renaming Pattern

// Part 01

Why WHERE Is the Most Important Clause in SQL

In Module 05 you learned SELECT and FROM — they let you read data from a table. But SELECT without WHERE returns every single row in the table. The customers table has 20 rows. The orders table at a real company has 500 million. Returning all 500 million rows every time you need to answer a question is not just slow — it is impossible. Your screen cannot display it, your network cannot transfer it, and your database server will collapse under the load.

WHERE is the clause that tells the database: "Only give me rows that satisfy this condition." It is the difference between "show me everything" and "show me exactly what I asked for." Every useful SQL query in a production system has a WHERE clause. Mastering WHERE is mastering SQL.

Here is WHERE in its simplest form:

Loading FreshCart DB…
-- Without WHERE: all 20 customers
SELECT first_name, last_name, city, loyalty_tier
FROM customers;Ctrl + Enter to run
Loading FreshCart database in your browser…

Loading FreshCart DB…
-- With WHERE: only customers in Seattle
SELECT first_name, last_name, city, loyalty_tier
FROM customers
WHERE city = 'Seattle';Ctrl + Enter to run
Loading FreshCart database in your browser…

Same table, same columns — but the second query returns only the rows where city equals 'Seattle'. The WHERE clause evaluated every row and kept only the ones where the condition was true. Everything else was discarded before the result was returned to you.

// Part 02

How WHERE Works Internally

Understanding what the database does when it processes a WHERE clause makes you a dramatically better SQL writer. It explains why some queries are fast and others are slow — and what you can do about it.

The evaluation process — row by row

When the database executes a WHERE clause, it goes through the table row by row. For each row, it evaluates your condition. The condition produces one of three results: TRUE (the row satisfies the condition — include it), FALSE (the row does not satisfy the condition — discard it), or NULL (the condition could not be evaluated because a value is missing — also discarded). Only rows where the condition evaluates to TRUE make it into the result.

This happens before SELECT — remember from Module 05 that the execution order is FROM → WHERE → SELECT. The database identifies all rows that match the WHERE condition first, then extracts the requested columns from only those rows. This is efficient: if WHERE filters 10 million rows down to 100, SELECT only has to process those 100.

Without an index — full table scan

If the column in your WHERE clause has no index, the database must read every single page of the table, check every row, and keep only the ones where the condition is true. This is called a full table scan. On a table with 500 million rows and no index, a full table scan takes minutes to hours.

With an index — direct lookup

If the column in your WHERE clause has an index (a B-tree), the database jumps directly to the matching rows without reading the whole table. Finding 100 rows out of 500 million with an index takes milliseconds — the same query without an index takes minutes. You will learn to create indexes in Module 46. For now, know that WHERE on a primary key or a properly indexed column is always fast regardless of table size.

💡 Note

In the FreshCart playground, all tables are small (under 120 rows) so every query is fast regardless of indexes. In real production databases with millions of rows, WHERE performance depends entirely on whether the filtered column has an index. Always think about indexes when writing WHERE clauses on large tables.

// Part 03

Comparison Operators — The Six Tools of Filtering

A WHERE condition uses a comparison operator to compare a column value against a test value. There are six comparison operators in SQL, and you will use all six regularly.

Operator	Meaning	Example	Returns rows where...
=	Equal to	city = 'Seattle'	city is exactly Seattle
<> or !=	Not equal to	city <> 'Seattle'	city is anything other than Seattle
>	Greater than	unit_price > 100	unit_price is more than 100
<	Less than	unit_price < 100	unit_price is less than 100
>=	Greater than or equal to	salary >= 50000	salary is 50000 or more
<=	Less than or equal to	salary <= 50000	salary is 50000 or less

Equality — = operator

Loading FreshCart DB…
-- Find all Gold tier customers
SELECT first_name, last_name, city, loyalty_tier
FROM customers
WHERE loyalty_tier = 'Gold';Ctrl + Enter to run
Loading FreshCart database in your browser…

Loading FreshCart DB…
-- Find the order with a specific order_id
-- Primary key lookup — the fastest possible query
SELECT order_id, customer_id, order_date, total_amount, order_status
FROM orders
WHERE order_id = 1007;Ctrl + Enter to run
Loading FreshCart database in your browser…

Not equal — <> operator

Loading FreshCart DB…
-- Find all orders that were NOT delivered
-- <> means "not equal to" — same as != in most databases
SELECT order_id, order_date, order_status, total_amount
FROM orders
WHERE order_status <> 'Delivered'
ORDER BY order_date;Ctrl + Enter to run
Loading FreshCart database in your browser…

Greater than and less than

Loading FreshCart DB…
-- Products priced above ₹200
SELECT product_name, category, brand, unit_price
FROM products
WHERE unit_price > 200
ORDER BY unit_price DESC;Ctrl + Enter to run
Loading FreshCart database in your browser…

Loading FreshCart DB…
-- Employees earning at or above ₹60,000
SELECT first_name, last_name, role, salary
FROM employees
WHERE salary >= 60000
ORDER BY salary DESC;Ctrl + Enter to run
Loading FreshCart database in your browser…

Loading FreshCart DB…
-- Orders placed before February 2024
SELECT order_id, order_date, total_amount, order_status
FROM orders
WHERE order_date < '2024-02-01'
ORDER BY order_date;Ctrl + Enter to run
Loading FreshCart database in your browser…

// Part 04

Filtering Different Data Types — Numbers, Text, and Dates

WHERE works on every data type, but the syntax for each type is slightly different. Getting the syntax wrong is one of the most common beginner mistakes.

Filtering numbers — no quotes

Number values in WHERE conditions are written as-is, without quotes. The database compares the column value numerically.

Number filters — no quotes

-- Correct — numbers without quotes
WHERE unit_price = 56
WHERE total_amount > 1000
WHERE quantity <= 3
WHERE salary >= 40000

-- Wrong — numbers with quotes are treated as text
WHERE unit_price = '56'     -- might work but is incorrect practice
WHERE total_amount > '1000' -- string comparison, not numeric

Filtering text — always use single quotes

Text (VARCHAR, CHAR, TEXT) values in WHERE conditions must be wrapped in single quotes. Double quotes are used for identifiers (column names, table names) in SQL — not for string values. Forgetting this is the most common syntax error beginners make.

Text filters — always single quotes

-- Correct — text values in single quotes
WHERE city = 'Seattle'
WHERE order_status = 'Delivered'
WHERE loyalty_tier = 'Platinum'
WHERE payment_method = 'UPI'

-- Wrong — double quotes are for identifiers, not values
WHERE city = "Seattle"     -- error in PostgreSQL, might work in MySQL
WHERE city = Seattle       -- error: Seattle treated as column name

String comparison is case-sensitive

In PostgreSQL (and DuckDB, which the playground uses), string comparisons are case-sensitive. 'Seattle' is not the same as 'bangalore' or 'BANGALORE'. The WHERE condition will only return rows where the value matches exactly — including case.

Loading FreshCart DB…
-- 'Seattle' (capital B) finds rows
SELECT first_name, city FROM customers WHERE city = 'Seattle';Ctrl + Enter to run
Loading FreshCart database in your browser…

Loading FreshCart DB…
-- 'bangalore' (lowercase) finds nothing in PostgreSQL
-- Because the actual stored values have capital B
SELECT first_name, city FROM customers WHERE city = 'bangalore';Ctrl + Enter to run
Loading FreshCart database in your browser…

To do a case-insensitive comparison in PostgreSQL, use the ILIKE operator or wrap the column in LOWER():

Loading FreshCart DB…
-- Case-insensitive comparison using LOWER()
-- Works in PostgreSQL, MySQL, and DuckDB
SELECT first_name, city
FROM customers
WHERE LOWER(city) = 'bangalore';Ctrl + Enter to run
Loading FreshCart database in your browser…

Filtering dates — use ISO 8601 format

Date values in WHERE conditions are written as strings in YYYY-MM-DD format, wrapped in single quotes. Always use this format — it is unambiguous across every database and every locale. Never use DD/MM/YYYY or MM-DD-YYYY in SQL queries.

Loading FreshCart DB…
-- Orders placed on a specific date
SELECT order_id, order_date, total_amount
FROM orders
WHERE order_date = '2024-01-15';Ctrl + Enter to run
Loading FreshCart database in your browser…

Loading FreshCart DB…
-- Orders placed in January 2024
-- >= first day of month AND < first day of next month
SELECT order_id, order_date, order_status, total_amount
FROM orders
WHERE order_date >= '2024-01-01'
  AND order_date <  '2024-02-01'
ORDER BY order_date;Ctrl + Enter to run
Loading FreshCart database in your browser…

Filtering booleans

Loading FreshCart DB…
-- Products currently in stock
SELECT product_name, category, unit_price
FROM products
WHERE in_stock = true
ORDER BY unit_price;Ctrl + Enter to run
Loading FreshCart database in your browser…

Loading FreshCart DB…
-- Products out of stock
SELECT product_name, category, unit_price
FROM products
WHERE in_stock = false;Ctrl + Enter to run
Loading FreshCart database in your browser…

// Part 05

WHERE with Calculated Expressions

The condition in WHERE does not have to compare a raw column to a fixed value. You can use calculations on both sides of the comparison. The database evaluates the expression for each row and checks whether the result satisfies the condition.

Loading FreshCart DB…
-- Products where profit margin is greater than 30%
-- The calculation runs for every row
SELECT
  product_name,
  unit_price,
  cost_price,
  ROUND((unit_price - cost_price) / unit_price * 100, 1)  AS margin_pct
FROM products
WHERE (unit_price - cost_price) / unit_price > 0.30
ORDER BY margin_pct DESC;Ctrl + Enter to run
Loading FreshCart database in your browser…

Loading FreshCart DB…
-- Orders where total_amount is more than 3x the average
-- This is a subquery — you'll learn these fully in Module 36
SELECT order_id, customer_id, total_amount, order_status
FROM orders
WHERE total_amount > (SELECT AVG(total_amount) * 3 FROM orders)
ORDER BY total_amount DESC;Ctrl + Enter to run
Loading FreshCart database in your browser…

⚠️ Important

Applying functions or calculations to the column side of a WHERE condition (e.g. WHERE YEAR(order_date) = 2024) can prevent the database from using an index on that column — because the index stores raw values, not the computed results. This is called a non-SARGable condition. Where possible, rewrite calculations to the value side: WHERE order_date >= '2024-01-01' AND order_date < '2025-01-01'. You will learn this in depth in Module 57 (Query Optimisation).

// Part 06

Filtering NULL Values — IS NULL and IS NOT NULL

NULL is the absence of a value. It is not zero, not an empty string, not false — it means "we do not know" or "this information does not apply." In FreshCart's orders table, delivery_date is NULL for orders that have not been delivered yet — we do not know the delivery date because it has not happened.

NULL has a special property that surprises almost every beginner: you cannot use = to check for NULL. The expression column = NULL does not return TRUE or FALSE — it returns NULL. And remember: WHERE discards rows where the condition is NULL. So WHERE delivery_date = NULL returns zero rows — always, no matter what the data contains.

The NULL trap — = NULL never works

-- WRONG: this returns zero rows even if delivery_date is NULL
WHERE delivery_date = NULL

-- CORRECT: use IS NULL
WHERE delivery_date IS NULL

-- CORRECT: use IS NOT NULL for the opposite
WHERE delivery_date IS NOT NULL

Loading FreshCart DB…
-- Orders that have NOT been delivered yet
-- delivery_date is NULL for undelivered orders
SELECT order_id, order_date, order_status, total_amount
FROM orders
WHERE delivery_date IS NULL
ORDER BY order_date;Ctrl + Enter to run
Loading FreshCart database in your browser…

Loading FreshCart DB…
-- Orders that HAVE been delivered
-- delivery_date IS NOT NULL for delivered orders
SELECT order_id, order_date, delivery_date,
       delivery_date - order_date  AS days_taken
FROM orders
WHERE delivery_date IS NOT NULL
ORDER BY days_taken DESC;Ctrl + Enter to run
Loading FreshCart database in your browser…

Loading FreshCart DB…
-- Employees with no manager (top-level managers)
-- manager_id is NULL for employees who report to nobody
SELECT employee_id, first_name, last_name, role, store_id
FROM employees
WHERE manager_id IS NULL;Ctrl + Enter to run
Loading FreshCart database in your browser…

🎯 Pro Tip

NULL confusion causes silent data quality bugs that are extremely hard to find. Whenever you write a WHERE condition on a column that might contain NULLs, ask yourself: "What do I want to happen with NULL rows?" If you want them included, you need to explicitly handle them. If you want them excluded, your regular condition already excludes them (remember: NULL condition = row discarded). Module 11 covers NULL in full depth — this is a preview of why it matters.

// Part 07

WHERE on Different FreshCart Tables — Real Scenarios

Let us practice WHERE across all six FreshCart tables with real business questions that would come up in day-to-day work.

Customers — targeting by tier and location

Loading FreshCart DB…
-- Platinum customers — the highest value segment
-- Marketing team uses this for exclusive campaign targeting
SELECT
  first_name, last_name, city, state, email, phone
FROM customers
WHERE loyalty_tier = 'Platinum';Ctrl + Enter to run
Loading FreshCart database in your browser…

Loading FreshCart DB…
-- Customers who joined after January 2023
-- Growth team tracks new cohorts
SELECT
  first_name, last_name, city, loyalty_tier, joined_date
FROM customers
WHERE joined_date >= '2023-01-01'
ORDER BY joined_date;Ctrl + Enter to run
Loading FreshCart database in your browser…

Products — filtering by price, category, and stock

Loading FreshCart DB…
-- Budget products under ₹50 that are in stock
-- Used for discount campaign eligibility
SELECT product_name, brand, unit_price, unit, category
FROM products
WHERE unit_price < 50
  AND in_stock = true
ORDER BY unit_price;Ctrl + Enter to run
Loading FreshCart database in your browser…

Loading FreshCart DB…
-- All dairy products
-- Category management team reviews this weekly
SELECT product_name, brand, unit_price, cost_price, in_stock
FROM products
WHERE category = 'Dairy'
ORDER BY unit_price;Ctrl + Enter to run
Loading FreshCart database in your browser…

Orders — operational filtering

Loading FreshCart DB…
-- High-value orders above ₹2,000 — flagged for review
-- Finance team monitors these for fraud screening
SELECT order_id, customer_id, store_id,
       order_date, payment_method, total_amount
FROM orders
WHERE total_amount > 2000
ORDER BY total_amount DESC;Ctrl + Enter to run
Loading FreshCart database in your browser…

Loading FreshCart DB…
-- Cancelled orders — the ops team investigates these
SELECT order_id, customer_id, store_id,
       order_date, payment_method, total_amount
FROM orders
WHERE order_status = 'Cancelled'
ORDER BY order_date;Ctrl + Enter to run
Loading FreshCart database in your browser…

Employees — HR and payroll filtering

Loading FreshCart DB…
-- Employees at the Seattle stores (ST001 and ST002)
-- Store operations manager uses this for scheduling
SELECT first_name, last_name, role, salary, hire_date
FROM employees
WHERE store_id = 'ST001'
ORDER BY salary DESC;Ctrl + Enter to run
Loading FreshCart database in your browser…

Loading FreshCart DB…
-- Senior employees hired before 2021
-- Used for long-service recognition programme
SELECT first_name, last_name, role, store_id, hire_date
FROM employees
WHERE hire_date < '2021-01-01'
ORDER BY hire_date;Ctrl + Enter to run
Loading FreshCart database in your browser…

// Part 08

Common WHERE Patterns You Will Write Every Day

In real work, certain WHERE patterns appear constantly. Recognising them makes you write queries faster and more accurately.

Exact match on a primary key or ID

The most common query in any application: look up one specific record by its ID. This is always fast because primary keys are always indexed.

ID lookup — fastest possible query

-- Find one specific customer
SELECT * FROM customers WHERE customer_id = 4;

-- Find one specific order
SELECT * FROM orders WHERE order_id = 1015;

-- Find one specific product
SELECT * FROM products WHERE product_id = 21;

Date range filter

Filtering by a date range — this month, last quarter, last 30 days — appears in almost every analytics query.

Date range patterns

-- This month (January 2024)
WHERE order_date >= '2024-01-01' AND order_date < '2024-02-01'

-- Last 30 days from today (use CURRENT_DATE)
WHERE order_date >= CURRENT_DATE - INTERVAL '30 days'   -- PostgreSQL
WHERE order_date >= DATE_SUB(CURRENT_DATE, INTERVAL 30 DAY) -- MySQL

-- Between two specific dates (inclusive on both ends)
WHERE order_date >= '2024-01-15' AND order_date <= '2024-02-15'

Status filter

Almost every transactional table has a status column. Filtering by status is one of the most frequent operations.

Status filtering

-- Find all delivered orders
WHERE order_status = 'Delivered'

-- Find all open/active records (not cancelled or completed)
WHERE order_status = 'Processing'

-- Find problem orders (cancelled or returned)
WHERE order_status <> 'Delivered'

Amount threshold

Filtering by a monetary amount — high-value transactions, orders above a minimum, products in a price range — is constant in financial and e-commerce analytics.

Amount thresholds

-- High-value orders
WHERE total_amount >= 1000

-- Affordable products
WHERE unit_price <= 100

-- Products in a specific price range
WHERE unit_price >= 50 AND unit_price <= 200

// Part 09

WHERE vs HAVING — A Critical Distinction

Both WHERE and HAVING filter rows, but they operate at completely different points in query execution and on completely different things. Confusing them causes errors that can be frustrating to debug if you do not understand the distinction.

WHERE filters individual rows — before any grouping or aggregation happens. It operates on the raw column values of each row.

HAVING filters groups — after GROUP BY has aggregated rows into groups. It operates on aggregate values like COUNT(), SUM(), AVG(). You cannot use HAVING without GROUP BY.

WHERE vs HAVING — the key difference

-- WHERE: filter individual orders before counting
-- "Count only delivered orders, grouped by store"
SELECT store_id, COUNT(*) AS delivered_count
FROM orders
WHERE order_status = 'Delivered'   -- filter rows BEFORE grouping
GROUP BY store_id;

-- HAVING: filter groups after counting
-- "Show only stores that had more than 3 orders total"
SELECT store_id, COUNT(*) AS total_orders
FROM orders
GROUP BY store_id
HAVING COUNT(*) > 3;               -- filter AFTER grouping

-- Both together:
-- "Stores with more than 2 DELIVERED orders"
SELECT store_id, COUNT(*) AS delivered_count
FROM orders
WHERE order_status = 'Delivered'   -- filter rows first
GROUP BY store_id
HAVING COUNT(*) > 2;               -- then filter groups

You will learn GROUP BY and HAVING in full depth in Modules 28 and 29. For now, remember this one rule: WHERE for rows, HAVING for groups.

// Part 10

Performance — Writing WHERE Clauses That Stay Fast

A WHERE clause that works correctly but runs slowly is still a problem in production. Here are the most important performance rules for WHERE — you will learn the full details in Module 57, but these habits should start now.

Filter on indexed columns

Primary keys are always indexed. Other columns may or may not have indexes — that depends on how the database was designed. When you write WHERE on a column that has an index, the database does an index lookup (fast). When you write WHERE on an unindexed column, the database does a full table scan (slow on large tables). Ask your team which columns are indexed when joining a new project.

Avoid functions on the left side of WHERE

Applying a function to a column in WHERE prevents index usage. The database's index stores raw column values — not the result of functions applied to them.

SARGable vs non-SARGable WHERE conditions

-- SLOW: function on the column side — cannot use index on order_date
WHERE YEAR(order_date) = 2024
WHERE MONTH(order_date) = 1
WHERE UPPER(city) = 'BANGALORE'

-- FAST: equivalent conditions that CAN use the index
WHERE order_date >= '2024-01-01' AND order_date < '2025-01-01'
WHERE order_date >= '2024-01-01' AND order_date < '2024-02-01'
WHERE city = 'Seattle'   -- or use a case-insensitive index

Put the most selective filter first when possible

The most selective filter is the one that eliminates the most rows. In most databases the query optimiser figures this out automatically, but writing your most restrictive condition first makes the query easier for humans to read and reason about — the first condition narrows the data most aggressively.

Most selective filter first

-- Better: order_id is the primary key — eliminates all but 1 row immediately
WHERE order_id = 1007
  AND order_status = 'Delivered'

-- Less ideal order — status filter is less selective than the ID filter
WHERE order_status = 'Delivered'
  AND order_id = 1007

🎯 Pro Tip

Every WHERE clause you write in a production environment should be accompanied by a mental question: "Does this column have an index?" If you are filtering on a column that millions of rows will be scanned against and there is no index, the query will be slow. The fix is adding an index — which you will learn to do in Module 46. For now, build the habit of asking the question.

// Part 11

What This Looks Like at Work

You are an analyst at Shopify, a Seattle-based social commerce platform. It is 2 PM on a Thursday. The customer support team sends an urgent message — a customer is complaining that their order was marked delivered but they never received it. They need you to pull everything about this specific order immediately.

2:05 PM

First query — find the order

The support team gives you order ID 9847321. You open your SQL client and run the single most useful query in any investigation: a primary key lookup.

Step 1 — primary key lookup

-- Find the specific order
SELECT *
FROM orders
WHERE order_id = 9847321;

2:06 PM

You see the problem

The order shows status = 'Delivered' and delivery_date = '2024-03-15', but the customer says they never got it. You need to see the customer's details and which store fulfilled it.

Step 2 — get customer and store context

-- Find this customer's other recent orders
-- to see if there is a pattern
SELECT order_id, order_date, delivery_date,
       order_status, total_amount, store_id
FROM orders
WHERE customer_id = 12847
  AND order_date >= '2024-01-01'
ORDER BY order_date DESC;

2:10 PM

Pattern found

You see that this customer has had 3 'Delivered' orders in the last month — all from the same store (ST007). You check that store's recent delivery data.

Step 3 — check the store's delivery pattern

-- All 'Delivered' orders from store ST007 in the last 2 weeks
-- Are there more complaints from this store?
SELECT order_id, customer_id, order_date,
       delivery_date, total_amount
FROM orders
WHERE store_id = 'ST007'
  AND order_status = 'Delivered'
  AND delivery_date >= '2024-03-08'
ORDER BY delivery_date DESC;

2:18 PM

Escalation with data

You find 12 orders from ST007 all marked delivered in the same 2-day window. You pull the list, attach it to the support ticket, and escalate to the operations team with a clear note: "Pattern of potentially false delivery marks from ST007, 12 orders affected, dates 2024-03-14 to 2024-03-15." What started as one customer complaint turned into a store-level operations issue — found in 13 minutes with three WHERE queries.

🎯 Pro Tip

This is real SQL work. Not complex analytics — just precise, targeted WHERE queries that narrow from "the whole database" to "the exact rows that explain the problem." The investigation above uses only what you have learned in Modules 01–06. This is how much SQL you need to be genuinely useful on day one.

// Part 12

Interview Prep — 5 Questions With Complete Answers

Q: What is the WHERE clause and when is it executed relative to SELECT?

The WHERE clause filters rows in a SQL query — it specifies conditions that each row must satisfy to be included in the result. Only rows where the WHERE condition evaluates to TRUE are passed forward. Rows where the condition is FALSE or NULL are discarded.

WHERE is executed before SELECT in the query execution order. The full execution order is: FROM (identify the source table), WHERE (filter rows), GROUP BY (group filtered rows), HAVING (filter groups), SELECT (compute output columns from remaining rows), ORDER BY (sort), LIMIT (cut to n rows). This sequence has important consequences: you cannot reference a SELECT alias in a WHERE clause because WHERE runs before SELECT and the alias does not yet exist. You cannot use aggregate functions (COUNT, SUM, AVG) in WHERE because aggregation happens after WHERE — use HAVING instead for filtering aggregate results.

Performance-wise, WHERE is the most important clause for query speed. An effective WHERE condition on an indexed column can reduce the rows the database processes from 500 million to 100 — a 5-million-fold reduction in work before SELECT, ORDER BY, or any other clause runs.

Q: Why can you not use = NULL to check for NULL values? What should you use instead?

NULL represents the absence of a value — it means "unknown" or "not applicable." In SQL, any comparison involving NULL returns NULL, not TRUE or FALSE. This is called three-valued logic. The expression column = NULL does not return TRUE for rows where column is NULL — it returns NULL. And the WHERE clause discards rows where the condition result is NULL (not just FALSE). So WHERE column = NULL will always return zero rows, regardless of what the data contains.

This behaviour is technically correct but surprises almost every SQL beginner. NULL means "we don't know the value" — and "we don't know" = "we don't know" does not logically produce a definitive TRUE answer, so SQL returns NULL instead.

To check for NULL, SQL provides the IS NULL and IS NOT NULL operators. These are specifically designed to handle the three-valued logic of NULL: IS NULL returns TRUE when the value is NULL, and IS NOT NULL returns TRUE when the value is anything other than NULL. WHERE delivery_date IS NULL finds all rows where delivery_date has not been set. WHERE delivery_date IS NOT NULL finds all rows where it has been set to a real date. These are the only correct ways to filter NULL values.

Q: What is the difference between WHERE and HAVING?

WHERE and HAVING both filter data, but they operate at different stages of query execution and on different things. WHERE filters individual rows before any grouping or aggregation. It operates on raw column values and runs early in the execution pipeline — immediately after FROM. HAVING filters groups after GROUP BY has aggregated rows, and it operates on aggregate values like COUNT(), SUM(), AVG(). HAVING runs after SELECT in the execution order.

A concrete example: to find the total revenue per city, but only for cities with more than ₹5,000 in delivered orders, you need both. WHERE order_status = 'Delivered' filters individual rows before the city grouping. HAVING SUM(total_amount) > 5000 filters the city groups after their revenue has been computed. You cannot use WHERE SUM(total_amount) > 5000 because SUM() does not exist yet when WHERE runs.

Performance difference: WHERE filters early and reduces the number of rows that GROUP BY must process. HAVING filters late after all aggregation is done. For performance, always move filtering to WHERE when possible and use HAVING only for conditions that genuinely require aggregate values. If you can express a filter in WHERE instead of HAVING, always do — it reduces the work done by GROUP BY.

Q: What is a full table scan and how does the WHERE clause affect it?

A full table scan is when the database reads every page of a table from start to finish to find rows that match a WHERE condition. On a small table this is fine. On a table with 500 million rows stored across thousands of disk pages, a full table scan can take minutes or longer — and blocks other queries from running efficiently.

Whether WHERE causes a full table scan depends on whether the filtered column has an index. If the column has a B-tree index, the database uses the index to jump directly to the matching rows without reading the whole table — an index lookup. This turns a minutes-long full scan into a sub-millisecond operation. If the column has no index, the database has no choice but to read every row.

WHERE conditions that prevent index usage include: applying a function to the column (WHERE UPPER(city) = 'BANGALORE' cannot use an index on city), using OR conditions across different columns without a composite index, and using leading wildcards in LIKE (WHERE email LIKE '%gmail.com' cannot use an index because the prefix is unknown). The solution for function-wrapped columns is to rewrite the condition: WHERE city = 'Seattle' (case-sensitive, uses index) or add a functional index on UPPER(city). Understanding when WHERE triggers a full scan versus an index lookup is the foundation of query optimisation.

Q: How does string comparison work in WHERE? Give an example of a case-sensitivity issue.

String comparison in WHERE checks whether a column value matches a specified string. In PostgreSQL, SQLite, and DuckDB (used in the Chaduvuko playground), string comparisons are case-sensitive by default — the exact characters must match including capitalisation. 'Seattle' does not equal 'bangalore' or 'BANGALORE'. In MySQL, string comparisons are case-insensitive by default because MySQL's default collation (utf8mb4_general_ci, where ci stands for case-insensitive) treats uppercase and lowercase as equivalent.

A real example of a case-sensitivity issue: an analyst queries WHERE city = 'bangalore' on a PostgreSQL database where cities are stored as 'Seattle' (capital B). The query returns zero rows — not because there are no customers in Seattle, but because 'bangalore' and 'Seattle' are treated as different strings. The analyst might incorrectly conclude the data is missing.

Solutions for case-insensitive matching: use LOWER(city) = 'bangalore' to convert both sides to lowercase before comparing — this works in all databases but prevents index usage on city. In PostgreSQL, use the ILIKE operator (case-insensitive LIKE) as an alternative: WHERE city ILIKE 'bangalore'. For production systems where case-insensitive search is frequent, create a functional index on LOWER(city) so that WHERE LOWER(city) = 'bangalore' can use the index. The safest long-term practice is to standardise data capitalisation on insert — store all cities as 'Seattle' consistently, then case-sensitive comparison is always correct and fast.

// Part 13

Errors You Will Hit — And Exactly Why They Happen

ERROR: operator does not exist: integer = text — WHERE customer_id = '1'

Cause: You are comparing a numeric column (customer_id is INTEGER) with a string value ('1' in quotes). PostgreSQL is strict about type matching — it will not silently convert '1' to 1 for comparison. This happens when you paste code from a tool that adds quotes around all values regardless of type, or when column types are not what you expect.

Fix: Remove the quotes from numeric values: WHERE customer_id = 1 — no quotes. PostgreSQL will not implicitly cast a string to an integer in a comparison. If you are generating queries dynamically in code, use parameterised queries with the correct Python/JS type (integer, not string) for the value. MySQL is more permissive and will silently cast '1' to 1 — which is why the same code works in MySQL but fails in PostgreSQL.

ERROR: invalid input syntax for type date: '15-01-2024' — WHERE order_date = '15-01-2024'

Cause: The date string is in DD-MM-YYYY format but PostgreSQL (and most databases) expect YYYY-MM-DD (ISO 8601 format). PostgreSQL tries to parse the string as a date and fails because '15' is not a valid year in the YYYY position of the default format. This is an extremely common error when dates come from a spreadsheet, a user form, or a non-technical source.

Fix: Always use YYYY-MM-DD format in SQL queries: WHERE order_date = '2024-01-15'. If your data is stored in a different format, convert it at insertion time. If you must parse a non-standard format at query time, use TO_DATE() in PostgreSQL: WHERE order_date = TO_DATE('15-01-2024', 'DD-MM-YYYY'). In MySQL use STR_TO_DATE(): WHERE order_date = STR_TO_DATE('15-01-2024', '%d-%m-%Y').

Query returns 0 rows — expected to find data

Cause: This is not a database error but a logic error — your WHERE condition does not match any rows even though you expect it to. Most common causes: string value is wrong case (WHERE city = 'bangalore' but data stores 'Seattle'), extra whitespace in the stored value (WHERE email = 'aisha@gmail.com' but stored as ' aisha@gmail.com' with a leading space), using = NULL instead of IS NULL, or the value genuinely does not exist in the table.

Fix: Debug systematically. First, run SELECT * FROM table LIMIT 10 to see actual stored values. Check case and whitespace carefully. If checking for NULL, use IS NULL not = NULL. Use LOWER() on both sides to rule out case issues: WHERE LOWER(city) = LOWER('Seattle'). Use TRIM() to rule out whitespace: WHERE TRIM(email) = 'aisha@gmail.com'. Once you find the exact stored value, update your WHERE condition to match it precisely.

ERROR: column 'total' does not exist — WHERE total > 1000 (when SELECT has AS total)

Cause: You gave a computed column an alias (AS total) in SELECT and then tried to use that alias in WHERE. This fails because WHERE executes before SELECT in SQL's logical processing order — the alias 'total' does not exist yet when WHERE is evaluated. This is one of the most common SQL logic errors for beginners who expect aliases to be available throughout the query.

Fix: Repeat the expression in WHERE instead of using the alias: WHERE total_amount > 1000 (use the original column name), or WHERE (unit_price - cost_price) > 100 (repeat the calculation). Alternatively, wrap your query in a subquery or CTE (covered in Modules 36 and 55) where the inner query defines the alias and the outer query filters on it: SELECT * FROM (SELECT *, unit_price - cost_price AS margin FROM products) t WHERE margin > 100. Note: ORDER BY is the one clause where aliases ARE available, because ORDER BY executes after SELECT.

WARNING: 0 rows affected by UPDATE — WHERE customer_id = 999

Cause: This is not an error but a critical warning to understand. An UPDATE or DELETE with a WHERE clause that matches zero rows executes successfully — the database reports 0 rows affected. This happens when: the ID does not exist, the value has a typo, or the condition is too restrictive. The danger is in the opposite case: a WHERE clause that is less restrictive than intended can affect many more rows than you expected, including wrong rows.

Fix: Before running any UPDATE or DELETE with a WHERE clause, always run the equivalent SELECT with the same WHERE to see which rows will be affected: SELECT * FROM customers WHERE customer_id = 999 — if this returns 0 rows, your UPDATE will also affect 0 rows. Fix the condition before running the destructive operation. As a habit: SELECT first, then UPDATE or DELETE. Many SQL clients have a safe mode that requires you to explicitly confirm bulk operations.

Try It Yourself

The FreshCart operations team needs a report of all high-value delivered orders placed in February 2024 — specifically orders with a total_amount above ₹800 that were successfully delivered. Show the order_id, order_date, delivery_date, payment_method, and total_amount. Sort by total_amount descending.

🎯 Key Takeaways

✓WHERE filters rows by evaluating a condition for each row. Only rows where the condition evaluates to TRUE are included. FALSE and NULL both cause the row to be discarded.
✓WHERE executes before SELECT in the logical execution order: FROM → WHERE → GROUP BY → HAVING → SELECT → ORDER BY → LIMIT. This is why SELECT aliases cannot be used in WHERE.
✓The six comparison operators: = (equal), <> or != (not equal), > (greater than), < (less than), >= (greater than or equal), <= (less than or equal). All work on numbers, text, and dates.
✓Numbers in WHERE have no quotes: WHERE salary > 50000. Text values always use single quotes: WHERE city = 'Seattle'. Dates use single quotes in YYYY-MM-DD format: WHERE order_date >= '2024-01-01'.
✓String comparison is case-sensitive in PostgreSQL and DuckDB. WHERE city = 'bangalore' finds nothing if values are stored as 'Seattle'. Use LOWER() or ILIKE for case-insensitive matching.
✓You cannot use = NULL to check for NULL values — any comparison involving NULL returns NULL, not TRUE. Always use IS NULL and IS NOT NULL instead.
✓WHERE filters rows. HAVING filters groups after GROUP BY. You cannot use aggregate functions (COUNT, SUM, AVG) in WHERE — use HAVING for that.
✓WHERE on an indexed column triggers a fast index lookup. WHERE on an unindexed column triggers a slow full table scan. Avoid applying functions to the column side of WHERE conditions — it prevents index usage.
✓The debug workflow for unexpected zero results: run SELECT * FROM table LIMIT 10 to see actual stored values, check for case and whitespace mismatches, verify you are using IS NULL not = NULL.
✓Before any UPDATE or DELETE, always run the equivalent SELECT with the same WHERE to confirm which rows will be affected. Never run a destructive operation without first seeing the result of the equivalent SELECT.

What comes next

In Module 07, you combine multiple WHERE conditions using AND, OR, and NOT — turning simple single-condition filters into precise multi-condition queries that can answer complex business questions in a single statement.

Module 07 → Multiple Conditions — AND, OR, NOT

Discussion

Have a better approach? Found something outdated? Share it — your knowledge helps everyone learning here.

Continue with GitHub