SQL — Module 02Beginner

How Databases Work

Tables, rows, columns, data types, primary keys, foreign keys, constraints — the complete building blocks

10–14 min April 2026

Section 1 · Databases & Setup

Databases & Setup · 4 modulesModule 2

// Part 01

What Actually Happens When You Run a Query

Before learning the building blocks of a database, it helps to understand what a database is actually doing every time you interact with it. Most tutorials skip this and jump straight to CREATE TABLE. That creates a mental model of a database as a passive container — a box you put data into and take data out of. The reality is far more active, and understanding it will make every concept in this course easier to grasp.

Here is the complete journey of a single query — what happens in the milliseconds between you pressing Enter and the result appearing on your screen.

Parser — Read and validate the SQL

The DBMS first reads your SQL text and checks it for syntax errors. If you wrote SLECT instead of SELECT, it stops here and returns a syntax error. No data is touched. The parser also breaks the query into a parse tree — a structured representation of what you asked for.

Query Planner — Figure out the best way to answer

This is the most intelligent part of the database. The query planner looks at your query and decides how to execute it — which indexes to use, in what order to join tables, whether to scan the whole table or jump to specific pages. It generates multiple execution plans, estimates the cost of each one, and picks the cheapest. A badly written query can trick the planner into choosing a slow plan — which is why understanding the internals makes you a better SQL writer.

Executor — Run the chosen plan

The executor carries out the plan the query planner chose. It reads pages from the buffer pool (memory) or from disk if they are not already cached, applies filters, joins rows from different tables, applies functions, and assembles the result set.

Buffer Pool — Serve from memory when possible

The buffer pool is the database's memory cache. Frequently accessed pages of data live here so they do not need to be read from disk on every query. If the page you need is in the buffer pool, the query is served from RAM — extremely fast. If not, the database fetches it from disk — much slower. This is why your second run of the same query is often faster than the first.

Storage Engine — Read and write to disk

Underneath everything is the storage engine — the component that physically reads and writes data pages on disk. Different databases use different storage engines with different trade-offs. MySQL's default is InnoDB. PostgreSQL uses its own built-in engine. The storage engine also manages the Write-Ahead Log — writing every change to the log before writing to the data pages, ensuring crash safety.

Transaction Manager — Enforce ACID

Every query runs inside a transaction, even if you did not explicitly write BEGIN. The transaction manager ensures isolation — other sessions cannot see your in-progress changes — and atomicity — your changes either fully commit or fully roll back. It uses locking mechanisms to coordinate concurrent sessions safely.

💡 Note

You do not need to memorise all six steps. What matters is the mental model: SQL is not directly touching files on disk. There is a sophisticated engine between your query and the data — parsing, planning, caching, and enforcing rules at every step. When a query is slow, the problem is almost always in step 2 (the planner chose a bad plan) or step 4 (too many disk reads, not enough in the buffer pool).

// Part 02

Tables — The Complete Rules

A table is the fundamental unit of storage in a relational database. Every piece of data you ever store lives in a table. Understanding what makes a good table — and what makes a bad one — is the most important design skill in all of SQL. Bad table design causes problems that no amount of clever querying can fix.

Rule 1: One table, one thing

The single most important rule in database design: each table stores exactly one type of entity. FreshCart has a customers table that stores only customers. A products table that stores only products. An orders table that stores only orders. You would never put customer data and product data in the same table, even though both are used when a customer buys a product.

When beginners design databases for the first time they often try to put everything in one table — one row per order, with the customer's name, address, and loyalty tier repeated in every single row. This is called data redundancy and it causes serious problems: if Aisha Khan moves from Seattle to Austin, you have to update her address in every single order row. Miss one row and your data is inconsistent. In a properly designed database you update her address in exactly one place — the customers table — and every order automatically reflects it through the foreign key relationship.

Rule 2: Every column stores exactly one piece of information

Each column must represent one atomic, indivisible piece of data. A column called address that stores "12 Koramangala, Seattle, 560034" is a bad design — city, street, and zip_code are three different pieces of information crammed into one column. When you later want to find all customers in Seattle, you would have to use string pattern matching on the whole address, which is slow and error-prone. Correct design separates them: street, city, state, zip_code — four columns, four clean pieces of data.

Rule 3: Table and column names must be clear, lowercase, and use underscores

The universal convention for SQL tables and columns: all lowercase letters, words separated by underscores, no spaces, no special characters. customers not Customers. order_date not OrderDate or orderDate. unit_price not UnitPrice or unitprice. This matters because SQL is case-sensitive in some databases for table names, and mixed-case names require quoting everywhere — adding noise to every query you write.

The FreshCart tables examined

Table	What one thing it stores	Why it is separate
customers	One row per person who shops at FreshCart	Customer details (name, city, tier) should only exist in one place — update once, reflected everywhere
orders	One row per purchase transaction	An order has its own attributes (date, status, total) that are not about the customer or the products inside it
order_items	One row per product line within an order	One order can have many products — storing them in the orders table would require an unknown number of columns
products	One row per thing FreshCart sells	Product details (name, price, category) should not be repeated in every order_items row — update once in products
stores	One row per physical FreshCart location	Store details (city, manager, target) are independent of any specific order or customer
employees	One row per person who works at FreshCart	Staff data (salary, role, hire date) is sensitive and operationally separate from customer or product data

// Part 03

Data Types — Choosing the Right Container for Every Column

When you create a table, every column must be given a data type — a declaration of what kind of data that column will hold. This is not just a label. The database uses the data type to: allocate the right amount of storage space, enforce that only valid values can be inserted, and choose the right comparison rules when you filter or sort.

Choosing the wrong data type is a mistake you cannot fix cheaply. Changing a column from VARCHAR(50) to VARCHAR(255) on a table with 500 million rows can take hours and lock the table. Getting it right at design time is worth the extra five minutes of thought.

Type	Storage	Example value	Use in FreshCart
INTEGER	4 bytes	1, 42, 1001	customer_id, product_id, order_id — any whole number ID or count
BIGINT	8 bytes	9876543210	phone numbers, large counters — when INTEGER's max of ~2 billion is not enough
DECIMAL(10,2)	variable	340.00, 28.50	unit_price, total_amount — money. NEVER use FLOAT for money — floating point arithmetic introduces rounding errors
VARCHAR(n)	up to n chars	'Aisha', 'Seattle'	first_name, city, product_name — variable-length text up to n characters
CHAR(n)	exactly n chars	'ST001'	store_id — fixed-length codes where every value is always the same length
TEXT	unlimited	long descriptions	product descriptions, notes — when you cannot predict maximum length. Slower to index than VARCHAR
DATE	3 bytes	2024-01-05	order_date, joined_date, hire_date — date without time, stored as YYYY-MM-DD
TIMESTAMP	8 bytes	2024-01-05 14:32:11	created_at, updated_at — exact moment in time including hours, minutes, seconds
BOOLEAN	1 byte	true, false	in_stock — yes/no flags. MySQL stores as TINYINT(1), PostgreSQL has a native BOOL type

The money mistake — why FLOAT is wrong for prices

This is one of the most common beginner mistakes. If you define a price column as FLOAT or DOUBLE, you will eventually get values like 28.499999999998 instead of 28.5. This happens because floating-point numbers are stored in binary and cannot represent most decimal fractions exactly — the same way 1/3 cannot be written as a finite decimal. For currency you must always use DECIMAL(precision, scale). In FreshCart, DECIMAL(10,2) means up to 10 total digits with exactly 2 decimal places — sufficient for prices up to ₹99,999,999.99.

VARCHAR vs CHAR — when to use which

VARCHAR(n) stores variable-length strings — it only uses as much space as the actual string needs, plus 1–2 bytes to record the length. A VARCHAR(100) column storing the word "Amul" uses 5 bytes, not 100. CHAR(n) stores fixed-length strings — it always uses exactly n bytes, padding shorter values with spaces. Use CHAR for values that are always the same length: country codes (IN, US), store IDs (ST001), status codes. Use VARCHAR for everything else.

⚠️ Important

In MySQL, TEXT columns cannot be indexed directly (only the first n characters can be indexed). If you need to search or sort on a column, use VARCHAR with an appropriate limit, not TEXT. PostgreSQL does not have this limitation, but the convention of using VARCHAR for indexable columns is still good practice across all databases.

Loading FreshCart DB…
-- See the data types in action — FreshCart products
-- Notice: prices are clean decimals, dates are formatted,
-- booleans show as true/false
SELECT product_name,
       unit_price,
       cost_price,
       unit_price - cost_price AS margin,
       in_stock
FROM products
ORDER BY margin DESC
LIMIT 8;Ctrl + Enter to run
Loading FreshCart database in your browser…

// Part 04

Primary Keys — Every Row's Unique Identity

A primary key is a column (or combination of columns) that uniquely identifies every row in a table. It is the most fundamental constraint in all of SQL. Every table you will ever create should have one.

What the database enforces automatically

When you declare a column as the PRIMARY KEY, the database automatically enforces two rules that you can never break: uniqueness — no two rows can have the same primary key value, and NOT NULL — the primary key column can never be empty. You do not need to write separate constraints for these. The PRIMARY KEY declaration implies both.

The database also automatically creates an index on the primary key column. This means lookups by primary key — the most common type of lookup in any application — are always fast, regardless of how large the table gets.

Auto-increment — let the database generate IDs

In almost every table you build, the primary key will be an auto-incrementing integer. This means the database generates the next value automatically whenever you insert a new row. In MySQL this is done with AUTO_INCREMENT. In PostgreSQL it is done with SERIAL or GENERATED ALWAYS AS IDENTITY. You never manually type a primary key value — the database handles it.

MySQL syntax

CREATE TABLE customers (
  customer_id  INTEGER       PRIMARY KEY AUTO_INCREMENT,
  first_name   VARCHAR(100)  NOT NULL,
  last_name    VARCHAR(100)  NOT NULL,
  email        VARCHAR(255)  NOT NULL UNIQUE,
  city         VARCHAR(100),
  loyalty_tier VARCHAR(20)   DEFAULT 'Bronze',
  joined_date  DATE          NOT NULL
);

PostgreSQL syntax

CREATE TABLE customers (
  customer_id  INTEGER       PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
  first_name   VARCHAR(100)  NOT NULL,
  last_name    VARCHAR(100)  NOT NULL,
  email        VARCHAR(255)  NOT NULL UNIQUE,
  city         VARCHAR(100),
  loyalty_tier VARCHAR(20)   DEFAULT 'Bronze',
  joined_date  DATE          NOT NULL
);

Composite primary keys — when one column is not enough

Sometimes a single column is not sufficient to uniquely identify a row, and you need a combination of two columns to serve as the primary key. This is called a composite primary key. The order_items table is a good example: item_id is a dedicated surrogate key here, but you could also uniquely identify each item by the combination of (order_id, product_id) — one order cannot contain the same product twice. Composite primary keys are common in junction tables (many-to-many relationships). In FreshCart we use a surrogate integer key for simplicity, but both approaches are correct.

🎯 Pro Tip

Always use a dedicated surrogate integer key (an auto-increment ID) as the primary key, even when another column or combination of columns could uniquely identify rows. Phone numbers change. Email addresses change. Business rules change. An integer ID never needs to change, which means every foreign key referencing it also never needs to change. This saves enormous pain when requirements evolve.

// Part 05

Foreign Keys — How Tables Connect and Why It Matters

A foreign key is a column in one table that stores the primary key value of a row in another table. It creates an enforced, permanent link between the two tables. Foreign keys are the mechanism that makes a database "relational" in practice — they are what allow you to ask questions that span multiple tables.

What referential integrity means

When you declare a foreign key, the database enforces a rule called referential integrity: every foreign key value must either match an existing primary key value in the referenced table, or be NULL. This has two practical consequences. First, you cannot insert an order with customer_id = 500 if no customer with id 500 exists — the database rejects the insert with an error. Second, you cannot delete a customer who still has orders — the database rejects the delete because it would leave orphaned orders pointing to nobody.

ON DELETE behaviour — what happens when the parent is deleted

You can control what happens to child rows when their parent is deleted by specifying a behaviour on the foreign key constraint:

Option	What happens	When to use
RESTRICT (default)	Prevents deleting the parent if any child rows exist. The delete fails with an error.	Most situations — you want to know before deleting a customer that they have 8 orders.
CASCADE	Automatically deletes all child rows when the parent is deleted.	Dependent data that has no meaning without the parent — e.g. deleting a user account deletes their session tokens.
SET NULL	Sets the foreign key column to NULL in all child rows.	When the relationship becomes optional — e.g. an employee's manager is deleted but the employee stays.
SET DEFAULT	Sets the foreign key column to its default value in all child rows.	Rare. When child rows should fall back to a default parent.

The FreshCart foreign key map in code

Every foreign key in the FreshCart schema

-- orders → customers
FOREIGN KEY (customer_id) REFERENCES customers(customer_id)

-- orders → stores
FOREIGN KEY (store_id) REFERENCES stores(store_id)

-- order_items → orders
FOREIGN KEY (order_id) REFERENCES orders(order_id)

-- order_items → products
FOREIGN KEY (product_id) REFERENCES products(product_id)

-- employees → stores
FOREIGN KEY (store_id) REFERENCES stores(store_id)

-- employees → employees (self-referencing: manager is also an employee)
FOREIGN KEY (manager_id) REFERENCES employees(employee_id)

Notice the last one: employees.manager_id references employees.employee_id — a table referencing itself. This is called a self-referencing foreign key and it is how org charts and hierarchy data are stored. Priya Sharma (employee_id = 1) is the Store Manager with no manager above her (manager_id = NULL). Rahul Verma (employee_id = 2) reports to Priya, so his manager_id = 1. You will learn to query this hierarchy using SELF JOINs in Module 34.

// Part 06

Constraints — Rules the Database Enforces Automatically

Constraints are rules you attach to columns (or tables) that the database checks on every INSERT and UPDATE. If a value violates a constraint, the operation fails with an error — the data is never saved. Constraints are your first line of defence against bad data. They are far more reliable than checking validity in application code, because application code can have bugs, can be bypassed, and does not cover every path data enters through.

NOT NULL — the column must always have a value

By default, any column can contain NULL — the absence of a value. Adding NOT NULL means the column must always have a real value. Inserting a row without providing a value for a NOT NULL column causes an error. In FreshCart, first_name, last_name, and order_date are NOT NULL — it makes no sense to have a customer without a name or an order without a date.

UNIQUE — no two rows can have the same value

The UNIQUE constraint prevents duplicate values in a column across all rows. In FreshCart, the email column on customers is UNIQUE — no two customers can share the same email address. Unlike PRIMARY KEY, a UNIQUE column can contain NULL (and multiple NULLs are allowed, since NULL is not equal to NULL in SQL — more on this surprising behaviour in Module 11).

DEFAULT — the value used when nothing is provided

The DEFAULT constraint specifies what value the database should use when a row is inserted without providing a value for that column. In FreshCart, loyalty_tier has a DEFAULT of 'Bronze' — when a new customer is added without specifying their tier, they automatically start at Bronze. Defaults keep INSERT statements cleaner and reduce the chance of NULL slipping into columns that should always have a value.

CHECK — custom rule for valid values

The CHECK constraint lets you define an arbitrary condition that every value in a column must satisfy. For example, discount_pct in order_items should always be between 0 and 100. A CHECK constraint enforces this: CHECK (discount_pct >= 0 AND discount_pct <= 100). If you try to insert a discount of 150 or -5, the database rejects it immediately.

All constraints on the FreshCart orders table

CREATE TABLE orders (
  order_id       INTEGER        PRIMARY KEY AUTO_INCREMENT,
  customer_id    INTEGER        NOT NULL,
  store_id       VARCHAR(10)    NOT NULL,
  order_date     DATE           NOT NULL,
  delivery_date  DATE,                          -- nullable: NULL until delivered
  order_status   VARCHAR(20)    NOT NULL
                 DEFAULT 'Processing'
                 CHECK (order_status IN
                   ('Delivered','Processing','Cancelled','Returned')),
  payment_method VARCHAR(20)    NOT NULL
                 CHECK (payment_method IN
                   ('Credit Card','Debit Card','Apple Pay','Cash')),
  total_amount   DECIMAL(10,2)  NOT NULL
                 CHECK (total_amount >= 0),

  FOREIGN KEY (customer_id) REFERENCES customers(customer_id),
  FOREIGN KEY (store_id)    REFERENCES stores(store_id)
);

🎯 Pro Tip

Constraints are your data quality layer at the database level. They work regardless of which application, script, or person inserts data. A bug in your app might skip a validation check. A direct INSERT from the terminal bypasses all app-level validation. Constraints at the database level catch everything.

// Part 07

The FreshCart Schema — Every Table Examined Line by Line

Now that you understand tables, data types, primary keys, foreign keys, and constraints — let us look at the full FreshCart schema in one place. You will use this schema for every query in every module. Read it carefully and notice how every design decision reflects the rules covered in this module.

Loading FreshCart DB…
-- Explore the customers table structure
-- Every column, its type, and what it holds
SELECT customer_id, first_name, last_name,
       email, city, loyalty_tier, joined_date
FROM customers
ORDER BY joined_date
LIMIT 10;Ctrl + Enter to run
Loading FreshCart database in your browser…

Loading FreshCart DB…
-- See how orders connect to customers and stores
-- The customer_id and store_id columns are foreign keys
SELECT o.order_id,
       c.first_name || ' ' || c.last_name AS customer_name,
       o.store_id,
       o.order_date,
       o.order_status,
       o.payment_method,
       o.total_amount
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id
ORDER BY o.order_date
LIMIT 10;Ctrl + Enter to run
Loading FreshCart database in your browser…

Notice how the second query uses JOIN to combine orders with customer names. The order table only stores customer_id — a number. To get the actual name, you join to the customers table using the foreign key relationship. This is the relational model working as designed. You will learn exactly how to write JOINs — including multiple-table JOINs — in Modules 30 through 35.

// Part 08

What This Looks Like at Work

You are hired as a junior backend developer at a Seattle fintech startup. On your second week, you are asked to review a new table that a colleague designed and give feedback before it goes to production. Here is what that day looks like.

10:00 AM

Schema review request arrives

Your team lead shares a PR in GitHub. It adds a new table called user_transactions to the database. She asks you to review the schema before it is merged. You open the file.

10:15 AM

You spot the first problem

The table stores user_name, user_email, and user_phone in the transactions table — repeated for every transaction. You immediately recognise this as a violation of the "one table, one thing" rule. If a user updates their phone number, thousands of transaction rows would need to be updated. You add a comment: "User details should live in a users table. transactions should only store user_id as a foreign key."

10:30 AM

You spot the second problem

The amount column is defined as FLOAT. You have been warned about this in Module 03 of your SQL course. Floating point arithmetic will introduce rounding errors on financial data. You add a comment: "Change to DECIMAL(12,2). FLOAT is wrong for money — you will eventually get 199.99999999998 instead of 200."

10:45 AM

Third problem — missing constraints

The status column has no CHECK constraint. Any string value can be inserted — including typos like "COMPLEATED" or "canceld". You suggest adding: CHECK (status IN ('pending', 'completed', 'failed', 'refunded')). You also notice transaction_date has no NOT NULL — a transaction must always have a date.

11:00 AM

You post the review

Three comments, all specific, all backed by reasons. Your team lead replies: "Great catches, fixing now." She tells your manager during standup that you flagged issues that would have caused real problems in three months when the user base grew. You have been at the company for two weeks.

🎯 Pro Tip

Knowing database design principles — not just SQL syntax — is what separates developers who write queries from developers who design systems. A PR review comment that catches a FLOAT money column or a missing NOT NULL constraint demonstrates understanding that takes most engineers years to develop. It comes from understanding the why, not just the how.

// Part 09

Interview Prep — 5 Questions With Complete Answers

Q: What is the difference between PRIMARY KEY and UNIQUE constraint?

Both PRIMARY KEY and UNIQUE prevent duplicate values in a column. The differences are three. First, a table can have only one PRIMARY KEY but multiple UNIQUE constraints — you could have unique constraints on both email and phone independently. Second, PRIMARY KEY columns implicitly have NOT NULL — they can never contain a null value. UNIQUE columns can contain NULL, and in most databases multiple NULLs are allowed in a UNIQUE column because NULL is not considered equal to NULL. Third, the PRIMARY KEY is the canonical identifier for the row and is the column other tables reference in their foreign keys. UNIQUE constraints enforce business rules about uniqueness but do not serve as the relational identifier.

In FreshCart's customers table: customer_id is the PRIMARY KEY — it is the row's identity that orders reference. email has a UNIQUE constraint — we want no two customers sharing the same email, but email is not the identifier we use in foreign keys.

Q: What is the difference between DELETE, TRUNCATE, and DROP?

All three remove data but at very different levels. DELETE removes specific rows from a table based on a WHERE condition. It is a transactional operation — it can be rolled back if inside a transaction. It fires triggers. It is slow on large tables because it logs each deleted row. DELETE without a WHERE clause removes all rows but is still transactional and logged.

TRUNCATE removes all rows from a table much faster than DELETE because it does not log individual row deletions — it deallocates the data pages directly. It cannot be filtered with WHERE. In most databases it cannot be rolled back (though PostgreSQL is an exception — TRUNCATE is transactional there). It resets auto-increment counters. It does not fire row-level triggers.

DROP removes the table itself — not just the data, but the entire structure, all its constraints, indexes, and any foreign keys referencing it. The table no longer exists after DROP. You need to recreate it from scratch to use it again. DROP is irreversible and should never be run in production without a backup strategy.

Q: What is referential integrity and how does a database enforce it?

Referential integrity is the guarantee that every foreign key value in a table corresponds to an existing primary key value in the referenced table — no row can reference a parent that does not exist. The database enforces this automatically when you declare a FOREIGN KEY constraint.

Enforcement happens at two moments. On INSERT or UPDATE: if you try to insert a row with a foreign key value that does not exist in the parent table, the database rejects the operation with an error. On DELETE or UPDATE of the parent: if you try to delete or change the primary key of a parent row that has child rows referencing it, the database either rejects the operation (RESTRICT) or handles it according to the ON DELETE behaviour you specified (CASCADE, SET NULL, SET DEFAULT).

Without referential integrity, orphaned records accumulate silently — orders pointing to deleted customers, items pointing to discontinued products. When you join these tables, the orphaned rows silently disappear from results with no error, causing data loss that is difficult to diagnose. Referential integrity prevents this class of problem entirely at the engine level.

Q: Why should you never use FLOAT for storing money values in a database?

FLOAT and DOUBLE are floating-point types that store numbers in binary representation. Most decimal fractions cannot be represented exactly in binary — the same way 1/3 cannot be written as a finite decimal. This means a value you insert as 199.99 might be stored and retrieved as 199.98999999999999. For arbitrary calculations this imprecision is acceptable. For financial data, it is catastrophic.

The consequences compound. If you sum 10,000 transactions each worth ₹199.99, the result might be ₹1,999,899.98 instead of ₹1,999,900. The discrepancy is small on each row but accumulates across millions of transactions. Accounting reconciliation fails. Regulatory audits find inconsistencies. Customers are charged or refunded wrong amounts.

The correct type for money is DECIMAL(precision, scale) — also called NUMERIC. DECIMAL stores values as exact decimal digits, with no binary conversion. DECIMAL(10,2) stores values from -99,999,999.99 to 99,999,999.99 with perfect precision. This is the universal standard for financial data across every production database. MySQL, PostgreSQL, Oracle, and SQL Server all support it identically.

Q: What is a self-referencing foreign key? Give a real example.

A self-referencing foreign key is a foreign key in a table that references the primary key of the same table. It is used to represent hierarchical relationships where entities of the same type relate to each other — most commonly parent-child or manager-employee relationships.

In FreshCart's employees table, each employee has an employee_id (primary key) and a manager_id (foreign key). The manager_id column references employees.employee_id — a manager is also an employee. Priya Sharma is the Store Manager with employee_id = 1 and manager_id = NULL (she reports to nobody in this dataset). Rahul Verma is her Assistant Manager with employee_id = 2 and manager_id = 1 — he reports to Priya. This single foreign key declaration captures the entire org chart with no additional tables.

Self-referencing foreign keys appear in: organisational hierarchies (employees and managers), geographic hierarchies (countries containing states containing cities), category trees (a category that has a parent_category_id pointing to another category), and comment threads (a comment with a parent_comment_id pointing to the comment it replies to). You query this structure using a SELF JOIN or, for arbitrary depth, a recursive CTE — covered in Module 34 and Module 56 respectively.

// Part 10

Errors You Will Hit — And Exactly Why They Happen

ERROR 1215 (HY000): Cannot add foreign key constraint

Cause: The foreign key column and the referenced column do not have exactly matching data types. If customer_id in orders is defined as INT but customer_id in customers is defined as INTEGER UNSIGNED, MySQL treats these as different types and refuses to create the foreign key. Column lengths must also match — a VARCHAR(10) cannot reference a VARCHAR(20). Additionally, the referenced column must be a PRIMARY KEY or have a UNIQUE constraint — you cannot create a foreign key pointing to a regular non-unique column.

Fix: Check that both columns have identical data types including signedness and length. Run SHOW CREATE TABLE customers; and SHOW CREATE TABLE orders; and compare the column definitions side by side. The referenced column must be a primary key or have a unique index. Also ensure the storage engine is InnoDB — MyISAM does not support foreign keys. In PostgreSQL the error is more descriptive: 'there is no unique constraint matching given keys for referenced table.'

ERROR 1062 (23000): Duplicate entry '1' for key 'PRIMARY'

Cause: You are trying to insert a row with a primary key value that already exists. This happens most often when: you are manually specifying primary key values in INSERT statements and accidentally repeat one, you are importing data that already has IDs and the auto-increment counter is not in sync with the existing data, or you are restoring a backup into a table that already has rows.

Fix: If you are specifying IDs manually: check which IDs already exist with SELECT MAX(customer_id) FROM customers; and use a higher value. If using AUTO_INCREMENT: do not specify the primary key column in your INSERT — let MySQL generate it. If you are importing data with existing IDs and the table is empty: use TRUNCATE TABLE first to reset the auto-increment counter, then import. If the table has data: run ALTER TABLE customers AUTO_INCREMENT = (SELECT MAX(customer_id) + 1 FROM customers); to sync the counter.

ERROR 3819 (HY000): Check constraint 'orders_chk_1' is violated

Cause: You are inserting or updating a value that violates a CHECK constraint on the table. For example, if order_status has a CHECK constraint allowing only 'Delivered', 'Processing', 'Cancelled', 'Returned' — and you try to insert 'Pending' (which is not in the list) — the database rejects it. This commonly happens when: you use a value that is valid in your application but was not included in the CHECK list, you have a typo in the value, or the CHECK constraint was defined more restrictively than the application requires.

Fix: Check the exact values the constraint allows: SHOW CREATE TABLE orders; in MySQL or \d orders in PostgreSQL. The constraint definition will be visible. Either change your insert value to match an allowed value, or alter the constraint to include the new valid value: ALTER TABLE orders DROP CONSTRAINT orders_chk_1; then re-add with the updated list. In MySQL 8.0+, CHECK constraints are enforced — in older MySQL versions they were parsed but silently ignored, so you might see this error when upgrading.

ERROR 1292 (22007): Incorrect date value: '2024-31-01' for column 'order_date'

Cause: The date value is in the wrong format or contains an invalid date. MySQL expects dates as 'YYYY-MM-DD'. The value '2024-31-01' is DD-MM-YYYY format — but MySQL interpreted the 31 as the month, which does not exist. Similarly '2024-02-30' fails because February never has 30 days. Dates imported from spreadsheets frequently have this problem because Excel stores dates in local format.

Fix: Always use ISO 8601 format: 'YYYY-MM-DD'. To convert from other formats during import, use STR_TO_DATE() in MySQL: STR_TO_DATE('31/01/2024', '%d/%m/%Y') returns '2024-01-31'. In PostgreSQL use TO_DATE('31/01/2024', 'DD/MM/YYYY'). When accepting date input from users in applications, always validate and convert to YYYY-MM-DD before passing to SQL — never trust that a user-entered date is in the format the database expects.

Column 'customer_id' cannot be null (errno: 1048)

Cause: You are inserting a row without providing a value for a NOT NULL column and no DEFAULT is defined. This commonly happens when: you build a dynamic INSERT statement and the variable for that column is undefined or empty, you are inserting with column names listed but omit a required column, or you are copying data between tables and the source column is NULL while the destination has NOT NULL.

Fix: Ensure every NOT NULL column is included in your INSERT with a valid non-null value. If you are building INSERT statements dynamically in code, add a validation step that checks all required fields are populated before executing. If you cannot always guarantee a value, add a DEFAULT to the column definition: ALTER TABLE orders MODIFY customer_id INTEGER NOT NULL DEFAULT 0; — though a DEFAULT of 0 for a foreign key is only appropriate if customer 0 exists as a sentinel value. Better to fix the application logic to always provide the required value.

Try It Yourself

Look at the FreshCart employees table. Employee ID 1 (Priya Sharma) has manager_id = NULL. Employee ID 2 (Rahul Verma) has manager_id = 1. Employee ID 3 (Sunita Kapoor) has manager_id = 2. Write a query that shows each employee's name alongside their manager's name. What happens to Priya's row since she has no manager?

🎯 Key Takeaways

✓When you run a query, six components work in sequence: Parser (validates syntax), Query Planner (chooses the best execution plan), Executor (runs the plan), Buffer Pool (serves from memory), Storage Engine (reads disk), Transaction Manager (enforces ACID).
✓The single most important table design rule: one table stores exactly one type of entity. Mixing customer and order data in one table causes redundancy that leads to inconsistency when data changes.
✓Every column must store one atomic piece of data. An address column storing "12 Koramangala, Seattle, 560034" is bad design — split into street, city, state, zip_code.
✓Use DECIMAL(precision, scale) for all money columns. FLOAT and DOUBLE introduce binary rounding errors that accumulate into accounting discrepancies on financial data.
✓PRIMARY KEY automatically enforces uniqueness and NOT NULL, and creates a B-tree index. Every table needs one. Use auto-increment integers in almost all cases.
✓UNIQUE allows NULL values (multiple NULLs permitted). PRIMARY KEY does not allow NULL. A table can have one PRIMARY KEY and multiple UNIQUE constraints.
✓Foreign keys enforce referential integrity — you cannot insert a child row referencing a non-existent parent, and you cannot delete a parent with existing children (by default). Use ON DELETE CASCADE only when child rows have no meaning without their parent.
✓Constraints (NOT NULL, UNIQUE, DEFAULT, CHECK) enforce data quality at the engine level — they work regardless of which application, script, or person inserts data. Application-level validation can be bypassed; database constraints cannot.
✓A self-referencing foreign key (manager_id referencing employee_id in the same table) is how hierarchies — org charts, category trees, comment threads — are stored in relational databases.
✓DROP removes the entire table structure. TRUNCATE removes all rows fast but cannot be filtered by WHERE. DELETE removes specific rows, is transactional, and can be rolled back.

What comes next

In Module 03, you learn about the three types of databases — Relational, NoSQL, and NewSQL — what problems each one was built to solve, and how to decide which one is right for any situation. This is the context that makes every tool decision in your career make sense.

Module 03 → Types of Databases

Discussion

Have a better approach? Found something outdated? Share it — your knowledge helps everyone learning here.

Continue with GitHub