Entity-Relationship (ER) Model
The architectural blueprint of database design — how to model any real-world system as entities, attributes, and relationships before writing a single line of SQL.
Why the ER Model Exists — The Gap It Was Built to Fill
By 1976, the relational model had been defined by Codd (1970) and SQL was being developed at IBM. The theory of how to store data was solid. But there was a painful gap: how do you design the schema in the first place? Given a real-world problem — build a database for a university, a hospital, a bank — how do you systematically identify what tables to create, what columns they should have, and how they should connect?
There was no methodology. Developers would stare at requirements documents and try to intuit table structures. Different engineers on the same team would design incompatible schemas for the same problem. Stakeholders — who understood the business domain but not relational algebra — could not validate whether a schema correctly captured the real-world rules they cared about.
In 1976, Peter Pin-Shan Chen published "The Entity-Relationship Model — Toward a Unified View of Data" in ACM Transactions on Database Systems. His contribution was not a new storage model — it was a design methodology with a visual notation. The ER model gave database designers a systematic way to:
Identify every significant thing (entity), every important property (attribute), and every meaningful connection (relationship) in the domain being modelled.
ER diagrams are visual — a domain expert who has never seen SQL can look at an ER diagram and confirm whether it correctly captures the rules of their domain.
Mistakes discovered on paper cost nothing. Mistakes discovered after 50,000 lines of application code have been written against a wrong schema cost weeks of re-engineering.
The ER model has a defined, algorithmic mapping to relational tables. Once you have a correct ER diagram, the table design largely derives mechanically.
The ER model is a conceptual model — it operates at the level of the problem domain, not the level of database implementation. It describes the world in terms of things and their relationships. It has nothing to say about tables, columns, indexes, or SQL. Those concerns come later — after the conceptual model is complete and validated. This separation of concerns is what makes the ER model so powerful: you solve the domain problem first, then solve the implementation problem.
The Miniworld — What You Are Actually Modelling
When designing a database, you are not modelling the entire world — you are modelling a specific, bounded portion of reality called the miniworld (or universe of discourse). The miniworld is defined by the purposes the database must serve and the questions it must be able to answer.
Consider designing a database for a hospital. The real world contains patients, doctors, nurses, administrators, suppliers, insurance companies, medications, equipment, buildings, parking lots, and thousands of other entities. But the hospital's database miniworld might be narrowly defined as: patients, their medical records, doctors who treat them, the wards they are admitted to, and the medications they are prescribed. Everything outside this boundary is excluded from the model — not because it doesn't exist, but because the database doesn't need to know about it.
Defining the miniworld precisely is the first and most critical step in database design. Scope creep in the miniworld definition is the most common cause of over-engineered, bloated schemas that are expensive to build and maintain.
Entities — Modelling the Things That Exist
An entity is any object or thing in the real world that has an independent existence and about which the database needs to store information. Entities are the nouns of your data model. Every table in your final relational schema will correspond to either an entity or a relationship in your ER diagram.
The critical word is independent. An entity can exist on its own. A customer exists regardless of whether they have placed an order. A professor exists regardless of whether they are currently teaching a course. A product exists regardless of whether anyone has bought it. This independent existence is what distinguishes an entity from an attribute.
Entity Type vs Entity Instance — The Distinction That Matters
The abstract category — the definition or template. It describes the kind of thing, not any specific instance. In ER diagrams, entity types are drawn as rectangles. In the final relational schema, an entity type becomes a table.
A specific, concrete occurrence of an entity type. A particular customer, a particular product, a particular order. In the final relational schema, an entity instance becomes a row in a table.
Strong Entities — The Independent Building Blocks
A strong entity (also called a regular entity) is one that has its own unique identifier — an attribute or combination of attributes whose value is unique across all instances of that entity type and is never null. A strong entity can be uniquely identified without reference to any other entity. In ER diagrams, strong entities are drawn with a single-line rectangle.
Each customer has a unique ID assigned at registration. No two customers share an ID. The ID is never null.
Each product has a unique product code. Products exist independently of any orders.
Each employee has a unique employee number assigned at hiring. Exists independently of assignments.
A department exists independently of whether it currently has any employees.
Weak Entities — The Dependents
A weak entity is one that cannot be uniquely identified by its own attributes alone — it depends on a related strong entity (called its owner entity or identifying entity) for its identity. A weak entity's existence is contingent on the existence of its owner — if the owner is deleted, the weak entity must also be deleted. In ER diagrams, weak entities are drawn with a double-line rectangle.
The attribute that partially identifies a weak entity (within the context of its owner) is called the partial keyor discriminator. A partial key is unique only within the set of weak entity instances associated with one specific owner instance — not globally unique across the entire entity set.
An order item (line item) exists only within an order. Line number 1 in Order ORD-001 is completely different from Line number 1 in Order ORD-002. The line_number is not globally unique — it is only meaningful within a specific order. If Order ORD-001 is cancelled and deleted, all its line items must also be deleted.
Employee dependents (family members enrolled in company benefits) exist only in the context of the employee. An employee's dependent named "Priya" is identified by the combination of employee ID + dependent name. If the employee leaves the company and is deleted, their dependents are also removed.
In many banking systems, account numbers are only unique within a branch — the same account number can exist in two different branches. The account is identified by the combination of branch_code + account_number. This is a classic textbook example because it reveals how institutional identifiers are often not globally unique.
Attributes — Every Type, Every Implication
An attribute is a property or characteristic that describes an entity type or a relationship type. Attributes represent the data we want to store about entities. In the final relational schema, most attributes become columns. But not all attributes translate to columns in the same way — different attribute types have different implementation implications.
Understanding attribute types is not just academic — each type has a specific consequence for how the database is designed and how queries are written. Getting this wrong leads to either over-complicated designs or designs that cannot answer required questions.
Relationships — How Entities Connect to Each Other
A relationship type is a meaningful association between two or more entity types. Relationships are the most powerful part of the ER model — they capture the semantic connections that give data its meaning and context. A customer without their orders is just a name. The relationship "PLACES" between CUSTOMER and ORDER is what makes the data a business.
Like entity types, relationship types have relationship instances — specific associations between specific entity instances. "Rahul Sharma placed Order ORD-4521" is a relationship instance. "Prof. Kumar teaches CS301" is a relationship instance. In ER diagrams, relationship types are drawn as diamonds.
Degree of a Relationship — How Many Entity Types Participate
A relationship between instances of the SAME entity type. One entity instance relates to another instance of the same type.
A relationship between instances of TWO different entity types. The vast majority of real-world relationships are binary.
A relationship involving THREE different entity types simultaneously. A ternary relationship captures a fact that cannot be correctly decomposed into binary relationships without losing information.
Relationship Attributes — Properties of the Connection Itself
Relationships can have their own attributes — properties that belong to the relationship instance, not to either participating entity. This is a subtle but important concept. The date an employee was assigned to a project is not a property of the employee (an employee has many assignments with different dates), nor is it a property of the project (a project has many employee assignments with different dates). It is a property of the specific EMPLOYEE-PROJECT assignment relationship instance.
-- EMPLOYEE "works on" PROJECT relationship
-- Relationship attributes: start_date, hours_per_week, role
-- WRONG: Storing relationship attribute in EMPLOYEE table
employees (employee_id, name, project_id, project_start_date)
-- Fails when an employee works on multiple projects (which is common)
-- WRONG: Storing relationship attribute in PROJECT table
projects (project_id, name, employee_id, employee_start_date)
-- Fails when a project has multiple employees (which is always)
-- CORRECT: Junction table captures the M:N relationship AND its attributes
CREATE TABLE employee_project_assignments (
employee_id INT NOT NULL,
project_id INT NOT NULL,
-- Relationship attributes:
start_date DATE NOT NULL,
end_date DATE, -- NULL if still active
hours_per_week DECIMAL(4,1) DEFAULT 40.0,
role VARCHAR(50), -- lead, developer, reviewer, consultant
PRIMARY KEY (employee_id, project_id), -- or add start_date if re-assignments allowed
FOREIGN KEY (employee_id) REFERENCES employees(employee_id),
FOREIGN KEY (project_id) REFERENCES projects(project_id)
);
-- Now: one employee on many projects, one project with many employees
-- AND the relationship-specific data (start_date, role) lives in the right place
-- STUDENT "enrolled in" COURSE with relationship attribute: enrollment_date, grade
CREATE TABLE enrollments (
student_id INT NOT NULL,
course_id INT NOT NULL,
enrollment_date DATE NOT NULL,
grade CHAR(2), -- A, B+, B, C, F — NULL until assigned
attendance_pct DECIMAL(5,2),
PRIMARY KEY (student_id, course_id),
FOREIGN KEY (student_id) REFERENCES students(student_id),
FOREIGN KEY (course_id) REFERENCES courses(course_id)
);Cardinality Constraints — The Most Critical Design Decision
Cardinality specifies the maximum number of relationship instances that an entity instance can participate in. It answers the question: for one instance of entity A, how many instances of entity B can it be associated with — and vice versa?
Cardinality is arguably the most important single decision in ER design because it directly determines the table structure in the relational schema — specifically, whether to use a foreign key column or a separate junction table. Getting cardinality wrong creates schemas that cannot represent real-world data or that represent it with severe redundancy.
Participation Constraints — Total vs Partial
Cardinality tells you the maximum number of relationship instances. Participation constraints tell you the minimum — whether participation in a relationship is mandatory or optional.
Every instance of the entity MUST participate in at least one relationship instance. No entity instance is allowed to exist without being associated with the other entity.
Represented by a double line in ER diagrams. Implemented as a NOT NULL constraint or a mandatory foreign key in the relational schema.
An order cannot exist without a customer.
→ customer_id NOT NULL in orders table
Some instances of the entity may NOT participate in any relationship instance. The relationship is optional — an entity can exist without being associated with the other entity.
Represented by a single line in ER diagrams. Implemented by allowing NULL in the foreign key column.
A customer can exist without any orders.
→ customers without orders is valid
-- Relationship: EMPLOYEE "works in" DEPARTMENT
-- Total participation on EMPLOYEE side: every employee must be in a department
-- Partial participation on DEPARTMENT side: a department can exist with no employees (new dept)
CREATE TABLE departments (
dept_id INT PRIMARY KEY,
dept_name VARCHAR(100) UNIQUE NOT NULL,
budget DECIMAL(15,2)
);
CREATE TABLE employees (
employee_id INT PRIMARY KEY,
name VARCHAR(100) NOT NULL,
dept_id INT NOT NULL, -- NOT NULL = total participation!
-- Every employee MUST be in exactly one department
FOREIGN KEY (dept_id) REFERENCES departments(dept_id)
-- ON DELETE RESTRICT: cannot delete a department that has employees
-- This enforces the total participation from the other direction too
);
-- Relationship: CUSTOMER "has" LOYALTY_PROFILE (optional — partial participation)
CREATE TABLE loyalty_profiles (
profile_id INT PRIMARY KEY,
customer_id INT UNIQUE, -- UNIQUE but NOT NOT NULL → profile is optional
points INT DEFAULT 0,
FOREIGN KEY (customer_id) REFERENCES customers(customer_id)
);
-- customer_id is UNIQUE (enforces 1:1 max) but not NOT NULL (profile is optional)Extended ER Model — Generalisation, Specialisation, and Aggregation
The basic ER model (entities, attributes, relationships, cardinality) is sufficient for modelling many real-world systems. But complex domains — particularly those with entity hierarchies, inheritance, or relationships involving other relationships — require extended ER (EER) concepts. These concepts were developed in the 1980s to bring object-oriented ideas (inheritance, abstraction) into the entity-relationship framework.
Generalisation — Bottom-Up Abstraction
Generalisation is a bottom-up design process: you start with specific entity types and identify common attributes among them, extracting those common attributes into a higher-level, more general entity type. The specific entities become specialised subtypes of the general entity.
This is conceptually identical to inheritance in object-oriented programming — you are extracting a superclass (the generalised entity) from existing classes (the specific entities). The design process moves from specific to general.
// Starting point: two separate entity types that share many attributes
EMPLOYEE entity: employee_id, name, salary, hire_date, office_location
CONTRACTOR entity: contractor_id, name, hourly_rate, contract_end_date, agency
// Observation: both have name, both represent people working for the company
// Generalisation: extract common attributes into a WORKER supertype
WORKER (supertype): worker_id, name, contact_email
|
├── EMPLOYEE (subtype): salary, hire_date, office_location, benefit_plan
└── CONTRACTOR (subtype): hourly_rate, contract_end_date, agency_name
// The generalisation captures:
// - What all workers have in common (WORKER attributes)
// - What distinguishes employees from contractors (subtype-specific attributes)
// - The IS-A relationship: an EMPLOYEE IS-A WORKER, a CONTRACTOR IS-A WORKERSpecialisation — Top-Down Decomposition
Specialisation is the reverse: a top-down design process. You start with a general entity type and define distinct subtypes based on differences in attributes or relationships. Different subtypes may have different attributes that are not applicable to the general type or to other subtypes.
A vehicle might be specialised into car and truck — a car has passenger_capacity while a truck has payload_tonnes. These are not attributes of all vehicles, only of specific subtypes.
// Starting point: a general ACCOUNT entity
ACCOUNT (supertype): account_number, balance, owner_id, opened_date
// Specialisation based on account type:
ACCOUNT
|
├── SAVINGS_ACCOUNT: interest_rate, minimum_balance, withdrawal_limit_per_month
├── CURRENT_ACCOUNT: overdraft_limit, transaction_charges
└── FIXED_DEPOSIT: deposit_term_months, maturity_date, lock_in_amount
// VEHICLE → CAR, TRUCK, MOTORCYCLE:
VEHICLE: vehicle_id, registration_no, manufacturer, model_year, colour
|
├── CAR: passenger_capacity, fuel_type, transmission, airbag_count
├── TRUCK: payload_tonnes, num_axles, cargo_type
└── MOTORCYCLE: engine_cc, sidecar_attached
// Specialisation constraints:
// DISJOINT: an entity can belong to only ONE subtype (a car cannot also be a truck)
// OVERLAPPING: an entity can belong to MULTIPLE subtypes simultaneously
// (a PERSON can be both an EMPLOYEE and a STUDENT simultaneously)
// TOTAL: every supertype instance must belong to at least one subtype
// (every ACCOUNT must be a savings, current, or fixed deposit)
// PARTIAL: some supertype instances may not belong to any subtype
// (some VEHICLES may not be specialised yet — their type is unknown)Implementing Generalisation/Specialisation in Relational Schema
The ER model has a clean notation for inheritance. The relational model does not natively support inheritance — so you must choose an implementation strategy. There are three options, each with trade-offs.
Store all subtypes in one table, with a discriminator column indicating the subtype, and nullable columns for subtype-specific attributes.
CREATE TABLE workers (
worker_id INT PRIMARY KEY,
name VARCHAR(100) NOT NULL,
worker_type VARCHAR(20) NOT NULL, -- 'employee' or 'contractor'
-- Employee-specific (NULL for contractors):
salary DECIMAL(10,2),
hire_date DATE,
benefit_plan VARCHAR(50),
-- Contractor-specific (NULL for employees):
hourly_rate DECIMAL(8,2),
contract_end_date DATE,
agency_name VARCHAR(100),
CHECK (
(worker_type = 'employee' AND salary IS NOT NULL AND hourly_rate IS NULL)
OR
(worker_type = 'contractor' AND hourly_rate IS NOT NULL AND salary IS NULL)
)
);Create one table for the supertype (common attributes) and one separate table for each subtype (subtype-specific attributes only). Subtypes share the same PK as the supertype.
-- Supertype table: common attributes
CREATE TABLE workers (
worker_id INT PRIMARY KEY,
name VARCHAR(100) NOT NULL,
email VARCHAR(150) UNIQUE NOT NULL,
worker_type VARCHAR(20) NOT NULL -- for fast subtype identification
);
-- Employee subtype table: employee-specific attributes only
CREATE TABLE employees (
worker_id INT PRIMARY KEY, -- same PK as workers
salary DECIMAL(10,2) NOT NULL,
hire_date DATE NOT NULL,
benefit_plan VARCHAR(50),
FOREIGN KEY (worker_id) REFERENCES workers(worker_id) ON DELETE CASCADE
);
-- Contractor subtype table
CREATE TABLE contractors (
worker_id INT PRIMARY KEY,
hourly_rate DECIMAL(8,2) NOT NULL,
contract_end_date DATE NOT NULL,
agency_name VARCHAR(100),
FOREIGN KEY (worker_id) REFERENCES workers(worker_id) ON DELETE CASCADE
);
-- Get full employee record:
SELECT w.name, w.email, e.salary, e.hire_date
FROM workers w JOIN employees e ON w.worker_id = e.worker_id
WHERE w.worker_id = 1001;
-- Get all workers (polymorphic):
SELECT w.name, 'Employee' as type, e.salary::TEXT as key_field
FROM workers w JOIN employees e ON w.worker_id = e.worker_id
UNION ALL
SELECT w.name, 'Contractor', c.hourly_rate::TEXT
FROM workers w JOIN contractors c ON w.worker_id = c.worker_id;Create a separate, fully independent table for each subtype — repeating the supertype attributes in each subtype table. No supertype table exists.
-- No workers table — each subtype is fully self-contained
CREATE TABLE employees (
employee_id INT PRIMARY KEY,
name VARCHAR(100) NOT NULL, -- repeated from supertype
email VARCHAR(150) UNIQUE NOT NULL, -- repeated
salary DECIMAL(10,2) NOT NULL,
hire_date DATE NOT NULL,
benefit_plan VARCHAR(50)
);
CREATE TABLE contractors (
contractor_id INT PRIMARY KEY,
name VARCHAR(100) NOT NULL, -- same column, different table
email VARCHAR(150) UNIQUE NOT NULL,
hourly_rate DECIMAL(8,2) NOT NULL,
contract_end_date DATE NOT NULL,
agency_name VARCHAR(100)
);
-- Simple for individual subtype queries.
-- Messy for cross-subtype operations.Aggregation — Relationships Involving Relationships
Aggregation addresses a specific limitation of the basic ER model: you cannot directly have a relationship between an entity type and a relationship type. But some real-world scenarios require exactly this. Aggregation treats a relationship type and the entity types it connects as a higher-level abstract entity — allowing another relationship to connect to this aggregated unit.
// Scenario: Track which MANAGER approved which EMPLOYEE-PROJECT ASSIGNMENT
//
// We have:
// EMPLOYEE "works on" PROJECT (a relationship)
// MANAGER "approves" ???
//
// We want to say: a MANAGER approves a specific EMPLOYEE-PROJECT assignment
// But we can't directly have MANAGER relating to the "works on" relationship.
//
// Solution: Treat the EMPLOYEE-WORKS_ON-PROJECT aggregate as one abstract entity
// Then: MANAGER "approves" [EMPLOYEE-WORKS_ON-PROJECT]
-- In the relational schema, aggregation becomes:
CREATE TABLE employee_project_assignments (
employee_id INT NOT NULL,
project_id INT NOT NULL,
start_date DATE NOT NULL,
hours_per_week DECIMAL(4,1),
PRIMARY KEY (employee_id, project_id),
FOREIGN KEY (employee_id) REFERENCES employees(employee_id),
FOREIGN KEY (project_id) REFERENCES projects(project_id)
);
-- The "approval" relationship connects a MANAGER to an ASSIGNMENT:
CREATE TABLE assignment_approvals (
employee_id INT NOT NULL,
project_id INT NOT NULL,
manager_id INT NOT NULL,
approved_date DATE NOT NULL,
approval_note TEXT,
PRIMARY KEY (employee_id, project_id), -- one approval per assignment
FOREIGN KEY (employee_id, project_id)
REFERENCES employee_project_assignments(employee_id, project_id),
FOREIGN KEY (manager_id) REFERENCES employees(employee_id)
);
-- The composite FK (employee_id, project_id) points to the relationship table
-- This is the relational implementation of aggregationComplete ER-to-Relational Mapping — Every Rule with Examples
Once an ER diagram is complete and validated, converting it to a relational schema follows a systematic set of mapping rules. These rules are algorithmic — given a correct ER diagram, the relational schema can be derived mechanically. Understanding these rules deeply is what separates engineers who can design databases from those who just write SQL.
Complete Worked Example — Designing a Hospital Database
The best way to consolidate ER modelling is to work through a complete, realistic example from requirements to final SQL schema. We will design a hospital database management system — a domain complex enough to showcase every ER concept but familiar enough to evaluate our design decisions intuitively.
Step 1 — Define the Miniworld and Requirements
Step 2 — Identify Entities, Attributes, and Relationships
ENTITY TYPES IDENTIFIED:
WARD — Strong entity. Key: ward_id
DOCTOR — Strong entity. Key: doctor_id. Multi-valued: qualifications
PATIENT — Strong entity. Key: patient_id
MEDICATION — Strong entity. Key: medication_id
BED_ASSIGNMENT — Weak entity. Owner: WARD + PATIENT. Partial key: admission_date
ATTRIBUTES:
WARD: ward_id(key), ward_name, wing, floor_number, capacity
DOCTOR: doctor_id(key), full_name, specialisation, phone, {qualifications}(multi-valued)
PATIENT: patient_id(key), full_name, date_of_birth, age(derived), blood_group,
emergency_contact_name, emergency_contact_phone
// Note: emergency contact is composite → separate columns
MEDICATION: medication_id(key), med_name, manufacturer, unit_cost, stock_quantity
RELATIONSHIP TYPES:
ADMITTED_TO: PATIENT (N) ↔ WARD (1)
→ 1:N. One ward has many admitted patients.
→ Each patient is in at most one ward.
→ Relationship attribute: admission_date, expected_discharge_date
TREATS: DOCTOR (M) ↔ PATIENT (N)
→ M:N. Many doctors treat each patient, each doctor treats many patients.
→ Relationship attribute: treatment_start_date
PRESCRIBED: DOCTOR (M) ↔ PATIENT (N) through MEDICATION (connected three-way)
→ This is a TERNARY relationship involving DOCTOR, PATIENT, MEDICATION
→ Relationship attributes: dosage, frequency, start_date, end_dateStep 3 — Apply Mapping Rules → Complete SQL Schema
-- ─────────────────────────────────────────────
-- Rule 1: Strong entity types → tables
-- ─────────────────────────────────────────────
CREATE TABLE wards (
ward_id SERIAL PRIMARY KEY,
ward_name VARCHAR(100) UNIQUE NOT NULL,
wing VARCHAR(10) NOT NULL, -- 'A', 'B', 'C', 'North', 'South'
floor_number INT NOT NULL,
capacity INT NOT NULL CHECK (capacity > 0)
);
CREATE TABLE doctors (
doctor_id SERIAL PRIMARY KEY,
full_name VARCHAR(150) NOT NULL,
specialisation VARCHAR(100) NOT NULL,
phone VARCHAR(20) UNIQUE NOT NULL
-- qualifications is multi-valued → separate table (Rule 6)
-- age is derived → not stored, computed from date_of_birth if relevant
);
CREATE TABLE patients (
patient_id SERIAL PRIMARY KEY,
full_name VARCHAR(150) NOT NULL,
date_of_birth DATE NOT NULL,
-- age is derived: EXTRACT(YEAR FROM AGE(CURRENT_DATE, date_of_birth))
blood_group CHAR(3) CHECK (blood_group IN ('A+','A-','B+','B-','AB+','AB-','O+','O-')),
-- emergency_contact is composite: stored as separate columns
emergency_contact_name VARCHAR(150),
emergency_contact_phone VARCHAR(20)
);
CREATE TABLE medications (
medication_id SERIAL PRIMARY KEY,
med_name VARCHAR(200) NOT NULL,
manufacturer VARCHAR(150),
unit_cost DECIMAL(10,2) NOT NULL CHECK (unit_cost >= 0),
stock_quantity INT NOT NULL DEFAULT 0 CHECK (stock_quantity >= 0)
);
-- ─────────────────────────────────────────────
-- Rule 6: Multi-valued attribute → separate table
-- ─────────────────────────────────────────────
CREATE TABLE doctor_qualifications (
doctor_id INT NOT NULL,
degree VARCHAR(100) NOT NULL, -- multi-valued attribute
institution VARCHAR(200),
year_awarded INT,
PRIMARY KEY (doctor_id, degree),
FOREIGN KEY (doctor_id) REFERENCES doctors(doctor_id) ON DELETE CASCADE
);
-- ─────────────────────────────────────────────
-- Rule 4: 1:N relationship (WARD → PATIENT admission)
-- FK goes in PATIENT table (N side) + relationship attributes
-- ─────────────────────────────────────────────
-- Current ward assignment stored directly in patients table
-- (a patient is in ONE ward at a time)
ALTER TABLE patients
ADD COLUMN current_ward_id INT, -- NULL if not currently admitted
ADD COLUMN admission_date DATE,
ADD COLUMN expected_discharge DATE,
ADD CONSTRAINT fk_ward
FOREIGN KEY (current_ward_id) REFERENCES wards(ward_id);
-- ─────────────────────────────────────────────
-- Rule 2: Weak entity (BED_ASSIGNMENT) for historical records
-- Owner: WARD + PATIENT. Partial key: admission_date
-- ─────────────────────────────────────────────
CREATE TABLE bed_assignments (
patient_id INT NOT NULL,
ward_id INT NOT NULL,
admission_date DATE NOT NULL, -- partial key (discriminator)
discharge_date DATE, -- NULL if still admitted
bed_number INT,
PRIMARY KEY (patient_id, ward_id, admission_date),
FOREIGN KEY (patient_id) REFERENCES patients(patient_id) ON DELETE CASCADE,
FOREIGN KEY (ward_id) REFERENCES wards(ward_id) ON DELETE RESTRICT
);
-- ─────────────────────────────────────────────
-- Rule 5: M:N relationship (DOCTOR TREATS PATIENT)
-- Junction table with relationship attribute
-- ─────────────────────────────────────────────
CREATE TABLE treatments (
doctor_id INT NOT NULL,
patient_id INT NOT NULL,
treatment_start_date DATE NOT NULL DEFAULT CURRENT_DATE,
treatment_end_date DATE, -- NULL if ongoing
is_primary_doctor BOOLEAN DEFAULT false,
notes TEXT,
PRIMARY KEY (doctor_id, patient_id),
-- Note: if same doctor-patient pair can have multiple treatment episodes,
-- add treatment_start_date to the PK:
-- PRIMARY KEY (doctor_id, patient_id, treatment_start_date)
FOREIGN KEY (doctor_id) REFERENCES doctors(doctor_id) ON DELETE RESTRICT,
FOREIGN KEY (patient_id) REFERENCES patients(patient_id) ON DELETE CASCADE
);
-- ─────────────────────────────────────────────
-- Rule 7: Ternary relationship (DOCTOR prescribes MEDICATION to PATIENT)
-- Junction table with THREE foreign keys + relationship attributes
-- ─────────────────────────────────────────────
CREATE TABLE prescriptions (
prescription_id SERIAL PRIMARY KEY, -- surrogate PK for this complex table
doctor_id INT NOT NULL,
patient_id INT NOT NULL,
medication_id INT NOT NULL,
-- Relationship attributes:
dosage VARCHAR(100) NOT NULL, -- "500mg", "10ml"
frequency VARCHAR(100) NOT NULL, -- "twice daily", "every 8 hours"
start_date DATE NOT NULL DEFAULT CURRENT_DATE,
end_date DATE, -- NULL if ongoing/indefinite
instructions TEXT,
FOREIGN KEY (doctor_id) REFERENCES doctors(doctor_id) ON DELETE RESTRICT,
FOREIGN KEY (patient_id) REFERENCES patients(patient_id) ON DELETE CASCADE,
FOREIGN KEY (medication_id) REFERENCES medications(medication_id) ON DELETE RESTRICT,
-- A doctor cannot prescribe the same medication to the same patient twice
-- on the same start date (use surrogate PK + this constraint):
UNIQUE (doctor_id, patient_id, medication_id, start_date)
);Step 4 — Validate the Schema Against Requirements
-- R1: "Which ward has the most patients currently admitted?"
SELECT w.ward_name, w.wing, w.capacity,
COUNT(p.patient_id) AS current_patients,
w.capacity - COUNT(p.patient_id) AS available_beds
FROM wards w
LEFT JOIN patients p ON w.ward_id = p.current_ward_id
GROUP BY w.ward_id, w.ward_name, w.wing, w.capacity
ORDER BY current_patients DESC;
-- R2: "What are Dr. Sharma's qualifications?"
SELECT d.full_name, dq.degree, dq.institution, dq.year_awarded
FROM doctors d
JOIN doctor_qualifications dq ON d.doctor_id = dq.doctor_id
WHERE d.full_name = 'Dr. Sharma';
-- R4: "Which patients are currently admitted to Ward 3A?"
SELECT p.full_name, p.blood_group, p.admission_date, p.expected_discharge
FROM patients p
JOIN wards w ON p.current_ward_id = w.ward_id
WHERE w.ward_name = '3A'
ORDER BY p.admission_date;
-- R5: "Which doctors are treating patient P-1001, and when did they start?"
SELECT d.full_name, d.specialisation, t.treatment_start_date, t.is_primary_doctor
FROM treatments t
JOIN doctors d ON t.doctor_id = d.doctor_id
WHERE t.patient_id = 1001
ORDER BY t.is_primary_doctor DESC, t.treatment_start_date;
-- R8: "Full prescription history for patient P-1001"
SELECT d.full_name AS prescribed_by, m.med_name, pr.dosage,
pr.frequency, pr.start_date, pr.end_date,
CASE WHEN pr.end_date IS NULL THEN 'Active' ELSE 'Completed' END AS status
FROM prescriptions pr
JOIN doctors d ON pr.doctor_id = d.doctor_id
JOIN medications m ON pr.medication_id = m.medication_id
WHERE pr.patient_id = 1001
ORDER BY pr.start_date DESC;Common ER Design Mistakes — And How to Fix Them
The Schema Design Interview — Design a Database for Zomato
The most common DBMS task in a product company interview is the schema design question: "Design the database for [familiar product]." These are evaluated on your ability to identify the right entities, define appropriate relationships, choose correct cardinalities, and justify your decisions. Here is a complete walkthrough of how a senior engineer approaches this question.
Interviewer: "Design the core database schema for a food delivery platform like Zomato. You have 20 minutes."
Correct approach — think aloud, structure your answer:
Core entities I can immediately identify:
USER — customers ordering food
RESTAURANT — businesses listed on the platform
MENU_ITEM — individual dishes/products restaurants sell
ORDER — a user's food order to a restaurant
ORDER_ITEM — individual dishes within an order (weak entity of ORDER)
DELIVERY_AGENT — person delivering the order
ADDRESS — delivery addresses (users can have multiple)
REVIEW — user reviews of restaurants or delivery
Extended entities (depends on scope):
CATEGORY — cuisine types (Indian, Chinese, Italian)
COUPON — promotional discount codes
PAYMENT — payment transaction detailsUSER → ADDRESS: 1:N (one user has many saved addresses)
USER → ORDER: 1:N (one user places many orders)
RESTAURANT → MENU_ITEM: 1:N (one restaurant has many menu items)
ORDER → RESTAURANT: N:1 (many orders to one restaurant per order)
ORDER → ORDER_ITEM: 1:N (one order has many order items — weak entity!)
ORDER_ITEM → MENU_ITEM: N:1 (many order items reference same menu item)
ORDER → DELIVERY_AGENT: N:1 (many orders delivered by agents over time)
USER → RESTAURANT (REVIEW): M:N (a user can review many restaurants, a restaurant has many reviews)
MENU_ITEM → CATEGORY: M:N (a dish can belong to multiple categories: Biryani is Indian AND Rice)CREATE TABLE users (
user_id SERIAL PRIMARY KEY,
name VARCHAR(100) NOT NULL,
phone VARCHAR(20) UNIQUE NOT NULL,
email VARCHAR(150) UNIQUE NOT NULL,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE addresses (
address_id SERIAL PRIMARY KEY,
user_id INT NOT NULL, -- FK on N side (1:N)
label VARCHAR(50) DEFAULT 'Home', -- Home, Work, Other
street VARCHAR(200) NOT NULL,
city VARCHAR(100) NOT NULL,
pincode CHAR(6) NOT NULL,
latitude DECIMAL(10,8),
longitude DECIMAL(11,8),
is_default BOOLEAN DEFAULT false,
FOREIGN KEY (user_id) REFERENCES users(user_id) ON DELETE CASCADE
);
CREATE TABLE restaurants (
restaurant_id SERIAL PRIMARY KEY,
name VARCHAR(200) NOT NULL,
phone VARCHAR(20),
address TEXT NOT NULL,
city VARCHAR(100) NOT NULL,
latitude DECIMAL(10,8),
longitude DECIMAL(11,8),
avg_rating DECIMAL(3,2) DEFAULT 0, -- derived, maintained by trigger
is_open BOOLEAN DEFAULT true,
opens_at TIME,
closes_at TIME
);
CREATE TABLE menu_items (
item_id SERIAL PRIMARY KEY,
restaurant_id INT NOT NULL, -- FK on N side (1:N)
name VARCHAR(200) NOT NULL,
description TEXT,
price DECIMAL(10,2) NOT NULL CHECK (price >= 0),
is_veg BOOLEAN NOT NULL DEFAULT false,
is_available BOOLEAN DEFAULT true,
FOREIGN KEY (restaurant_id) REFERENCES restaurants(restaurant_id) ON DELETE CASCADE
);
CREATE TABLE delivery_agents (
agent_id SERIAL PRIMARY KEY,
name VARCHAR(100) NOT NULL,
phone VARCHAR(20) UNIQUE NOT NULL,
vehicle VARCHAR(50), -- 'bicycle', 'scooter', 'motorcycle'
is_active BOOLEAN DEFAULT true
);
CREATE TABLE orders (
order_id SERIAL PRIMARY KEY,
user_id INT NOT NULL,
restaurant_id INT NOT NULL,
delivery_agent_id INT, -- assigned after order accepted; NULL initially
delivery_address_id INT NOT NULL,
status VARCHAR(30) DEFAULT 'pending',
-- pending → confirmed → preparing → picked_up → delivered | cancelled
placed_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
estimated_delivery TIMESTAMP,
delivered_at TIMESTAMP,
subtotal DECIMAL(10,2) NOT NULL, -- sum of items
delivery_fee DECIMAL(8,2) DEFAULT 30,
discount DECIMAL(8,2) DEFAULT 0,
total DECIMAL(10,2) NOT NULL,
payment_method VARCHAR(30), -- 'UPI', 'card', 'cash', 'wallet'
payment_status VARCHAR(20) DEFAULT 'pending',
FOREIGN KEY (user_id) REFERENCES users(user_id),
FOREIGN KEY (restaurant_id) REFERENCES restaurants(restaurant_id),
FOREIGN KEY (delivery_agent_id) REFERENCES delivery_agents(agent_id),
FOREIGN KEY (delivery_address_id) REFERENCES addresses(address_id)
);
-- ORDER_ITEM is a WEAK ENTITY (depends on ORDER for identity)
CREATE TABLE order_items (
order_id INT NOT NULL, -- FK to owner (ORDER)
line_no INT NOT NULL, -- partial key
item_id INT NOT NULL,
item_name VARCHAR(200) NOT NULL, -- snapshot at time of order (menu may change)
unit_price DECIMAL(10,2) NOT NULL, -- price at time of order (not current price)
quantity INT NOT NULL CHECK (quantity > 0),
special_instructions TEXT,
PRIMARY KEY (order_id, line_no), -- composite PK = owner PK + partial key
FOREIGN KEY (order_id) REFERENCES orders(order_id) ON DELETE CASCADE,
FOREIGN KEY (item_id) REFERENCES menu_items(item_id)
);
-- M:N relationship: USER reviews RESTAURANT
CREATE TABLE reviews (
review_id SERIAL PRIMARY KEY,
user_id INT NOT NULL,
restaurant_id INT NOT NULL,
order_id INT UNIQUE, -- one review per order
rating INT NOT NULL CHECK (rating BETWEEN 1 AND 5),
comment TEXT,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
UNIQUE (user_id, restaurant_id, order_id),
FOREIGN KEY (user_id) REFERENCES users(user_id),
FOREIGN KEY (restaurant_id) REFERENCES restaurants(restaurant_id),
FOREIGN KEY (order_id) REFERENCES orders(order_id)
);🎯 Key Takeaways
- ✓The ER model is a conceptual design tool — it models the real world before touching SQL. It consists of three building blocks: entities (things), attributes (properties), and relationships (connections).
- ✓Strong entities exist independently and have their own primary key. Weak entities depend on an owner (identifying) entity for their identity. A weak entity's PK = owner's PK + partial key (discriminator).
- ✓Attributes have five types: simple (atomic, one column), composite (store sub-components as separate columns if queried independently), multi-valued (always a separate table), derived (compute dynamically or store with a trigger), and key (the unique identifier).
- ✓Cardinality is the most critical design decision: 1:1 (FK in either table or merge), 1:N (FK always on the many side), M:N (always requires a junction table with composite PK). Getting cardinality wrong creates schemas that cannot represent real data.
- ✓Participation constraints define minimum participation: total participation (every instance must participate) → NOT NULL constraint. Partial participation (participation is optional) → nullable FK.
- ✓Extended ER: Generalisation (bottom-up — extract common attributes into supertype), Specialisation (top-down — define subtypes with specific attributes). Implemented as single table (nullable columns), table per type (JOIN required), or table per concrete type (duplication).
- ✓Aggregation allows a relationship to involve another relationship type — implemented as a junction table with a composite FK pointing to another junction table.
- ✓ER-to-relational mapping follows 7 deterministic rules. The rules are algorithmic: strong entity → table, weak entity → table with composite PK, M:N → junction table, multi-valued → separate table. No guesswork required.
- ✓Design validation: every business requirement should map to a queryable combination of tables. If a question cannot be answered without changing the schema, the ER model is incomplete.
- ✓In schema design interviews: start by identifying entities (nouns), then define cardinalities (ask both directions), then identify relationship attributes, then apply mapping rules. A structured answer demonstrates DBMS maturity far more than jumping straight to CREATE TABLE.
Discussion
0Have a better approach? Found something outdated? Share it — your knowledge helps everyone learning here.