Python · SQL · Web Dev · Java · AI/ML tracks launching soon — your one platform for all of IT
Advanced

Recursive CTEs

Query hierarchical and graph data without fixed-depth self-joins — org charts, category trees, bill-of-materials, path finding, and number generation using WITH RECURSIVE

16–24 min April 2026
Section 11 · Window Functions & CTEs
Window Functions & CTEs · 5 modulesModule 56

// Part 01

The Problem — Hierarchies Have Unknown Depth

Most SQL queries work on flat data — each row is independent. But real-world data is often hierarchical: an employee reports to a manager who reports to a director who reports to a VP. A product category contains subcategories which contain sub-subcategories. A delivery route visits nodes that connect to other nodes.

Querying these structures with ordinary SQL requires knowing the depth in advance. To find all employees under a VP you would write a chain of self-joins — one join per level. This breaks the moment the hierarchy deepens or shallows. Recursive CTEs solve this cleanly: they repeat a query against their own previous output until no new rows are produced, traversing hierarchies of any depth without knowing the depth in advance.

The problem a recursive CTE solves
-- WITHOUT recursive CTE: fixed-depth self-joins (brittle)
-- Works only if hierarchy is exactly 3 levels deep
SELECT e1.name AS level1, e2.name AS level2, e3.name AS level3
FROM employees e1
JOIN employees e2 ON e2.manager_id = e1.employee_id
JOIN employees e3 ON e3.manager_id = e2.employee_id
WHERE e1.manager_id IS NULL;
-- Breaks when someone adds a 4th level

-- WITH RECURSIVE: traverses any depth automatically
WITH RECURSIVE org_tree AS (
  -- Anchor: start at the root (no manager)
  SELECT employee_id, name, manager_id, 0 AS depth
  FROM employees
  WHERE manager_id IS NULL

  UNION ALL

  -- Recursive: join each employee to their manager's row
  SELECT e.employee_id, e.name, e.manager_id, ot.depth + 1
  FROM employees AS e
  JOIN org_tree   AS ot ON e.manager_id = ot.employee_id
)
SELECT * FROM org_tree ORDER BY depth, name;
-- Works for 1 level or 100 levels — same query

// Part 02

Anatomy of a Recursive CTE

A recursive CTE has exactly two parts separated by UNION ALL: the anchor member (a regular SELECT that starts the recursion) and the recursive member (a SELECT that references the CTE itself, joining the previous iteration's output to the base table). The database alternates between running the recursive member and accumulating results until the recursive member produces zero new rows.

Recursive CTE structure — every part explained
WITH RECURSIVE cte_name AS (

  -- ── ANCHOR MEMBER ───────────────────────────────────────────
  -- Runs exactly ONCE. Seeds the recursion.
  -- Returns the starting rows (roots, first elements, initial values).
  -- Must NOT reference cte_name.
  SELECT col1, col2, 1 AS depth, col1::TEXT AS path
  FROM base_table
  WHERE starting_condition

  UNION ALL       -- always UNION ALL (not UNION) in recursive CTEs

  -- ── RECURSIVE MEMBER ────────────────────────────────────────
  -- Runs REPEATEDLY until it produces 0 new rows.
  -- References cte_name — joins the PREVIOUS iteration's output.
  -- Each iteration sees only the rows added by the PREVIOUS iteration.
  SELECT bt.col1, bt.col2, r.depth + 1, r.path || ' > ' || bt.col1
  FROM base_table AS bt
  JOIN cte_name   AS r   -- r = previous iteration's rows
    ON bt.parent_id = r.col1
  WHERE r.depth < 10     -- ALWAYS add a depth/cycle guard

)
-- Final SELECT: reads ALL accumulated rows from all iterations
SELECT * FROM cte_name ORDER BY depth;

-- Execution model:
-- Iteration 0: anchor runs → produces root rows → added to result
-- Iteration 1: recursive runs with iteration 0 rows → adds children
-- Iteration 2: recursive runs with iteration 1 rows → adds grandchildren
-- ...continues until recursive produces 0 rows
-- Final result: UNION ALL of all iterations
⚠️ Important
Always include a termination condition in the recursive member — either a WHERE depth < N guard or a WHERE NOT EXISTS cycle check. Without it, a cycle in the data (A → B → A) causes infinite recursion. PostgreSQL enforces a maximum recursion depth (default 100) but hitting it raises an error rather than returning partial results.

// Part 03

Number and Date Series Generation

The simplest recursive CTE generates a series of numbers or dates — anchoring at a start value and incrementing each iteration. This is a clean alternative to generate_series() and works across all SQL databases.

Integer series

Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…

Date series — fill calendar gaps

Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…
Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…

// Part 04

Hierarchical Data — Org Charts and Category Trees

The classic recursive CTE use case: traversing a parent-child relationship stored in a self-referencing table. The same table has both the child rows and the parent rows — the recursive CTE walks from root to leaves (top-down) or from leaf to root (bottom-up).

FreshCart employee org chart — top-down traversal

Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…

Bottom-up traversal — find all ancestors of an employee

Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…

Count reports at each level

Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…

// Part 05

Product Category Trees — Multi-Level Navigation

E-commerce category hierarchies are a textbook recursive CTE use case. Electronics → Mobiles → Smartphones → Android Phones. The same self-referencing table stores every level. The recursive CTE walks the tree to produce breadcrumb paths, flattened lists, or roll-up totals.

Category tree — schema and recursive traversal
-- Self-referencing categories table:
-- CREATE TABLE categories (
--   category_id   INTEGER PRIMARY KEY,
--   category_name TEXT NOT NULL,
--   parent_id     INTEGER REFERENCES categories(category_id),
--   level         INTEGER DEFAULT 0
-- );

-- Top-down traversal with breadcrumb path
WITH RECURSIVE category_tree AS (
  -- Anchor: root categories (no parent)
  SELECT
    category_id,
    category_name,
    parent_id,
    0                         AS depth,
    category_name::TEXT       AS breadcrumb,
    category_id::TEXT         AS sort_path
  FROM categories
  WHERE parent_id IS NULL

  UNION ALL

  SELECT
    c.category_id,
    c.category_name,
    c.parent_id,
    ct.depth + 1,
    ct.breadcrumb || ' > ' || c.category_name,
    ct.sort_path || '.' || LPAD(c.category_id::TEXT, 6, '0')
  FROM categories  AS c
  JOIN category_tree AS ct ON c.parent_id = ct.category_id
  WHERE ct.depth < 8
)
SELECT
  REPEAT('  ', depth) || category_name   AS indented_name,
  breadcrumb,
  depth                                  AS level
FROM category_tree
ORDER BY sort_path;
Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…

// Part 06

Path Finding — Shortest Route Between Nodes

Recursive CTEs can traverse graph structures — tables where rows represent edges (connections between nodes). The classic application is finding paths between stores, delivery waypoints, or network nodes. The recursive member builds paths by adding one edge per iteration.

Graph traversal — path finding between nodes
-- Delivery route graph:
-- CREATE TABLE routes (
--   from_node TEXT,
--   to_node   TEXT,
--   distance  INTEGER
-- );

WITH RECURSIVE paths AS (
  -- Anchor: start at the origin node, zero distance
  SELECT
    from_node,
    to_node,
    distance,
    ARRAY[from_node, to_node]     AS visited_nodes,
    distance                      AS total_distance,
    1                             AS hops
  FROM routes
  WHERE from_node = 'Seattle'

  UNION ALL

  SELECT
    r.from_node,
    r.to_node,
    r.distance,
    p.visited_nodes || r.to_node,
    p.total_distance + r.distance,
    p.hops + 1
  FROM routes AS r
  JOIN paths  AS p ON r.from_node = p.to_node
  -- Cycle prevention: don't revisit a node already in the path
  WHERE NOT (r.to_node = ANY(p.visited_nodes))
    AND p.hops < 10
)
SELECT
  visited_nodes[1]              AS origin,
  to_node                       AS destination,
  array_to_string(visited_nodes, ' → ') AS full_path,
  total_distance,
  hops
FROM paths
WHERE to_node = 'New York'
ORDER BY total_distance
LIMIT 5;
Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…

// Part 07

Bill of Materials — Exploding Nested Components

A bill of materials (BOM) is a hierarchical parts list: a product contains components, each component contains sub-components, and so on. Recursive CTEs explode BOMs to compute total quantities and costs at every level — without knowing how many levels deep the nesting goes.

Bill of materials explosion
-- BOM table: each row = one parent contains N units of one child
-- CREATE TABLE bom (
--   parent_id   INTEGER,
--   child_id    INTEGER,
--   quantity    NUMERIC,
--   unit_cost   NUMERIC
-- );

WITH RECURSIVE bom_explosion AS (
  -- Anchor: start with the top-level product (no parent)
  SELECT
    child_id          AS component_id,
    child_id::TEXT    AS path,
    quantity,
    quantity          AS total_quantity,
    unit_cost,
    quantity * unit_cost AS total_cost,
    0                 AS depth
  FROM bom
  WHERE parent_id = 101   -- product ID 101

  UNION ALL

  SELECT
    b.child_id,
    be.path || ' > ' || b.child_id::TEXT,
    b.quantity,
    -- Multiply child quantity by parent's total quantity
    be.total_quantity * b.quantity,
    b.unit_cost,
    be.total_quantity * b.quantity * b.unit_cost,
    be.depth + 1
  FROM bom            AS b
  JOIN bom_explosion  AS be ON b.parent_id = be.component_id
  WHERE be.depth < 8
)
SELECT
  component_id,
  path,
  depth,
  quantity           AS qty_per_parent,
  total_quantity     AS total_qty_needed,
  unit_cost,
  ROUND(total_cost, 2) AS total_cost
FROM bom_explosion
ORDER BY path;

-- Roll up: total cost of the entire product
SELECT SUM(total_cost) AS total_product_cost
FROM bom_explosion;
Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…
Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…

// Part 08

Cycle Detection and Prevention

A cycle occurs when following parent-child links eventually leads back to a node already visited — A is parent of B, B is parent of C, C is parent of A. Without protection, the recursive CTE loops infinitely. Two mechanisms prevent this: a depth limit (simple but imprecise) and a visited-array check (precise cycle detection).

Cycle prevention — depth limit vs visited array
-- Method 1: depth limit (simple — stops after N iterations)
WITH RECURSIVE safe_traverse AS (
  SELECT id, parent_id, name, 0 AS depth
  FROM nodes WHERE parent_id IS NULL

  UNION ALL

  SELECT n.id, n.parent_id, n.name, st.depth + 1
  FROM nodes         AS n
  JOIN safe_traverse AS st ON n.parent_id = st.id
  WHERE st.depth < 20    -- stops at depth 20 regardless of cycles
)
SELECT * FROM safe_traverse;
-- Risk: if a cycle exists deeper than 20 levels, it is silently cut off
-- But for most real hierarchies, depth 20 is a safe upper bound

-- Method 2: visited array (precise — detects actual cycles)
WITH RECURSIVE cycle_safe AS (
  SELECT id, parent_id, name,
         ARRAY[id] AS visited    -- track IDs seen in this path
  FROM nodes WHERE parent_id IS NULL

  UNION ALL

  SELECT n.id, n.parent_id, n.name,
         cs.visited || n.id
  FROM nodes      AS n
  JOIN cycle_safe AS cs ON n.parent_id = cs.id
  WHERE NOT (n.id = ANY(cs.visited))  -- stop if we've seen this ID before
)
SELECT * FROM cycle_safe;
-- Precise: stops exactly when a cycle is detected
-- Cost: array grows with depth — slight overhead for deep hierarchies

-- Method 3: both (belt and suspenders for production)
WHERE NOT (n.id = ANY(cs.visited)) AND cs.depth < 50
Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…

// Part 09

Aggregating Over Hierarchies — Rollup Totals

Once a recursive CTE has mapped the full hierarchy, joining it back to fact data enables rollup aggregations — total sales for a category including all its subcategories, total headcount under each manager including all indirect reports.

Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…

// Part 10

What This Looks Like at Work

You are a data engineer at Amazon. The supply chain team needs a complete product component cost report — for any product, show every raw component it contains at any depth, the cumulative quantity needed, and the total cost contribution. The product hierarchy is 3–5 levels deep and changes weekly. They need this as an on-demand SQL query, not a hardcoded report.

2:00 PM
Requirement: explode any product's BOM on demand
Given a product_id, return all components at every depth, their cumulative quantities (multiplied through the hierarchy), and total cost. Works for any product regardless of depth.
2:30 PM
Design: recursive CTE anchors at the product, recurses into components
Anchor = direct components of the product. Recursive = sub-components of each component. Each iteration multiplies the child quantity by the parent's cumulative quantity. Depth guard prevents runaway recursion.
Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…
Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…
3:30 PM
Report complete — works for any product, any depth
The recursive CTE handles a 3-level hierarchy today and a 6-level hierarchy next quarter without any query changes. The supply chain team parameterises it by changing the basket_id value. The cumulative quantity multiplication propagates correctly through every level — 3 Juice Packs × 3 Bottles each = 9 total bottles, correctly computed at depth 1.

🎯 Pro Tip

When building recursive CTEs for BOM explosion, always track cumulative_qty by multiplying the child's qty_per_parent by the parent's cumulative_qty at each level — not just the qty_at_level. A 2-unit parent containing 3-unit children means 6 children total, not 3. The multiplication must compound through every level of the hierarchy. Test the cumulative_qty column first on a small known BOM before running on production data.

// Part 11

Interview Prep — 5 Questions With Complete Answers

Q: What is a recursive CTE and when would you use one?

A recursive CTE is a Common Table Expression that references itself in its own definition. It has two parts separated by UNION ALL: an anchor member (a regular SELECT that runs once and seeds the recursion) and a recursive member (a SELECT that references the CTE by name and joins the previous iteration's output to the base table). The database alternates between running the recursive member against the previous iteration's rows and accumulating results, until the recursive member produces zero new rows.

Use a recursive CTE when the data has a hierarchical or graph structure stored in a self-referencing table — a parent_id column that points back to the same table's primary key. The key advantage over self-joins: recursive CTEs traverse hierarchies of unknown depth. A fixed chain of self-joins works only when you know the exact number of levels in advance. Recursive CTEs adapt automatically to any depth.

Common use cases: organisational hierarchies (employee → manager relationships), product category trees (category → parent category), bill of materials (product → components → sub-components), network/graph traversal (finding paths between nodes), and generating series (sequences of numbers or dates). Any time you find yourself writing a chain of self-joins to handle levels of a hierarchy, replace it with a recursive CTE — the query becomes shorter, more correct, and depth-independent.

Q: Explain the anchor and recursive members of a recursive CTE.

The anchor member is the first SELECT in the recursive CTE — the one that does not reference the CTE itself. It runs exactly once when the CTE begins executing and produces the initial set of rows that seeds the recursion. For a top-down hierarchy traversal, the anchor typically selects root nodes (WHERE parent_id IS NULL). For a bottom-up traversal, the anchor selects a specific leaf node to start from. For a number series, the anchor selects the starting value (SELECT 1 AS n).

The recursive member is the second SELECT — it references the CTE by name and joins the CTE's rows (which represent the previous iteration's output) to the base table to find the next level of rows. Each iteration of the recursive member sees only the rows produced by the immediately preceding iteration — not all accumulated rows. The recursive member must produce fewer and fewer rows as the recursion progresses, eventually producing zero rows to terminate.

The two members are connected by UNION ALL (not UNION). Using UNION would deduplicate across iterations, which is both semantically wrong for most hierarchies and significantly slower. The final result of the CTE is the UNION ALL of all rows produced by all iterations — the anchor's rows plus every recursive iteration's rows combined. The outer SELECT that queries the CTE reads this complete accumulated result set.

Q: How do you prevent infinite recursion in a recursive CTE?

Infinite recursion occurs when following parent-child links leads back to a node already visited (a cycle: A → B → C → A) or when the termination condition is never satisfied. Without protection, the recursive member keeps producing rows forever — PostgreSQL enforces a default max_recursion_depth of 100 and raises an error when hit, but the error is less useful than intentional termination.

Two protection mechanisms. Method 1: depth limit — add a depth counter to the anchor (0 AS depth) and increment it in the recursive member (depth + 1). Add WHERE depth < N to the recursive member. The recursion stops after N iterations regardless of data. Simple and cheap, but N must be chosen carefully — too low and valid deep hierarchies are cut off, too high and cycles spin for N iterations before stopping. For most org charts N = 10–20 is safe; for BOMs N = 5–8 is typical.

Method 2: visited array — maintain an array of node IDs seen in the current path. In the anchor, initialise it with ARRAY[id]. In the recursive member, append the new node's ID and add WHERE NOT (new_id = ANY(visited_array)). This stops the moment a cycle is detected, at exactly the right point regardless of depth. Cost: the array grows with depth and IS DISTINCT FROM comparisons are slightly more expensive than integer comparison. For production code on data that might have cycles, use both: the visited array for correctness and a depth limit as a safety backstop.

Q: How do you generate a date series using a recursive CTE?

The pattern: anchor on the start date, recursive member adds one day per iteration, terminate when the date reaches the end date. WITH RECURSIVE date_series AS (SELECT '2024-01-01'::DATE AS dt UNION ALL SELECT dt + 1 FROM date_series WHERE dt < '2024-01-31') SELECT dt FROM date_series.

The primary use of date series in analytics is filling gaps in time-series data. When querying daily revenue, days with no orders produce no rows — the result has gaps. A date series CTE produces every calendar date, and a LEFT JOIN to the revenue data fills in zero for missing days. This ensures reports show a complete timeline without gaps, which is essential for charts and moving averages that depend on consecutive date sequences.

For monthly series, add INTERVAL '1 month' instead of +1: SELECT '2024-01-01'::DATE AS m UNION ALL SELECT (m + INTERVAL '1 month')::DATE FROM months WHERE m < '2024-12-01'. PostgreSQL also provides the built-in generate_series(start, end, interval) function which is more concise for this purpose, but the recursive CTE pattern is portable across all SQL databases that support WITH RECURSIVE (PostgreSQL, SQL Server, MySQL 8+, SQLite 3.35+).

Q: What is the difference between a recursive CTE and a self-join for hierarchical queries?

A self-join joins a table to itself on a specific relationship — typically parent_id = id. A single self-join produces one level of the hierarchy: employees joined to their direct managers. Two self-joins produce two levels. N self-joins produce N levels. The critical limitation: the number of self-joins must be known and hardcoded. If the hierarchy deepens from 4 to 5 levels, the query must be rewritten to add another self-join. This is brittle — any schema change to the depth of the hierarchy requires a query change.

A recursive CTE traverses any depth automatically. The recursive member defines the relationship between one level and the next. The database applies this definition repeatedly until no new rows are produced. A 4-level hierarchy and a 40-level hierarchy use the identical query — only the data depth differs, not the SQL. This makes recursive CTEs the correct tool for any hierarchy where the depth is variable, unknown at query time, or changes over time.

Performance comparison: for shallow, fixed-depth hierarchies (always exactly 2-3 levels), self-joins can be marginally faster because the query plan is simpler and the planner can optimise fully. For variable-depth hierarchies, recursive CTEs are both more correct (they handle any depth) and practically faster (a 5-level self-join requires 5 table scans with Cartesian intermediate results; a recursive CTE performs one scan per level with only the relevant rows). The readability advantage strongly favours recursive CTEs — 10 lines replacing 50 lines of chained self-joins, with no depth limitation.

// Part 12

Errors You Will Hit — And Exactly Why They Happen

ERROR: WITH RECURSIVE query has no self-reference — recursive term missing

Cause: The recursive member of the CTE does not reference the CTE's own name. A recursive CTE requires the second SELECT (after UNION ALL) to join to the CTE itself — that is what makes it recursive. Without the self-reference, PostgreSQL raises this error because the CTE structure is syntactically recursive (WITH RECURSIVE) but semantically it is just a regular CTE.

Fix: Ensure the recursive member contains a FROM or JOIN that references the CTE by name: WITH RECURSIVE org AS (SELECT ... FROM employees WHERE manager_id IS NULL UNION ALL SELECT e.* FROM employees AS e JOIN org ON e.manager_id = org.employee_id). The JOIN org ON ... is the self-reference. If the query genuinely does not need recursion, remove RECURSIVE from WITH RECURSIVE — a non-recursive CTE does not require the keyword and can use UNION instead of UNION ALL.

ERROR: maximum recursion depth exceeded (100)

Cause: The recursive CTE ran 100 iterations without the recursive member producing zero rows. Either the data contains a cycle (A is parent of B, B is parent of A), there is no termination condition in the WHERE clause of the recursive member, or the hierarchy is genuinely deeper than 100 levels.

Fix: Add a depth counter and termination condition: 0 AS depth in the anchor, depth + 1 in the recursive member, and WHERE depth < 50 in the recursive member's WHERE clause. For cycle detection, add a visited array: ARRAY[id] in the anchor, visited || new_id in the recursive member, and WHERE NOT (new_id = ANY(visited)) in the WHERE clause. To increase the limit temporarily (not recommended for production): SET max_recursion_depth = 200. Always prefer fixing the data or adding a proper termination condition over raising the limit.

Recursive CTE returns duplicate rows — same node appears multiple times

Cause: The hierarchy data contains a node with multiple parents (a diamond pattern: A → B → D and A → C → D), causing the recursive member to reach node D twice via different paths. In a true tree (each node has exactly one parent), this should not happen — its presence indicates either bad data or that the data is a directed acyclic graph (DAG) rather than a tree.

Fix: If the data is a DAG (multiple parents allowed), use UNION instead of UNION ALL between anchor and recursive members — UNION deduplicates and ensures each node appears only once in the result. Note: UNION is slower than UNION ALL because deduplication requires sorting or hashing. Alternatively, use a visited array to skip nodes already accumulated. If the data should be a tree and duplicates indicate bad data, audit the parent_id assignments: SELECT child_id, COUNT(DISTINCT parent_id) FROM hierarchy_table GROUP BY child_id HAVING COUNT(DISTINCT parent_id) > 1 — any result here is a node with multiple parents.

BOM explosion produces wrong cumulative quantities — sub-component totals are incorrect

Cause: The cumulative quantity is not being multiplied through the hierarchy correctly. A common mistake: using qty_per_parent instead of cumulative_qty * qty_per_parent for the child's cumulative quantity. If a product contains 3 packs and each pack contains 4 bottles, the correct total is 12 bottles — but if the recursive member uses only 4 (qty_per_parent of the pack) instead of 3 × 4 = 12 (parent's cumulative_qty × child's qty_per_parent), the total is wrong.

Fix: In the recursive member, compute the child's cumulative quantity as parent.cumulative_qty * child.qty_per_parent — always multiply by the parent's cumulative quantity, not just the level's quantity. Verify with a small known BOM: manually calculate the expected quantities for each component and compare to the CTE output. Add a CHECK column to the output showing the multiplication: parent_cumulative_qty || ' × ' || qty_per_parent || ' = ' || cumulative_qty to make the calculation visible and auditable.

Date series CTE generates incorrect dates — skips days or produces wrong range

Cause: The termination condition uses a strict less-than (<) when it should include the end date, or the date arithmetic is using INTERVAL in a way that skips non-existent dates (e.g., adding months to January 31 produces March 2 instead of February 28). Alternatively, the cast from INTERVAL addition back to DATE is missing, causing a TIMESTAMP result that does not match a DATE column in the LEFT JOIN.

Fix: For daily series: use WHERE dt <= '2024-01-31' (inclusive) or WHERE dt < '2024-02-01' (exclusive boundary). Both include January 31. For monthly series: use (month_start + INTERVAL '1 month')::DATE — the ::DATE cast is essential. PostgreSQL's INTERVAL '1 month' handles month-end correctly: '2024-01-31'::DATE + INTERVAL '1 month' = '2024-02-29' (in a leap year), not March 2. Always cast the result back to DATE to ensure the join condition with DATE columns works correctly.

Try It Yourself

Build two recursive CTEs using FreshCart data. (1) Write a recursive CTE that generates every date in the range of FreshCart's order data (from the MIN order_date to the MAX order_date), then LEFT JOIN it to orders to produce a complete daily revenue calendar showing: calendar_date, daily_revenue (0 if no orders that day), order_count (0 if none), and a 3-day moving average of daily_revenue using a window function on the result. (2) Write a recursive CTE that traverses the FreshCart employee table from root (manager_id IS NULL) downward, producing: employee_id, full_name, job_title, depth (0 for root), indented_name (REPEAT spaces + name), full_path (names joined by ' → '), and direct_report_count (how many employees have this person as their manager_id — compute with a subquery or join). Order the org chart by the path so siblings appear together.

🎯 Key Takeaways

  • A recursive CTE has two parts joined by UNION ALL: an anchor member (runs once, seeds the recursion) and a recursive member (references the CTE itself, runs repeatedly until it produces zero rows).
  • The recursive member sees only the rows from the immediately preceding iteration — not all accumulated rows. Each iteration adds a new level of the hierarchy to the accumulated result.
  • Always include a termination condition: a depth counter with WHERE depth < N (simple) or a visited array with WHERE NOT (id = ANY(visited)) (precise cycle detection). Both together is safest for production.
  • Top-down traversal: anchor on roots (WHERE parent_id IS NULL), recursive member joins children to parents. Bottom-up traversal: anchor on a specific leaf, recursive member joins parent to child via manager_id = previous.employee_id.
  • Date series: anchor on start date, recursive increments by +1 or INTERVAL '1 month'. LEFT JOIN to fact data fills gaps with zero — essential for complete time-series charts and moving averages.
  • Build path strings by concatenating in the recursive member: path || ' → ' || new_name. Build sort keys with LPAD(id, 6, '0') so siblings sort correctly by numeric ID while maintaining lexicographic path order.
  • BOM explosion: cumulative quantity must multiply through every level — child.cumulative_qty = parent.cumulative_qty × child.qty_per_parent. Not just qty_per_parent alone.
  • Cycle prevention with visited array: ARRAY[id] in anchor, visited || new_id in recursive member, WHERE NOT (new_id = ANY(visited)) to stop. Use UNION (not UNION ALL) for DAGs where a node can have multiple parents.
  • Recursive CTEs replace chains of self-joins for hierarchical data. Self-joins require knowing the depth in advance; recursive CTEs handle any depth with the same query.
  • PostgreSQL default max_recursion_depth is 100. Hit it by accident means a cycle or missing termination condition — fix the query, not the limit. SET max_recursion_depth is a last resort.

What comes next

In Module 57, you learn EXPLAIN and Query Optimisation — reading execution plans, identifying bottlenecks, rewriting slow queries, and the systematic approach to making any query faster.

Module 57 → EXPLAIN and Query Optimisation
Share

Discussion

0

Have a better approach? Found something outdated? Share it — your knowledge helps everyone learning here.

Continue with GitHub
Loading...