Python · SQL · Web Dev · Java · AI/ML tracks launching soon — your one platform for all of IT
Advanced

Ranking Functions

Advanced ranking — percentile rank, cumulative distribution, conditional rankings, multi-level leaderboards, and every production pattern where ranking drives business decisions

14–20 min April 2026
Section 11 · Window Functions & CTEs
Window Functions & CTEs · 5 modulesModule 53

// Part 01

The Full Ranking Toolkit

Module 52 introduced ROW_NUMBER, RANK, DENSE_RANK, and NTILE. This module goes deeper — covering the statistical ranking functions PERCENT_RANK and CUME_DIST, the ordered-set aggregate PERCENTILE_CONT, conditional and multi-level ranking, and the production patterns that appear in leaderboards, cohort analysis, and performance tiers.

Complete ranking function reference
-- Positional ranking (covered in Module 52):
ROW_NUMBER() OVER (PARTITION BY grp ORDER BY col)  -- 1,2,3,4 — always unique
RANK()       OVER (PARTITION BY grp ORDER BY col)  -- 1,1,3,4 — gaps after ties
DENSE_RANK() OVER (PARTITION BY grp ORDER BY col)  -- 1,1,2,3 — no gaps
NTILE(n)     OVER (ORDER BY col)                   -- bucket 1..n

-- Statistical / fractional ranking (this module):
PERCENT_RANK() OVER (ORDER BY col)
-- Relative rank 0.0 to 1.0 — what fraction of rows rank below this row
-- Formula: (rank - 1) / (total_rows - 1)
-- First row: 0.0,  Last row: 1.0

CUME_DIST() OVER (ORDER BY col)
-- Cumulative distribution: fraction of rows <= current row's value
-- Formula: rank / total_rows  (including ties — all tied rows same value)
-- Values: 0 < CUME_DIST <= 1.0

-- Ordered-set aggregates (called differently — no OVER()):
PERCENTILE_CONT(0.5)  WITHIN GROUP (ORDER BY col)  -- median (interpolated)
PERCENTILE_DISC(0.5)  WITHIN GROUP (ORDER BY col)  -- median (actual value in dataset)
MODE() WITHIN GROUP (ORDER BY col)                  -- most frequent value

// Part 02

PERCENT_RANK and CUME_DIST — Statistical Position

PERCENT_RANK answers "what percentage of rows rank strictly below this one?" — ranging from 0.0 (lowest) to 1.0 (highest). CUME_DIST answers "what fraction of rows have a value less than or equal to this row's value?" — always greater than 0.0, up to 1.0. Both are useful for expressing a row's position as a relative fraction rather than an absolute rank number.

Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…
Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…

PERCENT_RANK vs CUME_DIST — the key difference

PERCENT_RANK vs CUME_DIST — formula and interpretation
-- For a set of 5 rows with values: 100, 200, 200, 300, 400

-- PERCENT_RANK: (rank - 1) / (total - 1)
-- Row 100: rank=1 → (1-1)/(5-1) = 0.0    (0% of rows rank below)
-- Row 200: rank=2 → (2-1)/(5-1) = 0.25   (25% rank strictly below)
-- Row 200: rank=2 → (2-1)/(5-1) = 0.25   (same rank = same pct_rank)
-- Row 300: rank=4 → (4-1)/(5-1) = 0.75   (75% rank below)
-- Row 400: rank=5 → (5-1)/(5-1) = 1.0    (100% rank below)

-- CUME_DIST: rows_leq / total_rows
-- Row 100: 1 row ≤ 100  → 1/5 = 0.2
-- Row 200: 3 rows ≤ 200 → 3/5 = 0.6   (both 200s get same value)
-- Row 200: 3 rows ≤ 200 → 3/5 = 0.6
-- Row 300: 4 rows ≤ 300 → 4/5 = 0.8
-- Row 400: 5 rows ≤ 400 → 5/5 = 1.0

-- Key differences:
-- PERCENT_RANK can be 0.0 (first row always 0); CUME_DIST is always > 0
-- PERCENT_RANK: how many rows are STRICTLY BELOW (exclusive)
-- CUME_DIST:    how many rows are <= this value (inclusive)
-- For selecting top N%: use PERCENT_RANK >= (1 - N/100)
--                  e.g. top 20%: PERCENT_RANK >= 0.8
Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…

// Part 03

PERCENTILE_CONT and PERCENTILE_DISC — True Percentile Values

PERCENT_RANK and CUME_DIST tell you where a row sits in the distribution. PERCENTILE_CONT and PERCENTILE_DISC answer the inverse question: what value sits at the Nth percentile? These are ordered-set aggregates — they collapse rows like GROUP BY and return a single value representing a specific point in the distribution.

PERCENTILE_CONT vs PERCENTILE_DISC
-- PERCENTILE_CONT: continuous interpolation between adjacent values
-- Returns a value that may not exist in the dataset (interpolated)
PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY total_amount)
-- If the median falls between 850 and 900, returns 875.0

-- PERCENTILE_DISC: discrete — returns the actual nearest value in the dataset
PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY total_amount)
-- Returns either 850 or 900 (whichever is the actual value at that position)

-- PERCENTILE_CONT is almost always preferred:
-- - Standard mathematical definition of percentile
-- - Returns smooth values for any fraction 0.0–1.0
-- - PERCENTILE_DISC can surprise: P(0.5) of [1,2,3,4] gives 2, not 2.5

-- Use with GROUP BY for per-group percentiles:
SELECT
  store_id,
  PERCENTILE_CONT(0.25) WITHIN GROUP (ORDER BY total_amount) AS p25,
  PERCENTILE_CONT(0.50) WITHIN GROUP (ORDER BY total_amount) AS median,
  PERCENTILE_CONT(0.75) WITHIN GROUP (ORDER BY total_amount) AS p75,
  PERCENTILE_CONT(0.95) WITHIN GROUP (ORDER BY total_amount) AS p95
FROM orders
WHERE order_status = 'Delivered'
GROUP BY store_id;
Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…
Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…

// Part 04

Conditional Ranking — Rank Only a Subset

Sometimes you need to rank rows only when they meet a condition — rank only active products, rank only delivered orders, rank employees within their own department. The FILTER clause and CASE expressions extend ranking functions to handle these scenarios cleanly.

Rank within a filtered subset using a CTE

Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…
Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…

Rank with a FILTER clause — conditional aggregation inside ranking

Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…

// Part 05

Multi-Level Rankings — Ranking Within Ranking

Production leaderboards often need hierarchy: rank stores within a city, then cities within a region. Or rank products within a category, then rank categories overall. Multi-level rankings use multiple window functions with different PARTITION BY clauses in the same query.

Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…
Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…

// Part 06

Leaderboard Patterns — Production-Grade Rankings

Production leaderboards need more than a simple RANK() — they need movement indicators (up/down from last period), badges for sustained performance, and handling of new entrants. These patterns combine ranking functions with LAG for previous-period comparison.

Leaderboard with movement arrows

Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…

Sustained performance badge — consistently top-ranked

Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…

// Part 07

Percentile-Based Segmentation — Deciles and Quartiles

Percentile-based segmentation divides a continuous distribution into equal-sized groups for analysis. Quartiles (4 groups), deciles (10 groups), and percentiles (100 groups) are standard tools for customer segmentation, product tiering, and performance distribution analysis.

Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…
Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…

// Part 08

Ranking for Deduplication — Keep One Row Per Group

ROW_NUMBER is the cleanest way to deduplicate — keep exactly one row per group based on a defined priority. The pattern appears constantly in data engineering: keep the most recent record, keep the highest-value record, keep the first occurrence.

Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…
Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…

🎯 Pro Tip

When deduplicating, always add a tiebreaker to the ORDER BY inside ROW_NUMBER. If two rows have the same primary sort value (same order_date, same revenue), the one that gets rn = 1 is non-deterministic without a tiebreaker. Add a unique column (order_id, product_id) as the final ORDER BY key to ensure deterministic, reproducible results across query executions.

// Part 09

What This Looks Like at Work

You are a data analyst at Amazon. The seller success team runs a monthly seller health report: for each seller, their revenue rank within their category, their overall revenue rank, their percentile position, their quartile segment, and whether they moved up or down compared to the previous month. The ranking logic alone would take multiple queries without window functions — with them, it is one SELECT on two pre-aggregated CTEs.

2:00 PM
Requirements: seller health report with 6 ranking metrics
Category rank, overall rank, percentile, quartile, MoM rank movement, performance badge. Adapted for FreshCart stores.
Loading FreshCart DB…
Ctrl + Enter to run
Loading FreshCart database in your browser…
3:15 PM
Complete ranking report — delivered in 75 minutes
Six ranking metrics computed in a single SELECT over two CTEs. The seller success team has a complete, sortable report with movement arrows, percentile positions, and performance badges. Adding a seventh metric — say, a city-level rank — requires one more window function expression, not a new query.

🎯 Pro Tip

When building leaderboard reports, always compute rankings in a CTE layer rather than in the final SELECT. This avoids repeating the same RANK() OVER (...) expression multiple times (for the badge CASE and for display). Extract ranks into a CTE with columns rank_jan and rank_feb, then reference those columns in the final SELECT for comparisons and badges. The query runs faster (each OVER() is computed once) and is easier to read.

// Part 10

Interview Prep — 5 Questions With Complete Answers

Q: What is the difference between PERCENT_RANK and CUME_DIST?

Both express a row's position in the distribution as a fraction between 0 and 1, but they compute different things. PERCENT_RANK answers "what fraction of rows rank strictly below this row?" It is computed as (rank - 1) / (total_rows - 1). The lowest value always gets 0.0 (zero rows rank below it) and the highest always gets 1.0. Tied rows get the same PERCENT_RANK value. If two rows share the minimum value, both get PERCENT_RANK = 0.0.

CUME_DIST answers "what fraction of rows have a value less than or equal to this row's value?" It is computed as rows_with_value_leq_current / total_rows. CUME_DIST is always greater than zero — even the lowest-value rows have themselves counted, so the minimum is 1/total_rows. For the highest value, CUME_DIST = 1.0. All rows sharing the same value get the same CUME_DIST value, and it includes themselves in the count.

Practical implication: PERCENT_RANK is better for "is this row in the top X%?" comparisons — WHERE PERCENT_RANK() OVER (...) >= 0.8 finds the top 20% cleanly. CUME_DIST is useful when you want to know what proportion of the population has a value at or below a specific point — for example, "what fraction of orders are at or below ₹500?" Both are window functions requiring OVER() with ORDER BY — unlike PERCENTILE_CONT and PERCENTILE_DISC which are ordered-set aggregates that collapse rows like GROUP BY.

Q: When would you use NTILE vs PERCENT_RANK for segmentation?

NTILE(n) divides rows into exactly n buckets with approximately equal row counts — it is a row-count-based division. NTILE(4) assigns each row to quartile 1, 2, 3, or 4, where each quartile contains roughly 25% of the rows. The bucket assignment depends on row position, not value — two rows with very different values could both be in quartile 3 if they are adjacent in the ordered sequence.

PERCENT_RANK is value-based — it computes the row's fractional rank relative to the distribution of values. Filtering WHERE PERCENT_RANK() OVER (...) >= 0.75 finds the top 25% by value, not by count. If the top 5% of customers account for 60% of revenue (a right-skewed distribution), NTILE splits 25% of customers per bucket regardless of value distribution, while PERCENT_RANK >= 0.75 identifies customers whose spend puts them in the top 25% by value rank.

Choose NTILE for equal-count segments — marketing campaigns that need to reach exactly 25% of the customer base per segment, A/B test splits where equal group sizes matter. Choose PERCENT_RANK for value-based thresholds — "find customers above the 75th percentile of spending", "identify products in the bottom 10% of revenue", "classify orders by where they fall in the distribution". NTILE can produce unintuitive results when the data is skewed — a product that just barely makes the top quartile may be far below the actual 75th percentile value if most revenue is concentrated in the top few products. For business reporting where "top 25%" means value-based, PERCENT_RANK is more meaningful.

Q: How do you handle ties when ranking for deduplication?

When using ROW_NUMBER for deduplication (keeping exactly one row per group), ties in the ORDER BY column create non-deterministic behaviour — the database may assign rn = 1 to any of the tied rows, and the choice may change between query executions or database versions. This is a correctness problem: the same query run twice could return different rows, making results unreproducible.

The solution is always to add a tiebreaker column as the final ORDER BY key — a column that is guaranteed unique within the partition, typically the primary key. ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY order_date DESC, order_id DESC) — if two orders have the same order_date, the one with the higher order_id gets rn = 1. This ensures deterministic, reproducible results regardless of physical row order or internal optimiser decisions.

The tiebreaker column choice matters semantically: order_id DESC for the most recently inserted duplicate, created_at ASC for the original/oldest record. Document the tiebreaker logic — when someone asks "which record do you keep when there are duplicates?", the ORDER BY clause inside ROW_NUMBER should have a clear, business-meaningful answer. For RANK and DENSE_RANK (which are designed to express ties), no tiebreaker is needed — tied rows intentionally share the same rank. Only ROW_NUMBER requires a tiebreaker because its contract is to produce unique sequential integers.

Q: What is the difference between PERCENTILE_CONT and PERCENTILE_DISC?

Both compute the value at a specified percentile of a distribution, but they use different methods for non-integer positions. PERCENTILE_CONT uses linear interpolation — when the percentile falls between two actual data values, it returns the weighted average of those two values. PERCENTILE_DISC returns the nearest actual value in the dataset — no interpolation.

For the median (50th percentile) of [100, 200, 300, 400]: PERCENTILE_CONT(0.5) returns 250 (the midpoint of 200 and 300, interpolated). PERCENTILE_DISC(0.5) returns 200 or 300 (the actual data value nearest the 50th percentile position). For an odd number of values like [100, 200, 300]: both would return 200 since 200 is exactly at the 50th percentile.

Choose PERCENTILE_CONT for statistical analysis where mathematical precision matters — it follows the standard definition of percentile that most statisticians and data scientists expect. Choose PERCENTILE_DISC when you need the result to be an actual value from the dataset — for example, when selecting the median salary to identify a specific employee as the median earner. These are ordered-set aggregates (unlike the window function PERCENT_RANK) — they use WITHIN GROUP (ORDER BY column) syntax and collapse rows like GROUP BY. They do not have OVER() — they are called like standard aggregate functions: SELECT store_id, PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY total_amount) AS median FROM orders GROUP BY store_id.

Q: How would you build a leaderboard that shows rank movement from the previous period?

The pattern requires computing ranks for two time periods independently, then comparing them. Use two CTEs — one per period — each aggregating the relevant metric (revenue, order count, score) per entity. Then a final SELECT joins the two CTEs and applies RANK() to each period's metric with separate OVER() clauses. The rank movement is the previous period rank minus the current period rank — positive means improved (moved up), negative means declined (moved down), zero means unchanged.

Implementation: WITH current_period AS (SELECT entity_id, SUM(metric) AS score FROM ... WHERE date IN current_range GROUP BY entity_id), previous_period AS (SELECT entity_id, SUM(metric) AS score FROM ... WHERE date IN previous_range GROUP BY entity_id), combined AS (SELECT COALESCE(c.entity_id, p.entity_id) AS entity_id, COALESCE(c.score, 0) AS current_score, COALESCE(p.score, 0) AS prev_score FROM current_period c FULL OUTER JOIN previous_period p ON c.entity_id = p.entity_id) SELECT *, RANK() OVER (ORDER BY current_score DESC) AS current_rank, RANK() OVER (ORDER BY prev_score DESC) AS prev_rank, RANK() OVER (ORDER BY prev_score DESC) - RANK() OVER (ORDER BY current_score DESC) AS rank_change FROM combined.

Two important details: use FULL OUTER JOIN between periods to include entities that appear in one period but not the other (new entrants have no previous rank; churned entities have no current rank). Use COALESCE(score, 0) so new entrants get rank assigned based on zero score rather than being excluded. For the movement indicator display, convert rank_change to a symbol: CASE WHEN rank_change > 0 THEN '▲ ' || rank_change WHEN rank_change < 0 THEN '▼ ' || ABS(rank_change) ELSE '—' END. For entities that did not exist in the previous period, show 'NEW' instead of a movement arrow. The FULL OUTER JOIN naturally handles this — entities with no previous record have prev_rank based on zero score.

// Part 11

Errors You Will Hit — And Exactly Why They Happen

PERCENT_RANK returns 0.0 for all rows — expected a spread of values

Cause: The OVER() clause is missing ORDER BY or has an ORDER BY with all identical values. PERCENT_RANK requires ORDER BY to define the ranking — without it, all rows are considered tied at the same rank, making (rank - 1) / (total - 1) = 0 for all rows. Alternatively, all rows genuinely have the same value in the ORDER BY column, making all ranks equal.

Fix: Add ORDER BY to the OVER clause: PERCENT_RANK() OVER (ORDER BY total_amount). Verify that the ORDER BY column has distinct values — check SELECT COUNT(DISTINCT total_amount) FROM table. If all values truly are identical, PERCENT_RANK correctly returns 0.0 for all rows (all tied at rank 1). If ORDER BY is present and values vary but PERCENT_RANK still shows 0.0, check if PARTITION BY has created single-row partitions — a partition of one row always has PERCENT_RANK = 0.0.

NTILE produces unequal bucket sizes — expected exactly 25% per quartile

Cause: NTILE distributes rows as evenly as possible, but when total rows is not evenly divisible by n, some buckets get one extra row. NTILE(4) on 10 rows produces buckets of sizes 3, 3, 2, 2 — not all size 2 or 3. This is correct behaviour — NTILE makes no guarantee of exact equal sizes, only approximately equal sizes.

Fix: This is expected and correct — document it to stakeholders. The standard definition of NTILE distributes the remainder rows to the first buckets. To get exact equal-count segments regardless of remainder, use PERCENT_RANK for value-based thresholds instead: WHERE PERCENT_RANK() OVER (ORDER BY col) >= 0.75 for the top 25%. This is value-based (not row-count-based) but conceptually matches what most people mean by 'top quartile'. If exact row counts per bucket are required, calculate the target rows per bucket (FLOOR(COUNT(*) / n)) and use ROW_NUMBER with arithmetic to assign buckets manually.

ROW_NUMBER deduplication returns different rows on different executions — non-deterministic

Cause: The ORDER BY inside ROW_NUMBER() OVER (PARTITION BY ... ORDER BY ...) has ties — multiple rows share the same ORDER BY value. When two rows have identical sort values, the database can assign either one as rn = 1, and this choice may differ between executions, server restarts, or after updates to table statistics. The result is non-deterministic deduplication.

Fix: Add a unique column as the final tiebreaker in the ORDER BY: ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY order_date DESC, order_id DESC). The order_id is unique, so no two rows will have identical (order_date, order_id) combinations. The tiebreaker guarantees deterministic assignment — the row with the highest order_id among those with the same order_date always gets rn = 1. Document which tiebreaker was chosen and why — it represents a business decision about which duplicate to keep.

PERCENTILE_CONT syntax error — cannot use OVER() with WITHIN GROUP

Cause: PERCENTILE_CONT and PERCENTILE_DISC are ordered-set aggregate functions, not window functions. They use a completely different syntax: PERCENTILE_CONT(fraction) WITHIN GROUP (ORDER BY column). Attempting to add OVER() to them (PERCENTILE_CONT(0.5) OVER (PARTITION BY ...)) is a syntax error — they do not support the window function OVER() clause.

Fix: Use the correct ordered-set aggregate syntax with GROUP BY for per-group percentiles: SELECT store_id, PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY total_amount) AS median FROM orders GROUP BY store_id. For a per-row percentile rank (window function equivalent), use PERCENT_RANK() OVER (ORDER BY col) instead — this is the window function version that works with OVER() and PARTITION BY. The two serve different purposes: PERCENTILE_CONT returns the VALUE at a percentile; PERCENT_RANK returns the RANK of a row as a fraction.

Multi-level ranking produces wrong results — inner rank affected by outer partition

Cause: When computing ranks at multiple levels (store rank within city AND city rank overall), using a single PARTITION BY that mixes both levels can produce incorrect results. For example, RANK() OVER (PARTITION BY city ORDER BY city_total_revenue DESC) when applied to a row-level table assigns ranks based on repeated city_total_revenue values — every store in the same city gets city_total_revenue rank = 1 because they all share the same value.

Fix: Compute each ranking level independently in its own CTE. CTE 1: aggregate to store level and compute store_rank OVER (PARTITION BY city ORDER BY store_revenue DESC). CTE 2: aggregate to city level and compute city_rank OVER (ORDER BY city_revenue DESC). CTE 3: join the two CTEs to produce the combined result. Never attempt to compute city-level ranks in a query that still has store-level granularity — the duplicate city values in every store row will produce incorrect rank assignments.

Try It Yourself

Build a comprehensive product ranking report using multiple ranking functions. The report should show for each product: product_id, product_name, category, unit_price, margin_pct (rounded to 1dp), total revenue from delivered orders (0 if none), times_ordered (distinct order count), and the following rankings: (1) price_rank — RANK by unit_price DESC within category, (2) revenue_rank — RANK by total revenue DESC within category, (3) overall_revenue_rank — DENSE_RANK by total revenue DESC across all products, (4) revenue_percentile — PERCENT_RANK by revenue across all products (as a 0–100 value rounded to 1dp), (5) revenue_quartile — NTILE(4) by revenue across all products (Q4 = highest), (6) a performance_tier: 'Elite' if both price_rank and revenue_rank are <= 2, 'Revenue Leader' if revenue_rank = 1, 'Premium Priced' if price_rank = 1, 'Standard' otherwise. Sort by category then revenue_rank.

🎯 Key Takeaways

  • PERCENT_RANK: fractional rank from 0.0 (lowest) to 1.0 (highest) — fraction of rows ranking strictly below. Formula: (rank - 1) / (total_rows - 1). Use WHERE PERCENT_RANK() >= 0.8 to find top 20%.
  • CUME_DIST: cumulative distribution — fraction of rows with value ≤ current row. Always > 0, up to 1.0. Differs from PERCENT_RANK by including the current row in the count.
  • PERCENTILE_CONT(f) WITHIN GROUP (ORDER BY col): returns the interpolated value at fraction f of the distribution. Uses GROUP BY syntax, not OVER(). PERCENTILE_DISC returns the nearest actual value from the dataset.
  • NTILE(n): equal-count buckets. When rows do not divide evenly by n, early buckets get one extra row. Use for equal-sized segments; use PERCENT_RANK for value-based thresholds.
  • Conditional ranking: wrap the window function in a CASE — return the rank when the condition is true, NULL otherwise. Or pre-filter in a CTE and apply the ranking to the filtered subset.
  • Multi-level rankings (store within city, city overall) must be computed in separate CTE layers at their respective granularities. Never compute city-level ranks in a store-level query — duplicate values produce incorrect rank assignments.
  • ROW_NUMBER deduplication: always add a unique tiebreaker column as the final ORDER BY key. Without a tiebreaker, ties produce non-deterministic assignment of rn = 1 — different rows may be kept on different executions.
  • Leaderboard with movement: FULL OUTER JOIN two period CTEs, compute RANK() OVER (...) on each period's metric independently, subtract to get rank_change. Positive = improved, negative = declined.
  • Pre-compute ranks in a CTE when the same RANK() expression appears in multiple CASE branches — the CTE computes it once, the outer SELECT references the named column without repeated computation.
  • PERCENT_RANK and CUME_DIST require ORDER BY in OVER(). PERCENTILE_CONT and PERCENTILE_DISC use WITHIN GROUP (ORDER BY) and are aggregate functions — no OVER() clause.

What comes next

In Module 54, you learn Analytics with LAG and LEAD — every period-over-period pattern, retention analysis, funnel drop-off, session gap detection, and time-series analytics built on offset window functions.

Module 54 → Analytics with LAG and LEAD
Share

Discussion

0

Have a better approach? Found something outdated? Share it — your knowledge helps everyone learning here.

Continue with GitHub
Loading...