The SQL CASE WHEN expression — also called the case statement in SQL or simply the SQL CASE statement — is the database equivalent of an if-else chain: it evaluates a list of conditions in order, returns the value tied to the first condition that is true, and falls back to ELSE (or NULL) when nothing matches. Almost every analytics SQL query that does anything interesting — letter grades, success rates, paid-vs-refunded revenue, pivot reports, custom sort priorities — eventually reaches for CASE WHEN, which is why it shows up in nearly every screening round under the sql interview questions umbrella.
This guide walks through the two forms of CASE (searched vs simple), every clause where it can appear (SELECT, WHERE, GROUP BY, ORDER BY, HAVING), the conditional aggregation pattern (sum case when sql) that powers half of all data engineering interview questions, the rows-to-columns pivot trick reviewers love, the four NULL traps that fail most candidates, and dialect-specific shortcuts like Postgres FILTER, SQL Server IIF, and Oracle DECODE. Every section ends as sql interview questions with answers: a runnable PostgreSQL query, a traced execution, an output table, and a concept-by-concept why this works breakdown — the exact shape sql for data engineers rounds reward. Whether you reach sql for data engineering prompts on screening calls or sit down to a multi-hour onsite, using case statements in SQL correctly is the single highest-leverage SQL skill you can lock in.
When you want hands-on reps immediately after reading, browse CASE WHEN SQL practice →, drill the conditional logic lane →, sharpen conditional aggregation patterns →, rehearse aggregation SQL drills →, or widen coverage on the full SQL practice library →.
On this page
- Why CASE WHEN matters in data engineering interviews
- Anatomy — searched form vs simple form, ELSE, and NULL behavior
- CASE in SELECT — bucketing, tier labels, and derived columns
- Conditional aggregation — SUM(CASE WHEN …) the canonical interview pattern
- Pivot rows to columns with conditional aggregation
- CASE in WHERE, GROUP BY, ORDER BY, and HAVING
- CASE with NULLIF and COALESCE — safe arithmetic patterns
- Dialect alternatives — FILTER, IIF, DECODE, IF
- Choosing where to put CASE (cheat sheet)
- Frequently asked questions
- Practice on PipeCode
1. Why CASE WHEN matters in data engineering interviews
CASE WHEN is SQL's if-else — the single most versatile conditional primitive
The one-sentence invariant interviewers chase: a SQL CASE WHEN expression is the SQL equivalent of an if-else chain — it evaluates a list of conditions in order and returns the value tied to the first condition that is TRUE, or ELSE (or NULL) if none match. Internalise that and you can lift a third of the SQL interview prompts that look "tricky" — letter grades, success rates, refund accounting, pivot reports — into one expression.
What CASE WHEN does at a glance.
-
Definition — a conditional branching expression that evaluates predicates top-to-bottom and returns the value of the first predicate that is
TRUE. -
Required syntax —
CASE [<expr>] WHEN <pred> THEN <result> [WHEN …] [ELSE <default>] END. - Return type — any scalar value (number, string, date, boolean); branches must agree on a compatible type.
-
Short-circuit evaluation — once a
WHENmatches, the rest are skipped; later predicates never even run.
Two forms ship in ANSI SQL.
-
Searched form (the one you use 95% of the time) — each
WHENholds an arbitrary boolean predicate; full flexibility for ranges,IS NULL, compound predicates. -
Simple form —
CASE <expr> WHEN <value1> THEN …compares<expr>to eachWHEN <value>for equality; cannot express ranges orIS NULL. - ANSI SQL-92 defined both forms; every major dialect (PostgreSQL, MySQL, SQL Server, Oracle, Snowflake, BigQuery) supports them.
Where CASE WHEN shows up in a query.
-
SELECTprojection — derive a new column (bucketing, tier label, computed metric). -
Inside an aggregate —
SUM,COUNT,AVG,MIN,MAX— the conditional aggregation pattern (§4). -
WHEREclause — rare; usually refactorable toIN/OR. -
GROUP BY— group by a tier label or computed bucket. -
HAVING— filter on a conditional aggregate. -
ORDER BY— custom sort priority ("active first, paused middle, churned last"). -
DML statements —
INSERT,UPDATE,DELETEuse CASE to set conditional values. -
Nested inside another
CASE— for hierarchical logic (role + tenure bonus rules).
The four canonical interview patterns powered by CASE.
- Bucketing / tier labels — letter grades, risk bands, age cohorts, RFM segments.
-
Conditional aggregation —
SUM(CASE WHEN cond THEN val ELSE 0 END)for multi-KPI single-pass reports. -
Pivot rows to columns — one
SUM(CASE WHEN category = 'x' THEN val ELSE 0 END)per output column. -
Custom sort priority —
ORDER BY CASE WHEN status = 'urgent' THEN 0 WHEN status = 'high' THEN 1 ELSE 2 END.
What interviewers listen for.
- Do you include
ELSE 0in everySUM(CASE …)? — signals you understand the silent-undercount NULL trap (§2). - Do you choose the searched form over the simple form when ranges are involved? — signals fluency.
- Do you know that
CASE WHEN x = NULLnever matches and useIS NULLinstead? — senior signal. - Do you mention the Postgres
FILTERclause when working on a Postgres-flavoured prompt? — bonus points.
Worked example — bucket employee salaries into tier labels
Question. Add a salary_tier column to the result showing Junior (under 60k), Mid (60k–99,999), Senior (100k–149,999), and Executive (150k or more).
Input. employees(emp_id, name, salary).
| emp_id | name | salary |
|---|---|---|
| 1 | Ava | 55,000 |
| 2 | Omar | 78,500 |
| 3 | Lin | 130,400 |
| 4 | Sam | 165,000 |
Code.
SELECT emp_id,
name,
salary,
CASE
WHEN salary >= 150000 THEN 'Executive'
WHEN salary >= 100000 THEN 'Senior'
WHEN salary >= 60000 THEN 'Mid'
ELSE 'Junior'
END AS salary_tier
FROM employees
ORDER BY salary DESC;
Step-by-step explanation.
- The planner evaluates the
CASEexpression once per row. - Branches are tested top to bottom; the first true
WHENdecides the result. - Sam's 165k hits the first branch (
>= 150000) and becomesExecutive; the rest of the branches are skipped. - Lin's 130k fails the first branch but hits
>= 100000→Senior. - Ava's 55k falls through every
WHENand lands in theELSE→Junior.
Output.
| emp_id | name | salary | salary_tier |
|---|---|---|---|
| 4 | Sam | 165,000 | Executive |
| 3 | Lin | 130,400 | Senior |
| 2 | Omar | 78,500 | Mid |
| 1 | Ava | 55,000 | Junior |
Rule of thumb: order your WHEN branches from most restrictive (highest threshold) to least restrictive so the first match always lands in the intended tier.
SQL
Topic — case-when
CASE WHEN SQL problems
SQL
Topic — conditional-logic
Conditional-logic SQL drills
2. Anatomy — searched form vs simple form, ELSE, and NULL behavior
CASE ships in two forms — pick the searched form for anything beyond simple equality
ANSI SQL-92 defined two CASE forms: the searched form uses arbitrary boolean predicates per WHEN, while the simple form compares a single expression against constant values for equality. The searched form is strictly more expressive — pick it by default and only fall back to the simple form when the equality-only shape genuinely reads cleaner.
Searched form — the one you use 95% of the time.
-
Shape —
CASE WHEN <bool_expr_1> THEN <result_1> [WHEN <bool_expr_2> THEN <result_2>] … [ELSE <default>] END. -
Predicates — any boolean expression:
>=,<,BETWEEN,IS NULL,IN, compoundAND/OR. - Use it when — branches need ranges, NULL checks, multi-column logic, or nested function calls.
-
Reads like an
if-elsechain — interviewers comprehend it instantly.
Simple form — equality only.
-
Shape —
CASE <expr> WHEN <value_1> THEN <result_1> [WHEN <value_2> THEN <result_2>] … [ELSE <default>] END. -
Predicates — equality only:
<expr> = <value_n>per branch. -
Cannot express — ranges,
IS NULL(use searched form for these). -
Reads cleaner when — the input is a stable enum-like column (
status,country_code) and everyWHENis a single literal.
The ELSE clause.
-
Optional — if omitted, a no-match returns
NULL. -
Mandatory in
SUM(CASE …)— withoutELSE 0, false-branch rows returnNULL, whichSUMignores, causing a silent undercount (the canonical interview trap). -
Type compatibility —
ELSEmust return a value compatible with theTHENbranches; mixing types triggers implicit casts. -
Default literal —
'Other',0, or a sentinel that downstream consumers can safely filter.
The four NULL traps (the canonical interview gotchas).
-
Missing
ELSE+SUM—SUM(CASE WHEN cond THEN x END)returnsNULLfor the false branch;SUMignoresNULL; your total is silently undercounted. Fix: always includeELSE 0. -
CASE WHEN x = NULLnever matches —NULL = anythingisUNKNOWN. Fix: useCASE WHEN x IS NULL THEN …. -
Mixed types in branches force implicit casts — branches returning a mix of
INTEGER,DECIMAL,VARCHARcan produce surprises. Fix: cast each branch explicitly to a common type. -
Divide-by-zero via NULL propagation —
revenue / costcrashes whencost = 0. Fix: wrap withNULLIF:revenue / NULLIF(cost, 0)(returnsNULLinstead of raising).
Logical processing order — where does CASE sit?
- The SQL clauses execute in this order regardless of the order you write them:
-
FROM/JOIN→WHERE→GROUP BY→HAVING→ window functions →SELECT(with CASE) →ORDER BY.
-
-
Consequence #1 — a
CASEalias inSELECTcannot be referenced in the same query'sWHERE,GROUP BY, orHAVING(the alias does not yet exist). -
Consequence #2 — to filter on a
CASEresult, wrap the query in a CTE and filter the outer scope. -
Consequence #3 — outer
ORDER BYcan reference theCASEalias becauseORDER BYruns last.
Worked example — searched vs simple form on the same input
Question. Show every order's status with two derived columns: one using the searched form for an "order priority" and one using the simple form for a "status label".
Input. orders(order_id, status).
| order_id | status |
|---|---|
| 1001 | paid |
| 1002 | refunded |
| 1003 | cancelled |
| 1004 | pending |
Code.
SELECT order_id,
status,
CASE
WHEN status IN ('paid', 'shipped') THEN 'fulfilled'
WHEN status IN ('refunded', 'cancelled') THEN 'reversed'
ELSE 'in_progress'
END AS searched_priority,
CASE status
WHEN 'paid' THEN '✓ paid'
WHEN 'refunded' THEN '↩ refunded'
WHEN 'cancelled' THEN '✗ cancelled'
ELSE '… pending'
END AS simple_label
FROM orders
ORDER BY order_id;
Output.
| order_id | status | searched_priority | simple_label |
|---|---|---|---|
| 1001 | paid | fulfilled | ✓ paid |
| 1002 | refunded | reversed | ↩ refunded |
| 1003 | cancelled | reversed | ✗ cancelled |
| 1004 | pending | in_progress | … pending |
Rule of thumb: if your branches use = against a single column, the simple form is shorter; the moment ranges, IS NULL, or compound predicates enter the picture, switch to the searched form.
SQL
Topic — case-expression
CASE expression SQL problems
SQL
Topic — conditional-logic
Conditional logic SQL drills
3. CASE in SELECT — bucketing, tier labels, and derived columns
The bucketing pattern — letter grades, risk bands, RFM segments, NPS buckets
The single most-used CASE pattern is bucketing: take a continuous metric (score, revenue, age) and assign each row to a discrete tier. Letter grades, customer risk bands, RFM segments, NPS buckets, age cohorts, salary tiers, and tenure ranges all share the same shape.
The bucketing pattern at a glance.
-
Shape —
CASE WHEN metric >= t1 THEN 'tier_1' WHEN metric >= t2 THEN 'tier_2' … ELSE 'tier_last' END. -
Top-to-bottom evaluation — first true
WHENdecides the tier; later branches are skipped. -
Branches can overlap intentionally —
WHEN salary >= 150000matches Sam too, butWHEN salary >= 100000is checked only if the previous branch failed, so Sam still lands inExecutive. - Use it for — letter grades (A/B/C/D/F), customer cohorts (high/mid/low value), age bands, salary tiers, RFM segments, NPS buckets (promoter/passive/detractor), risk ratings.
Boundary rules — >= vs > decides who lands in each tier.
-
>=is inclusive at the lower bound;>is exclusive. - State your boundaries aloud — "A is 90+ inclusive, B is 80–89, C is 70–79".
-
Half-open intervals are cleanest —
WHEN score >= 90thenWHEN score >= 80(each implicitly covers up to but not including the next tier's lower bound thanks to top-down evaluation). -
Avoid double-counting — closed intervals like
WHEN score BETWEEN 80 AND 89are correct but verbose; the half-open>=chain is the standard pattern.
Aliasing the derived column.
-
Use
AS <name>—CASE … END AS letter_grademakes the output column self-documenting. -
Reference the alias in
ORDER BY—ORDER BY letter_grade DESCworks (sinceORDER BYruns afterSELECT). -
Cannot reference in same query's
WHERE— wrap with a CTE if you need to filter on the tier. -
Same alias rules apply for
GROUP BYandHAVING— wrap in a CTE or repeat the full CASE expression.
Composing CASE with other functions.
-
String manipulation —
LOWER(CASE WHEN status = 'PAID' THEN 'paid_revenue' ELSE 'other' END). -
Numeric operations —
CASE WHEN x > 0 THEN LOG(x) ELSE NULL ENDfor safe logarithms. -
Date arithmetic —
CASE WHEN sale_date >= CURRENT_DATE - INTERVAL '30 days' THEN 'recent' ELSE 'old' END. - Inside aggregates — see §4 for the conditional aggregation pattern.
Worked example — letter grades from exam scores
Question. Add a letter_grade column for each student based on their score: 90+ is A, 80–89 is B, 70–79 is C, 60–69 is D, below 60 is F.
Input. exam_results(student_id, name, score).
| student_id | name | score |
|---|---|---|
| 1 | Ava | 95 |
| 2 | Omar | 82 |
| 3 | Lin | 71 |
| 4 | Sam | 58 |
Code.
SELECT student_id,
name,
score,
CASE
WHEN score >= 90 THEN 'A'
WHEN score >= 80 THEN 'B'
WHEN score >= 70 THEN 'C'
WHEN score >= 60 THEN 'D'
ELSE 'F'
END AS letter_grade
FROM exam_results
ORDER BY score DESC;
Output.
| student_id | name | score | letter_grade |
|---|---|---|---|
| 1 | Ava | 95 | A |
| 2 | Omar | 82 | B |
| 3 | Lin | 71 | C |
| 4 | Sam | 58 | F |
SQL interview question — categorize employees into salary tiers and count headcount per tier
Assume employees(emp_id, name, salary). Return one row per salary tier (Junior, Mid, Senior, Executive) with the headcount, sorted from highest tier to lowest.
Solution Using CASE in SELECT + GROUP BY tier
Code solution.
WITH labelled AS (
SELECT emp_id,
name,
salary,
CASE
WHEN salary >= 150000 THEN 'Executive'
WHEN salary >= 100000 THEN 'Senior'
WHEN salary >= 60000 THEN 'Mid'
ELSE 'Junior'
END AS tier
FROM employees
)
SELECT tier,
COUNT(*) AS headcount
FROM labelled
GROUP BY tier
ORDER BY CASE tier
WHEN 'Executive' THEN 0
WHEN 'Senior' THEN 1
WHEN 'Mid' THEN 2
ELSE 3
END;
Step-by-step trace.
| step | relation | outcome |
|---|---|---|
| 1 | Scan employees and project the tier label inside the CTE |
every row now carries a tier name |
| 2 |
GROUP BY tier in the outer query |
collapses to one row per tier |
| 3 |
COUNT(*) counts rows per tier |
headcount column produced |
| 4 | Outer ORDER BY CASE tier WHEN 'Executive' THEN 0 … END
|
enforces highest-tier-first sort (alphabetical sort would put Executive last) |
Output:
| tier | headcount |
|---|---|
| Executive | 4 |
| Senior | 11 |
| Mid | 27 |
| Junior | 18 |
Why this works — concept by concept:
-
CASE inside SELECT — projects a derived
tiercolumn based on the salary buckets; top-to-bottom evaluation guarantees that "Executive" wins for salary 165k even though "Senior" would also match>= 100000. -
CTE wrapper — the inner
WITH labelled AS (…)materialises the labelled rows so the outerGROUP BY tiercan reference the alias; without the CTE the same query would need to repeat the entire CASE expression inGROUP BY. -
GROUP BY tier — collapses to one row per distinct tier value;
COUNT(*)counts rows in each group. -
ORDER BY CASE — alphabetic sort would put
ExecutiveafterJuniorandMid; a CASE-based sort key enforces the business order without adding an extra rank column. -
Cost — single sequential scan of
employeesfor the CTE (Θ(n)), hash aggregate on the CTE output for the outerGROUP BY(Θ(n)).
SQL
Topic — case-when
Bucketing / tier-label drills
SQL
Topic — aggregation
Aggregation + CASE combos
4. Conditional aggregation — SUM(CASE WHEN …) the canonical interview pattern
Wrap CASE inside SUM, COUNT, AVG, MIN, MAX to compute multiple KPIs in one grouped pass
The sum case when sql pattern — SUM(CASE WHEN cond THEN val ELSE 0 END) — is the single most-asked CASE WHEN technique in data engineering interviews and the canonical form of conditional aggregation. It lets you compute several KPIs in one grouped query instead of running multiple filtered queries, joining them, or building temp tables.
The canonical pattern at a glance.
-
Shape —
SUM(CASE WHEN <cond> THEN <value> ELSE 0 END) AS <metric>. -
Reads as — "sum
<value>for rows that satisfy<cond>, contribute0otherwise." - Replaces — multiple filtered queries, joins, temp tables, or PIVOT operators.
-
Pairs with
GROUP BYfor per-segment metrics — one CASE per metric column.
Counting with SUM(CASE WHEN cond THEN 1 ELSE 0 END).
-
Counts rows matching a predicate — equivalent to
COUNT(*) FILTER (WHERE cond)in Postgres or a separate filtered subquery. -
THEN 1— contributes one to the count per matching row. -
ELSE 0— non-matching rows contribute zero (and are NOT counted). -
Alternative —
COUNT(CASE WHEN cond THEN 1 END)works too (becauseCOUNT(col)ignoresNULL), butSUM(CASE WHEN cond THEN 1 ELSE 0 END)is more explicit.
Why one query beats many.
- One sequential scan of the source table → multiple KPIs side by side; reduces I/O.
-
Consistent grain — every metric is computed at the same
GROUP BYlevel, so they sit in the same row. - No extra joins or temp tables — saves planner cost and review effort.
-
Refactor-friendly — adding a new KPI is one extra
SUM(CASE …)column, not a new query. - Dashboard-ready — one row per group with all the metrics is exactly what BI tools want.
The ELSE 0 rule — the canonical interview trap.
-
Without
ELSE 0—CASE WHEN cond THEN x ENDreturnsNULLfor the false branch. -
SUMignoresNULL— so missing-ELSE rows silently disappear from the sum. - The bug is invisible — the query "works" and returns a number that looks plausible; only auditing against a separate filtered query catches the undercount.
-
Always write
ELSE 0forSUM— even when you think it cannot matter, write it; reviewers expect it.
Combining conditional sums with COUNT(*) for ratios.
-
Success rate —
SUM(CASE WHEN succeeded THEN 1 ELSE 0 END) * 1.0 / COUNT(*)(the* 1.0forces float division in MySQL / older Postgres). -
Refund rate —
SUM(CASE WHEN status = 'refund' THEN 1 ELSE 0 END) * 1.0 / COUNT(*). -
Click-through rate —
SUM(CASE WHEN clicked THEN 1 ELSE 0 END) * 1.0 / SUM(CASE WHEN impressed THEN 1 ELSE 0 END). -
Always guard the denominator with
NULLIFfor safety (§7).
Worked example — paid vs refunded revenue per day
Question. From orders(order_id, order_ts, status, amount), compute per day: total_revenue, paid_revenue, refunded_amount, and paid_orders count — all in one grouped query.
Input. orders slice.
| order_id | order_ts | status | amount |
|---|---|---|---|
| 1 | 2026-01-01 | paid | 100 |
| 2 | 2026-01-01 | paid | 150 |
| 3 | 2026-01-01 | refund | 50 |
| 4 | 2026-01-02 | paid | 200 |
| 5 | 2026-01-02 | refund | 30 |
Code.
SELECT DATE(order_ts) AS order_date,
SUM(amount) AS total_amount,
SUM(CASE WHEN status = 'paid' THEN amount ELSE 0 END) AS paid_revenue,
SUM(CASE WHEN status = 'refund' THEN amount ELSE 0 END) AS refunded_amount,
SUM(CASE WHEN status = 'paid' THEN 1 ELSE 0 END) AS paid_orders
FROM orders
GROUP BY DATE(order_ts)
ORDER BY 1;
Step-by-step explanation.
-
DATE(order_ts)truncates the timestamp to a calendar day for grouping. -
SUM(amount)aggregates every row regardless of status — gives gross volume per day. - The four
SUM(CASE WHEN …)columns filter rows by status before contributing to each KPI. -
paid_ordersusesTHEN 1to count rows instead of summing amounts. -
GROUP BY DATE(order_ts)collapses to one row per day; every KPI shares the sameorder_dategrain.
Output.
| order_date | total_amount | paid_revenue | refunded_amount | paid_orders |
|---|---|---|---|---|
| 2026-01-01 | 300 | 250 | 50 | 2 |
| 2026-01-02 | 230 | 200 | 30 | 1 |
SQL interview question — paid vs refunded revenue per region in one query
Assume orders(order_id, region, status, amount). Return per region: total_revenue, paid_revenue, refunded_amount, and the count of paid_orders — all in one grouped query.
Solution Using SUM(CASE WHEN …)
Code solution.
SELECT region,
SUM(amount) AS total_revenue,
SUM(CASE WHEN status = 'paid' THEN amount ELSE 0 END) AS paid_revenue,
SUM(CASE WHEN status = 'refund' THEN amount ELSE 0 END) AS refunded_amount,
SUM(CASE WHEN status = 'paid' THEN 1 ELSE 0 END) AS paid_orders
FROM orders
GROUP BY region
ORDER BY paid_revenue DESC;
Step-by-step trace.
| step | relation | outcome |
|---|---|---|
| 1 | Scan orders once |
every order row evaluated |
| 2 | GROUP BY region |
rows grouped by region; planner uses hash aggregate |
| 3 | Four aggregate calls evaluated per group | each SUM(CASE …) independently filters and sums |
| 4 |
paid_orders uses THEN 1 ELSE 0
|
counts paid rows per region without a separate query |
| 5 | Outer ORDER BY paid_revenue DESC
|
best-performing regions first |
Output:
| region | total_revenue | paid_revenue | refunded_amount | paid_orders |
|---|---|---|---|---|
| US | 12,500 | 11,200 | 1,300 | 142 |
| EU | 8,750 | 7,900 | 850 | 98 |
| APAC | 5,400 | 5,100 | 300 | 64 |
Why this works — concept by concept:
-
Conditional aggregation —
SUM(CASE WHEN status = 'paid' THEN amount ELSE 0 END)sums only the rows where the predicate is true; non-matching rows contribute zero, preserving the regional grain. -
The
ELSE 0discipline — without it, refunded rows would contributeNULL,SUMwould silently ignore them, andpaid_revenuewould still look correct — buttotal_revenue - paid_revenue - refunded_amountwould not balance. Always writeELSE 0. - One scan, four KPIs — a naive solution would write four filtered queries (or four joined subqueries); this single grouped query reads the table once and emits all four metrics side by side.
-
Counting with
THEN 1 ELSE 0— converts the conditional sum into a conditional count without a separateCOUNT(*) FILTER (WHERE …)(which is Postgres-only). -
Cost — single sequential scan
Θ(n)plus a hash aggregate keyed onregion(Θ(n)); the four CASE expressions add a constant factor per row.
SQL
Topic — conditional-aggregation
Conditional-aggregation drills
SQL
Topic — aggregation
Aggregation SQL library
5. Pivot rows to columns with conditional aggregation
One SUM(CASE WHEN ...) per output column — portable pivot without PIVOT
When a stakeholder wants "revenue by channel as columns instead of rows", reach for the conditional-aggregation pivot. The pattern produces one row per group with a fixed set of derived columns — far more portable than dialect-specific PIVOT operators.
The pattern at a glance.
-
Shape — one
SUM(CASE WHEN <category_col> = '<value>' THEN <metric> ELSE 0 END) AS <metric>_<value>per output column. -
One row per group — driven by
GROUP BYon the dimension you want as rows. - Stable column names — derived from the literal predicates, so the schema is known at write-time.
-
Replaces — SQL Server's
PIVOT, Postgrescrosstab, multiple joined subqueries.
Why this beats native PIVOT.
-
Portability — works on every dialect that supports
CASE(which is every major dialect). -
Readable — the predicates document the column meanings inline (
channel = 'paid' → rev_paid). -
Composable — easily extends with new metrics by adding more
SUM(CASE …)columns. -
No special grammar — junior reviewers can follow it without learning
PIVOT.
Stable column names.
- Hard-coded in the SQL — the column list is whatever you write; not dependent on the data.
-
Renames are explicit — change
rev_paidtopaid_revenuein one place. - Auditable — the predicate appears in the same expression as the column name, so reviewers can verify each KPI in isolation.
Limitations.
- Static column list — the columns must be known when you write the SQL. For dynamic pivots (where the categories themselves are unknown), reach for procedural SQL, application-side string interpolation, or dialect-specific dynamic SQL.
-
Many categories = wide query — 50 channels means 50
SUM(CASE …)columns; readability suffers. Consider aGROUP BY categoryshape instead. -
Empty groups still appear — an
ELSE 0ensures empty cells render as0rather thanNULL.
Use cases.
-
Campaign performance by channel —
rev_paid/rev_organic/rev_socialcolumns. -
Revenue by tier —
rev_free/rev_basic/rev_pro/rev_enterprise. -
Conversions by funnel step —
step_1_views/step_2_clicks/step_3_purchases. -
NPS distribution —
promoter_count/passive_count/detractor_countper cohort.
SQL interview question — pivot campaign performance by channel
Assume sessions(campaign_id, channel, converted, revenue). Return per campaign_id: total sessions, sessions split by channel (paid / organic / social), total conversions, and revenue split by channel — all in one grouped query.
Solution Using SUM(CASE WHEN channel = …)
Code solution.
SELECT campaign_id,
COUNT(*) AS sessions_total,
SUM(CASE WHEN channel = 'paid' THEN 1 ELSE 0 END) AS sessions_paid,
SUM(CASE WHEN channel = 'organic' THEN 1 ELSE 0 END) AS sessions_organic,
SUM(CASE WHEN channel = 'social' THEN 1 ELSE 0 END) AS sessions_social,
SUM(CASE WHEN converted = TRUE THEN 1 ELSE 0 END) AS conversions_total,
SUM(CASE WHEN channel = 'paid' THEN revenue ELSE 0 END) AS rev_paid,
SUM(CASE WHEN channel = 'organic' THEN revenue ELSE 0 END) AS rev_organic,
SUM(CASE WHEN channel = 'social' THEN revenue ELSE 0 END) AS rev_social
FROM sessions
GROUP BY campaign_id
ORDER BY (SUM(CASE WHEN channel = 'paid' THEN revenue ELSE 0 END)) DESC;
Step-by-step trace.
| step | relation | outcome |
|---|---|---|
| 1 | Scan sessions once |
every row evaluated |
| 2 | GROUP BY campaign_id |
hash aggregate, one row per campaign |
| 3 | 8 aggregate calls evaluated per group | each KPI computed in the same pass |
| 4 | Channel SUM(CASE WHEN channel = … THEN 1 ELSE 0) columns |
session counts pivoted to columns |
| 5 | Channel SUM(CASE WHEN channel = … THEN revenue ELSE 0) columns |
revenue pivoted to columns |
| 6 | Outer ORDER BY on a derived expression |
sort by paid-channel revenue without referencing the alias |
Output:
| campaign_id | sessions_total | sessions_paid | sessions_organic | sessions_social | conversions_total | rev_paid | rev_organic | rev_social |
|---|---|---|---|---|---|---|---|---|
| C-001 | 1,250 | 800 | 350 | 100 | 142 | 12,500 | 4,200 | 1,100 |
| C-002 | 980 | 420 | 510 | 50 | 89 | 6,800 | 6,100 | 540 |
Why this works — concept by concept:
- Pivot via conditional sums — each channel becomes its own column without any dialect-specific PIVOT operator; the column names are derived from the literal predicates and are stable across runs.
-
One scan, all KPIs — the planner reads
sessionsonce and computes every aggregate in parallel using the same hash aggregate node; the alternative would be 4–8 joined subqueries. -
THEN 1for counts,THEN revenuefor sums — the same CASE shape produces a count column or a sum column depending on theTHENexpression. -
ELSE 0everywhere — guarantees empty cells render as0(notNULL); makes the output safe for downstream arithmetic. -
Repeating the CASE expression in
ORDER BY— column aliases cannot be referenced inside aggregates inORDER BYon every dialect; repeating the expression is the portable choice. -
Cost — single sequential scan
Θ(n)plus hash aggregateΘ(n); the 8 CASE expressions are a constant per-row factor.
SQL
Topic — conditional-aggregation
Pivot pattern drills
SQL
Topic — aggregations
Aggregation patterns
6. CASE in WHERE, GROUP BY, ORDER BY, and HAVING
CASE works in every clause — pick the right slot for your intent
CASE WHEN is not confined to the SELECT list — it can drive filtering (WHERE), grouping (GROUP BY), filtering of aggregates (HAVING), and custom sort priority (ORDER BY). Knowing which slot to reach for is its own interview signal.
CASE in WHERE.
-
Rare in practice — usually refactorable to
IN,OR, or boolean combinations. -
Useful when — the condition depends on a column value (e.g. "if
region = 'EU', requiregdpr_consent = TRUE; otherwise no consent check"). -
Shape —
WHERE CASE WHEN region = 'EU' THEN gdpr_consent ELSE TRUE END. -
Reads better as —
WHERE (region <> 'EU') OR (region = 'EU' AND gdpr_consent = TRUE).
CASE in GROUP BY.
-
Group by a tier label —
GROUP BY CASE WHEN salary >= 100000 THEN 'high' ELSE 'low' END. -
Group by a computed bucket —
GROUP BY CASE WHEN age < 18 THEN 'minor' WHEN age < 65 THEN 'adult' ELSE 'senior' END. -
Cannot reference the
SELECTalias — repeat the full CASE expression inGROUP BY(logical order issue). -
Cleanest pattern — define the CASE in a CTE, then
GROUP BY tierin the outer query (Blog72 §3 reference).
CASE in ORDER BY.
-
Custom sort priority —
ORDER BY CASE status WHEN 'urgent' THEN 0 WHEN 'high' THEN 1 ELSE 2 END. -
Sort tier labels in business order — alphabetic sort would scramble
Executive/Senior/Mid/Junior; a CASE expression enforces the right order without an explicit rank column. -
ORDER BYcan reference SELECT aliases —ORDER BY tierworks iftieris a CASE alias in the sameSELECT, but the CASE expression itself is also legal. -
Pair with
DESCfor reverse priority —ORDER BY CASE … END DESC.
CASE in HAVING.
-
Filter aggregates produced by CASE —
HAVING SUM(CASE WHEN tier = 'paid' THEN 1 ELSE 0 END) > 0. -
Multi-condition existence checks — "pages with both paid and free users" →
HAVING SUM(CASE WHEN tier = 'paid' THEN 1 ELSE 0 END) > 0 AND SUM(CASE WHEN tier = 'free' THEN 1 ELSE 0 END) > 0. -
Always references aggregates —
HAVINGruns afterGROUP BY, so bare-column CASE expressions are illegal here unless wrapped in an aggregate.
Same alias rules apply.
-
SELECTaliases are visible inORDER BY—ORDER BY tierworks. -
SELECTaliases are NOT visible inWHERE/GROUP BY/HAVING— logical processing order; the alias does not yet exist when those clauses run. - Workaround — repeat the full CASE expression in the earlier clause, OR wrap the query in a CTE / subquery and filter the outer scope.
Worked example — custom sort order for ticket priority
Question. From tickets(ticket_id, status, opened_at), return every ticket sorted with urgent first, then high, then normal, then low. Inside each priority group, sort by opened_at ascending (oldest first).
Input. tickets slice.
| ticket_id | status | opened_at |
|---|---|---|
| 101 | normal | 2026-05-19 |
| 102 | urgent | 2026-05-20 |
| 103 | high | 2026-05-18 |
| 104 | low | 2026-05-17 |
| 105 | urgent | 2026-05-19 |
Code.
SELECT ticket_id,
status,
opened_at
FROM tickets
ORDER BY
CASE status
WHEN 'urgent' THEN 0
WHEN 'high' THEN 1
WHEN 'normal' THEN 2
WHEN 'low' THEN 3
ELSE 4
END,
opened_at ASC;
Step-by-step explanation.
- The CASE expression assigns a numeric sort key per ticket —
urgentgets0,highgets1, etc. - The planner sorts rows by the CASE key first; ties on the CASE key are broken by
opened_atascending. - Within
urgent, ticket 105 (2026-05-19) appears before ticket 102 (2026-05-20). - The
ELSE 4lands any unexpected status values at the bottom.
Output.
| ticket_id | status | opened_at |
|---|---|---|
| 105 | urgent | 2026-05-19 |
| 102 | urgent | 2026-05-20 |
| 103 | high | 2026-05-18 |
| 101 | normal | 2026-05-19 |
| 104 | low | 2026-05-17 |
SQL
Topic — conditional-filtering
Conditional-filtering drills
SQL
Topic — case-when
CASE WHEN SQL library
7. CASE with NULLIF and COALESCE — safe arithmetic patterns
NULLIF guards divide-by-zero; COALESCE replaces NULL with a default — both pair beautifully with CASE
CASE WHEN rarely ships alone. The two conditional functions that pair best with it are NULLIF (returns NULL when two values are equal) and COALESCE (returns the first non-NULL argument). Together they let you write bulletproof safe-division and empty-group fallbacks.
NULLIF(a, b) at a glance.
-
Shape —
NULLIF(a, b). -
Returns —
NULLwhena = b; otherwise returnsa. -
Equivalent to —
CASE WHEN a = b THEN NULL ELSE a END. -
Classic use —
x / NULLIF(y, 0)returnsNULL(instead of erroring) wheny = 0.
COALESCE(a, b, c, …) at a glance.
-
Shape —
COALESCE(a, b, c, …). -
Returns — the first non-
NULLargument from left to right. -
Use cases — replace
NULLwith0, with a default string, or with a fallback column. -
Short-circuit — stops evaluating after the first non-
NULLis found.
Safe-division template.
-
Pattern —
<numerator> / NULLIF(<denominator>, 0). -
Behaviour — returns
NULLwhen the denominator is zero; everywhere else returns the quotient. -
Pair with
COALESCE—COALESCE(<numerator> / NULLIF(<denominator>, 0), 0)replaces theNULLwith0for downstream arithmetic. - Use it for — success rates, click-through rates, refund rates, conversion rates, churn rates — every ratio where the denominator could legitimately be zero.
Empty-group bulletproofing.
-
Pattern —
COALESCE(SUM(CASE WHEN cond THEN amount ELSE 0 END), 0). -
Why —
SUMover zero rows returnsNULL(not0); withoutCOALESCE, the empty group renders asNULLand breaks downstream consumers. -
Combined with
FILTER(Postgres) —COALESCE(SUM(amount) FILTER (WHERE cond), 0).
Composition order — outer to inner.
-
COALESCEon the outside — final fallback so the result is neverNULL. -
NULLIFon the inside — protects intermediate expressions (denominators) from invalid values. -
CASTlast (or first) — explicit type casting prevents implicit-cast surprises. -
Example —
ROUND(COALESCE(SUM(CASE WHEN succeeded THEN 1 ELSE 0 END) * 1.0 / NULLIF(COUNT(*), 0), 0), 4)reads "round the success rate; fall back to 0 if no rows; guard divide-by-zero with NULLIF; force float division with* 1.0."
SQL interview question — success rate per region with safe division
Assume events(event_id, region, succeeded) where succeeded is a boolean. Return per region: total events, successes, failures, and success_rate (successes / total). Regions with zero events should render as success_rate = 0, not crash.
Solution Using SUM(CASE WHEN …) + NULLIF
Code solution.
SELECT region,
COUNT(*) AS total_events,
SUM(CASE WHEN succeeded THEN 1 ELSE 0 END) AS successes,
SUM(CASE WHEN NOT succeeded THEN 1 ELSE 0 END) AS failures,
ROUND(
COALESCE(
SUM(CASE WHEN succeeded THEN 1 ELSE 0 END) * 1.0
/ NULLIF(COUNT(*), 0),
0
),
4
) AS success_rate
FROM events
GROUP BY region
ORDER BY success_rate DESC;
Step-by-step trace.
| step | relation | outcome |
|---|---|---|
| 1 | Scan events, group by region |
hash aggregate |
| 2 | SUM(CASE WHEN succeeded THEN 1 ELSE 0 END) |
counts successes per region |
| 3 | SUM(CASE WHEN NOT succeeded THEN 1 ELSE 0 END) |
counts failures per region |
| 4 | COUNT(*) |
total events per region |
| 5 | NULLIF(COUNT(*), 0) |
converts zero-event groups to NULL, preventing divide-by-zero |
| 6 | successes * 1.0 / NULLIF(COUNT(*), 0) |
float division; NULL when denominator is zero |
| 7 | COALESCE(…, 0) |
replaces the NULL with 0 for empty groups |
| 8 | ROUND(…, 4) |
normalises the output to 4 decimal places |
Output:
| region | total_events | successes | failures | success_rate |
|---|---|---|---|---|
| US | 1,000 | 950 | 50 | 0.9500 |
| EU | 800 | 760 | 40 | 0.9500 |
| APAC | 600 | 540 | 60 | 0.9000 |
Why this works — concept by concept:
-
SUM(CASE WHEN succeeded THEN 1 ELSE 0) — counts successful events without a separate filtered query; the explicit
ELSE 0prevents the silent-undercount NULL trap. -
NULLIF(COUNT(*), 0) — turns the zero-denominator case into
NULL, which propagates safely instead of raising a divide-by-zero error. -
* 1.0for float division — without it, MySQL and older Postgres versions perform integer division and truncatesuccesses / totalto0or1. -
COALESCE(…, 0) — the outer fallback ensures zero-event groups render as
0.0000rather thanNULL; safe for downstream BI tools. - ROUND(…, 4) — normalises the precision; reviewers expect financial / rate metrics to come back with a stable decimal scale.
-
Cost — single sequential scan
Θ(n)plus hash aggregate; the three CASE expressions and the safe-division wrappers are constant per-row factors.
SQL
Topic — conditional-aggregation
Safe-aggregation drills
SQL
Topic — aggregation
Aggregation SQL drills
8. Dialect alternatives — FILTER, IIF, DECODE, IF
Same semantics, cleaner syntax — know the dialect-specific shorthand
Every major dialect supports the standard CASE WHEN syntax, but most also ship a shorter idiom for the conditional-aggregation use case. Knowing the alternatives is a senior-signal in panels at companies that lean on a specific stack.
PostgreSQL FILTER (WHERE …).
-
Shape —
SUM(amount) FILTER (WHERE status = 'paid'). -
Replaces —
SUM(CASE WHEN status = 'paid' THEN amount ELSE 0 END). -
Why it wins — reads top-to-bottom as "sum amounts, filtered to paid"; no
ELSE 0needed. - Identical plan — the planner emits the same hash aggregate; this is pure readability sugar.
- Available since — PostgreSQL 9.4 (and SQL standard SQL:2003).
SQL Server IIF(cond, then, else).
-
Shape —
IIF(channel = 'paid', revenue, 0). -
Replaces —
CASE WHEN channel = 'paid' THEN revenue ELSE 0 ENDfor binary conditions. - Limitation — only two branches; for >2 branches, fall back to CASE.
-
Use case — concise conditional aggregation:
SUM(IIF(channel = 'paid', revenue, 0)).
MySQL IF(cond, then, else).
-
Shape —
IF(channel = 'paid', revenue, 0). -
Identical to SQL Server's
IIF— same binary-condition shorthand. -
Use case —
SUM(IF(channel = 'paid', revenue, 0))for compact conditional aggregation in MySQL. -
Not portable —
IFdoes not exist in Postgres or Oracle; useCASEif your query must run elsewhere.
Oracle DECODE(expr, val1, res1, val2, res2, …, default).
-
Shape —
DECODE(channel, 'paid', revenue, 'organic', revenue, 0). -
Replaces —
CASE channel WHEN 'paid' THEN revenue WHEN 'organic' THEN revenue ELSE 0 END(simple form). -
Limitations — equality only (no ranges); legacy syntax; new Oracle code prefers
CASE. -
Recognise it — DBA-heavy panels at Oracle shops may quiz you on
DECODEsemantics.
Snowflake / BigQuery.
-
Standard
CASEonly — neitherIIFnorDECODEis available. -
Use
CASE WHEN— the portable choice across cloud-native warehouses. -
BigQuery
IF(cond, then, else)— available as a function; same semantics as MySQL'sIF.
Portability rule.
-
Write
CASE WHENwhen the same query must run across multiple dialects (e.g. dbt models that target both Postgres and Snowflake). -
Reach for
FILTER/IIF/IFonly when the codebase is single-dialect and the team prefers the shorter idiom. - Document dialect-specific code with a comment so future maintainers know why it's there.
Worked example — same query in four dialects
Question. Show paid revenue and refunded amount per region using the cleanest idiom available in PostgreSQL, SQL Server, MySQL, and Oracle.
PostgreSQL (FILTER).
SELECT region,
SUM(amount) FILTER (WHERE status = 'paid') AS paid_revenue,
SUM(amount) FILTER (WHERE status = 'refund') AS refunded_amount
FROM orders
GROUP BY region;
SQL Server (IIF).
SELECT region,
SUM(IIF(status = 'paid', amount, 0)) AS paid_revenue,
SUM(IIF(status = 'refund', amount, 0)) AS refunded_amount
FROM orders
GROUP BY region;
MySQL (IF).
SELECT region,
SUM(IF(status = 'paid', amount, 0)) AS paid_revenue,
SUM(IF(status = 'refund', amount, 0)) AS refunded_amount
FROM orders
GROUP BY region;
Oracle / portable (CASE WHEN).
SELECT region,
SUM(CASE WHEN status = 'paid' THEN amount ELSE 0 END) AS paid_revenue,
SUM(CASE WHEN status = 'refund' THEN amount ELSE 0 END) AS refunded_amount
FROM orders
GROUP BY region;
Rule of thumb: default to CASE WHEN unless the codebase is locked to one dialect; even then, mention the standard form alongside the shortcut so reviewers know you understand both.
SQL
Topic — case-when
Dialect-portable CASE WHEN drills
SQL
Topic — conditional-aggregation
Conditional aggregation library
Choosing where to put CASE (cheat sheet)
A one-screen cheat sheet for using case statements in SQL — pick the clause that matches your intent, then copy the corresponding shape into your query.
| You want to … | Put CASE in … | Notes |
|---|---|---|
| Derive a tier label or computed column |
SELECT projection |
Use AS <alias> for readability |
| Count rows matching a predicate | SUM(CASE WHEN cond THEN 1 ELSE 0 END) |
Or COUNT(*) FILTER (WHERE cond) in Postgres |
| Sum a value only when a condition holds | SUM(CASE WHEN cond THEN value ELSE 0 END) |
Always include ELSE 0 to avoid the silent-undercount NULL trap |
| Pivot rows to columns |
SUM(CASE WHEN dim = 'x' THEN val ELSE 0 END) per output column |
Portable replacement for PIVOT
|
| Group rows by a tier | GROUP BY CASE WHEN … THEN … END |
Repeat the CASE expression; the SELECT alias is not visible here |
| Filter aggregates by a condition | HAVING SUM(CASE WHEN … THEN 1 ELSE 0 END) > 0 |
Multi-condition existence checks |
| Custom sort priority | ORDER BY CASE WHEN status = … THEN 0 … END |
Numeric sort keys per business priority |
| Apply a column-dependent filter | WHERE CASE WHEN region = 'EU' THEN gdpr_consent ELSE TRUE END |
Often refactorable to OR / AND; reach for CASE when the condition depends on the row |
| Safe division | <num> / NULLIF(<denom>, 0) |
Pair with COALESCE(..., 0) for empty-group fallback |
| Replace NULL with a default | COALESCE(<col>, <default>) |
Equivalent to CASE WHEN <col> IS NULL THEN <default> ELSE <col> END
|
Frequently asked questions
What's the difference between CASE WHEN and IF-ELSE in SQL?
CASE WHEN is the ANSI-standard conditional expression and works in every major SQL dialect; it evaluates predicates top-to-bottom and returns the value tied to the first match, or ELSE (or NULL) when nothing matches. IF / IIF are dialect-specific shorthand for the binary-condition case — MySQL has IF(cond, then, else), SQL Server has IIF(cond, then, else), and both reduce to a single CASE WHEN cond THEN then ELSE else END. Use CASE WHEN for anything beyond two branches and for portability; use IF / IIF only when the codebase is locked to a single dialect and the team prefers the shorter idiom.
What's the difference between the searched form and the simple form of CASE?
The searched form (CASE WHEN <bool_expr> THEN … END) accepts arbitrary boolean predicates per WHEN — ranges, IS NULL, compound AND / OR are all legal. The simple form (CASE <expr> WHEN <value> THEN … END) compares one expression to a series of constant values for equality only; it cannot express ranges or null checks. The searched form is strictly more expressive and is the one you should default to. Reach for the simple form only when every branch is <expr> = <constant> and the equality-only shape genuinely reads cleaner.
How do I use CASE WHEN with SUM (sum case when sql)?
Wrap a CASE inside SUM — the sum case when sql pattern — to compute a metric only for rows that satisfy a predicate: SUM(CASE WHEN <cond> THEN <value> ELSE 0 END) AS <metric>. This is the canonical interview pattern — it lets you compute multiple KPIs in one grouped query instead of running multiple filtered queries, joining them, or building temp tables. The single most important rule is to always include ELSE 0 — without it, false-branch rows return NULL, SUM ignores NULL, and the total is silently undercounted. For counting rows, use THEN 1 ELSE 0; for summing values, use THEN <value> ELSE 0.
How do I pivot rows to columns using CASE WHEN?
Write one SUM(CASE WHEN <category_col> = '<value>' THEN <metric> ELSE 0 END) AS <metric>_<value> per output column, then GROUP BY the dimension you want as rows. Each category becomes its own column with a stable name derived from your literal predicates. This pattern is far more portable than dialect-specific PIVOT operators (SQL Server's PIVOT, Postgres's crosstab extension), works on every database that supports CASE, and extends easily — adding a new column is just one more SUM(CASE …) expression. The only limitation is that the column list must be known when you write the SQL; for dynamic pivots, reach for procedural SQL or application-side templating.
What's the PostgreSQL FILTER clause and how does it relate to CASE WHEN?
The PostgreSQL FILTER (WHERE <cond>) clause is a cleaner native idiom for conditional aggregation: SUM(amount) FILTER (WHERE status = 'paid') is identical in plan and result to SUM(CASE WHEN status = 'paid' THEN amount ELSE 0 END). It reads top-to-bottom as "sum amounts, filtered to paid rows," requires no ELSE 0, and is part of the SQL:2003 standard. FILTER works on every aggregate (SUM, COUNT, AVG, MIN, MAX) and is the preferred Postgres-native form. For portability across MySQL, SQL Server, Oracle, Snowflake, and BigQuery — none of which support FILTER — fall back to CASE WHEN.
Practice on PipeCode
PipeCode ships 450+ data-engineering interview problems — including PostgreSQL-first SQL practice keyed to CASE WHEN, conditional logic, conditional aggregation, pivot-via-CASE patterns, and safe-arithmetic drills with NULLIF + COALESCE.
Kick off via Explore practice →; drill the dedicated CASE WHEN SQL lane →; fan out into the conditional-logic SQL lane →; reinforce conditional-aggregation drills →; rehearse aggregation SQL drills →; widen coverage on the full SQL practice library →.





Top comments (0)