sql pivot turns long-format rows into wide-format columns: one row per category becomes one column per category, with the aggregate value populating each cell. sql unpivot is the inverse — one column per metric becomes one row per metric, restoring the long shape that downstream pipelines and BI tools prefer. These two reshape operators answer the bulk of the sql interview questions in the "make this data wide" / "make this data long" cluster, and getting pivot in sql right — especially the dialect divergence between SQL Server's native PIVOT operator and PostgreSQL's SUM(CASE WHEN …) idiom — separates candidates who pass screening from candidates who don't.
This guide walks through every clause in the sql pivot unpivot family that reviewers love to test in data engineering interview questions: the long-vs-wide mental model, the native PIVOT / UNPIVOT syntax in SQL Server / Oracle / Snowflake, the portable SUM(CASE WHEN …) idiom that works on every dialect (PostgreSQL, MySQL, every flavour without a PIVOT keyword), how to handle dynamic column lists when the pivot values are unknown at query time, the inverse UNPIVOT and its portable UNION ALL rewrite, and the seven gotchas (NULL cells, dynamic SQL, non-unique pivot keys) that fail most candidates. Every section ends as sql interview questions with answers: a runnable PostgreSQL query, a traced execution, an output table, and a concept-by-concept why this works breakdown — the exact shape sql for data engineers rounds reward when pivot table in sql comes up.
When you want hands-on reps immediately after reading, browse pivoting SQL practice →, drill the unpivoting lane →, sharpen conditional aggregation problems →, rehearse aggregation SQL drills →, or widen coverage on the full SQL practice library →.
On this page
- Why PIVOT and UNPIVOT matter in data engineering interviews
- Long format vs wide format — the reshape mental model
- The portable SUM(CASE WHEN …) pivot — works on every dialect
- Native PIVOT syntax — SQL Server, Oracle, Snowflake
- UNPIVOT and the inverse reshape
- Dynamic pivots — when the column list is not known at write time
- PIVOT gotchas — NULL cells, non-unique keys, dynamic SQL hazards
- Choosing the right reshape (cheat sheet)
- Frequently asked questions
- Practice on PipeCode
1. Why PIVOT and UNPIVOT matter in data engineering interviews
Reshape rows ↔ columns — that's the whole interview surface in two sentences
The one-sentence invariant: sql pivot turns long-format rows (one row per (entity, metric) pair) into wide-format columns (one row per entity, one column per metric), with an aggregate populating each cell; sql unpivot does the inverse — one wide column per metric becomes one long row per metric. Once you internalise that, every prompt in the pivot table sql / unpivot in sql family becomes "pick the reshape direction and the aggregate."
The two operators at a glance. Together they answer most pivot table sql and pivot table in sql interview prompts.
-
PIVOT— long → wide. One row per group key, one new column per distinct value of the "pivot column", each cell populated by an aggregate (typicallySUMorCOUNT). -
UNPIVOT— wide → long. One wide table with N metric columns becomes a long table withN × row_countrows, each row carrying a metric name and its value. Thesql unpivotoperator is dialect-native on SQL Server, Oracle, Snowflake, BigQuery, and Databricks; the portableUNION ALLrewrite covers PostgreSQL and MySQL.
The two strategies for PIVOT you must know.
-
Native
PIVOToperator — built into SQL Server, Oracle, and Snowflake; concise syntax with aFOR … IN (…)list of values. Not available in PostgreSQL, MySQL, BigQuery (BigQuery added it in 2022 but it differs from SQL Server's syntax). -
Portable
SUM(CASE WHEN col = 'X' THEN value ELSE 0 END)idiom — works on every dialect; ties directly back to the conditional-aggregation pattern from Blog72; this is the default for PostgreSQL / MySQL.
Why interviewers love PIVOT / UNPIVOT.
- Every dashboard, BI export, and pivot table is a reshape — sales by region × month, headcount by team × quarter, conversions by source × week.
-
Tests two fluencies at once —
GROUP BYgrain awareness (which column is the row key?) and conditional logic (which column does each row contribute to?). -
Dialect divergence is substantial — native
PIVOTsyntax differs across SQL Server / Oracle / Snowflake / BigQuery; the portable idiom works everywhere. -
The
UNPIVOTdirection surfaces "long format is the data-engineer's friend" — most pipelines, BI tools, and warehouse models prefer long format because it scales gracefully when new categories appear.
What interviewers listen for.
- Do you name the row key and the column key before writing the query? — "the row key is
region; the column key isquarter; the cell value isSUM(amount)." - Do you reach for the portable
SUM(CASE WHEN …)idiom when the dialect is PostgreSQL or MySQL? — the answer that travels. - Do you mention
ELSE 0discipline to avoid the silent-NULL undercount from Blog72? — bonus points. - Do you flag the dynamic-pivot trap when the column list is unknown at write time? — senior signal (the answer involves dynamic SQL or downstream reshape in pandas).
Worked example — pivot quarterly revenue per region
Detailed explanation. When a stakeholder asks for "quarterly revenue per region in a spreadsheet shape", the natural output is one row per region with four columns — Q1, Q2, Q3, Q4. The input is long format (one row per (region, quarter, amount)); the output is wide format (one row per region, four metric columns).
Question. From sales(region, quarter, amount), return one row per region with four columns (q1, q2, q3, q4) holding total revenue per quarter.
Input. sales slice.
| region | quarter | amount |
|---|---|---|
| US | Q1 | 1200 |
| US | Q2 | 1500 |
| US | Q3 | 1700 |
| US | Q4 | 2000 |
| EU | Q1 | 800 |
| EU | Q2 | 950 |
| EU | Q3 | 1100 |
| EU | Q4 | 1300 |
Code (portable, works on every dialect).
SELECT region,
SUM(CASE WHEN quarter = 'Q1' THEN amount ELSE 0 END) AS q1,
SUM(CASE WHEN quarter = 'Q2' THEN amount ELSE 0 END) AS q2,
SUM(CASE WHEN quarter = 'Q3' THEN amount ELSE 0 END) AS q3,
SUM(CASE WHEN quarter = 'Q4' THEN amount ELSE 0 END) AS q4
FROM sales
GROUP BY region
ORDER BY region;
Step-by-step explanation.
-
FROM salesreads every detail row. -
GROUP BY regioncollapses to one row per region — the row key. - Each
SUM(CASE WHEN quarter = 'Qx' THEN amount ELSE 0 END)addsamountonly when the row matches that quarter, otherwise contributes0. -
ELSE 0(not omitted) keepsSUMreturning0instead ofNULLfor regions missing a quarter — the discipline from Blog72. - The output grain is one row per region with four metric columns — the wide shape stakeholders want.
Output.
| region | q1 | q2 | q3 | q4 |
|---|---|---|---|---|
| EU | 800 | 950 | 1100 | 1300 |
| US | 1200 | 1500 | 1700 | 2000 |
Rule of thumb: identify the row key first (region), then the column key (quarter) and its known values (Q1..Q4), then write one SUM(CASE WHEN …) per known value — that's the universal pivot template.
SQL
Topic — case-when
SUM(CASE WHEN …) pivot drills
SQL
Topic — conditional-aggregation
Conditional aggregation problems
2. Long format vs wide format — the reshape mental model
Two shapes for the same data — pick the one that matches your downstream
Before any pivot syntax, the mental model: the same dataset can be expressed as long format (narrow, many rows, fixed columns) or wide format (short, fewer rows, more columns). Knowing which shape you want — and which shape your downstream consumer wants — is half the battle in any reshape interview question.
Long format — one row per (entity, metric, value).
-
Shape — typically three or four columns:
entity_id,metric_name,metric_value, optionaltimestamp. -
Row count —
entities × metrics; grows linearly when new metrics or entities arrive. - Use it for — warehouse storage, ETL staging, event logs, dashboards built on Looker / Tableau / Metabase (which expect tidy long data).
-
Why data engineers prefer it — adding a new metric does not require schema changes; the new metric just becomes new rows with a new
metric_namevalue. - The default shape in OLTP and most warehouse fact tables.
Wide format — one row per entity, one column per metric.
-
Shape —
entity_idplus N metric columns; each column holds one specific metric's value. - Row count — equal to the number of distinct entities; columns equal the number of metrics.
- Use it for — spreadsheet exports, BI dashboards that expect a "cube" shape, reports formatted for non-technical readers.
- Why analysts and stakeholders prefer it — visually compact; "looks like a pivot table"; trivially cross-readable left-to-right.
- The output shape that PIVOT produces and that UNPIVOT consumes.
When to pivot (long → wide).
- The downstream consumer (Excel, a sheet, a static report) expects a pivot-table shape.
- The metric list is small, stable, and known in advance (
Q1/Q2/Q3/Q4; not "all sub-products since 2018"). - The number of pivot columns is bounded — wide tables with thousands of columns are anti-patterns.
When to unpivot (wide → long).
- Upstream system emits wide data (a CSV from finance with one column per month) and you need to load it into a long fact table.
- A model expects tidy data and the source is wide.
- Joining two wide tables on a
(entity, metric)grain — unpivot to long, join, then pivot back if needed.
Why long is the canonical storage shape for data engineering.
-
Adding a new metric is
INSERT … VALUES (id, 'new_metric', val)— no schema change. -
Adding a new entity is
INSERT— no schema change. - Schema stability — long tables typically have 3–6 columns and never change; wide tables grow unbounded.
-
Aggregations and joins are cleaner on long tables —
GROUP BY metric_namepartitions naturally.
The reshape arrows.
-
Long → Wide =
PIVOT(native operator) ORSUM(CASE WHEN …)(portable idiom). -
Wide → Long =
UNPIVOT(native operator) ORUNION ALLofSELECT entity, 'm1', m1 FROM t UNION ALL SELECT entity, 'm2', m2 FROM t …(portable idiom).
SQL
Topic — unpivoting
Unpivoting SQL drills
SQL
Topic — data-aggregation
Data aggregation library
3. The portable SUM(CASE WHEN …) pivot — works on every dialect
sql pivot without pivot — the universal pivot idiom every data engineer should know
The portable pivot pattern works on every SQL dialect ever shipped, predates the native PIVOT keyword by decades, and is the only option in PostgreSQL, MySQL, SQLite, and most cloud warehouses without a dedicated operator. It is the sql pivot example every interviewer expects you to know.
The anatomy of the portable pivot.
-
GROUP BY <row_key>— defines the output grain; one row per distinct value of the row key. -
One
SUM(CASE WHEN <col_key> = '<value>' THEN <metric> ELSE 0 END)per known column value — each populates one wide-format column. -
ELSE 0keeps the aggregate at0instead ofNULLfor missing combinations. -
COUNT(CASE WHEN cond THEN 1 END)for count-pivots (ELSEcan be omitted becauseCOUNTskipsNULL).
The template.
SELECT <row_key>,
SUM(CASE WHEN <col_key> = '<value_1>' THEN <metric> ELSE 0 END) AS <alias_1>,
SUM(CASE WHEN <col_key> = '<value_2>' THEN <metric> ELSE 0 END) AS <alias_2>,
…
FROM <table>
GROUP BY <row_key>
ORDER BY <row_key>;
Why the portable idiom is preferred even when PIVOT exists.
-
Portability — the same query works on Postgres, MySQL, SQL Server, Snowflake, BigQuery; native
PIVOTis a translation step. -
Multiple aggregates per pivot column — native
PIVOTtypically allows one aggregate per call;SUM(CASE WHEN …)lets you computeSUMandCOUNTin the same query. -
Custom predicates per cell —
WHEN amount > 100 AND quarter = 'Q1'is trivial inCASE; awkward in nativePIVOT. -
No
IN (…)list constraint — nativePIVOTrequires the value list inline;CASE WHENaccepts any expression.
ELSE 0 discipline — the Blog72 callback.
-
Without
ELSE 0—SUMreturnsNULLfor groups where no row matches theWHENcondition; downstream arithmetic propagates theNULLand undercounts. -
With
ELSE 0— the missing-data case is explicitly zero; pivoted dashboards render0, not blank. -
Senior signal — stating "I use
ELSE 0to avoid the silent-NULL undercount" is the pattern from Blog72 applied to pivot cells.
Multiple metrics in one pivot query.
- Combine
SUMandCOUNTin the same row by adding moreCASE WHENcolumns. - Example — per region, both total revenue per quarter AND order count per quarter:
SELECT region,
SUM(CASE WHEN quarter = 'Q1' THEN amount ELSE 0 END) AS q1_rev,
SUM(CASE WHEN quarter = 'Q2' THEN amount ELSE 0 END) AS q2_rev,
COUNT(CASE WHEN quarter = 'Q1' THEN 1 END) AS q1_orders,
COUNT(CASE WHEN quarter = 'Q2' THEN 1 END) AS q2_orders
FROM sales
GROUP BY region;
Filtering inside the cell with FILTER (Postgres 9.4+).
-
Postgres-specific cleaner syntax —
SUM(amount) FILTER (WHERE quarter = 'Q1') AS q1_rev. -
Equivalent to
SUM(CASE WHEN quarter = 'Q1' THEN amount END). - More readable for one-off pivots; not portable to MySQL / SQL Server.
SQL interview question — paid vs refunded revenue per region in a wide layout
Assume orders(region, status, amount) with status ∈ {'paid', 'refund', 'pending'}. Return one row per region with three columns — paid_rev, refund_amount, pending_amount — totalled across all orders.
Solution Using SUM(CASE WHEN …)
Detailed explanation. This is a textbook conditional-aggregation pivot from Blog72 applied to a region × status grain. The row key is region; the column key is status with three known values; the cell metric is SUM(amount). The ELSE 0 keeps regions with no refunds rendering 0, not NULL.
Code (PostgreSQL / works on every dialect).
SELECT region,
SUM(CASE WHEN status = 'paid' THEN amount ELSE 0 END) AS paid_rev,
SUM(CASE WHEN status = 'refund' THEN amount ELSE 0 END) AS refund_amount,
SUM(CASE WHEN status = 'pending' THEN amount ELSE 0 END) AS pending_amount
FROM orders
GROUP BY region
ORDER BY paid_rev DESC;
Equivalent using Postgres FILTER clause.
SELECT region,
SUM(amount) FILTER (WHERE status = 'paid') AS paid_rev,
SUM(amount) FILTER (WHERE status = 'refund') AS refund_amount,
SUM(amount) FILTER (WHERE status = 'pending') AS pending_amount
FROM orders
GROUP BY region
ORDER BY paid_rev DESC;
Step-by-step trace.
| step | relation | outcome |
|---|---|---|
| 1 | Scan orders
|
every order row read |
| 2 | GROUP BY region |
hash aggregate; one row per region |
| 3 | Three SUM(CASE WHEN …) per group |
three cell totals per row |
| 4 |
ELSE 0 ensures missing-status cells become 0 |
no NULL holes |
| 5 | ORDER BY paid_rev DESC |
top-paying regions first |
Output:
| region | paid_rev | refund_amount | pending_amount |
|---|---|---|---|
| US | 11,200 | 1,300 | 0 |
| EU | 7,900 | 850 | 200 |
| APAC | 5,100 | 300 | 0 |
Why this works — concept by concept:
- GROUP BY region defines the grain — one row per region in the output.
- Three CASE WHEN columns map the three statuses — each populates one wide-format column.
- ELSE 0 keeps cells zero, not NULL — the Blog72 silent-undercount defence applied to pivot cells.
-
FILTER syntax is the Postgres-clean equivalent — drops the
CASEceremony for the same semantics. -
One pass through the data — the planner reads
ordersonce and evaluates all three aggregates in the same hash node. -
Cost —
Θ(n)scan +Θ(n)hash aggregate keyed onregion; the three aggregate calls are a constant per-row factor.
SQL
Topic — conditional-aggregation
Pivot via conditional aggregation
SQL
Topic — case-when
CASE WHEN SQL library
4. Native PIVOT syntax — SQL Server, Oracle, Snowflake
pivot in sql — three dialects with a dedicated keyword
SQL Server, Oracle, and Snowflake ship a native PIVOT operator. It is syntactic sugar over the same SUM(CASE WHEN …) semantics — the planner often produces identical execution plans — but the syntax is concise enough that interviewers like to test whether you can read it. PostgreSQL and MySQL do not have native PIVOT; the portable idiom from §3 is mandatory there.
SQL Server PIVOT — the canonical shape.
SELECT region, [Q1], [Q2], [Q3], [Q4]
FROM (
SELECT region, quarter, amount FROM sales
) src
PIVOT (
SUM(amount)
FOR quarter IN ([Q1], [Q2], [Q3], [Q4])
) AS pvt
ORDER BY region;
-
PIVOT (aggregate FOR column IN (value_list))— one aggregate, one column to pivot, an explicit list of values. -
Bracketed identifiers —
[Q1]is the SQL Server way to quote column names with special characters or reserved words. -
The derived table
srcselects only the columns participating —region(row key),quarter(pivot column),amount(cell value). Any extra columns become extra row-key components.
Oracle PIVOT — same idea, slight syntax tweak.
SELECT * FROM (
SELECT region, quarter, amount FROM sales
)
PIVOT (
SUM(amount)
FOR quarter IN ('Q1' AS q1, 'Q2' AS q2, 'Q3' AS q3, 'Q4' AS q4)
)
ORDER BY region;
-
AS aliasclauses produce nicer column names than'Q1'. - String literals for the value list (Oracle uses single quotes).
-
Same
FOR … IN (…)shape; same one-aggregate-per-call rule.
Snowflake PIVOT — close to SQL Server.
SELECT * FROM (
SELECT region, quarter, amount FROM sales
) src
PIVOT (
SUM(amount)
FOR quarter IN ('Q1', 'Q2', 'Q3', 'Q4')
) AS pvt
ORDER BY region;
- String literals for the value list (no brackets).
-
PIVOT (... FOR ... IN (... ANY))since 2023 — dynamic value list (computed at query time). -
One aggregate per
PIVOT— same constraint as SQL Server / Oracle.
BigQuery PIVOT — closest to the standard.
SELECT * FROM (
SELECT region, quarter, amount FROM sales
)
PIVOT (
SUM(amount)
FOR quarter IN ('Q1', 'Q2', 'Q3', 'Q4')
);
-
No alias required after the
PIVOT (...)clause. - String literals, same FOR / IN shape.
The constraints native PIVOT shares across dialects.
-
One aggregate per
PIVOTcall — to compute bothSUMandCOUNT, run two queries and join them, or use the portableSUM(CASE WHEN …)idiom. -
Values must be enumerated at write time — dynamic value lists require dynamic SQL or warehouse-specific extensions (Snowflake's
ANY). -
Implicit
GROUP BY— every column from the derived table that is not the pivot column or the aggregated metric becomes a row-key column. -
The implicit-grouping trap — if your derived table accidentally projects an extra column (e.g.
order_id), each row oforder_idbecomes its own row key and the pivot collapses to one row per(region, order_id)instead of perregion.
Dialect support matrix.
| Dialect | Native PIVOT
|
Native UNPIVOT
|
Portable SUM(CASE WHEN …)
|
|---|---|---|---|
| PostgreSQL | No | No | Yes — SUM(amount) FILTER (WHERE …) is the clean form |
| MySQL | No | No | Yes |
| SQL Server | Yes (since 2005) | Yes | Yes |
| Oracle | Yes (since 11g) | Yes | Yes |
| Snowflake | Yes | Yes | Yes |
| BigQuery | Yes (since 2022) | Yes | Yes |
| Databricks | Yes | Yes | Yes |
SQL
Topic — case-when
PIVOT syntax SQL drills
SQL
Topic — aggregation
Aggregation + PIVOT patterns
5. UNPIVOT and the inverse reshape
unpivot in sql — wide back to long for warehouse loading
sql unpivot is the inverse of sql pivot: a wide table with N metric columns becomes a long table with N × row_count rows. Every cell becomes a row, with the source column name as a new column and the cell value as another. The shape is exactly what warehouse fact tables and tidy-data dashboards consume — and unpivot in sql shows up in interviews as the "wide-to-long" reshape twin of the pivot prompt.
Native UNPIVOT — SQL Server, Oracle, Snowflake.
SELECT region, quarter, amount
FROM (
SELECT region, q1, q2, q3, q4 FROM quarterly_sales
) src
UNPIVOT (
amount FOR quarter IN (q1, q2, q3, q4)
) AS unpvt;
-
UNPIVOT (value_col FOR name_col IN (col_list))—value_colholds the cell values;name_colholds the source column names. - One row per source cell — four source columns × two regions = eight rows.
-
NULL cells are dropped by default — if
q3 IS NULLfor some region, that row does not appear.UNPIVOT … INCLUDE NULLS(some dialects) keeps them.
Portable UNPIVOT via UNION ALL — every dialect.
SELECT region, 'Q1' AS quarter, q1 AS amount FROM quarterly_sales
UNION ALL
SELECT region, 'Q2', q2 FROM quarterly_sales
UNION ALL
SELECT region, 'Q3', q3 FROM quarterly_sales
UNION ALL
SELECT region, 'Q4', q4 FROM quarterly_sales;
-
One
SELECTper source column — each projects the same shape(region, quarter_literal, value_column). -
UNION ALLstitches them vertically; neverUNION(don't pay the dedup cost). -
NULL handling is explicit — add
WHERE q1 IS NOT NULLper leg if you want to drop NULL cells, or omit to keep them.
PostgreSQL LATERAL + VALUES — the elegant Postgres pattern.
SELECT region, quarter, amount
FROM quarterly_sales,
LATERAL (VALUES
('Q1', q1),
('Q2', q2),
('Q3', q3),
('Q4', q4)
) AS t(quarter, amount);
-
LATERALlets the inlineVALUESreference outer columns (q1,q2, etc.). -
More compact than four
UNION ALLlegs. - PostgreSQL-only — not portable to MySQL or SQL Server, but very clean when available.
jsonb_each + to_jsonb — when columns are dynamic in Postgres.
SELECT region, kv.key AS quarter, (kv.value)::numeric AS amount
FROM quarterly_sales,
LATERAL jsonb_each(to_jsonb(quarterly_sales) - 'region') kv;
-
to_jsonb(row) - 'region'turns the row into a JSON object and drops the row key. -
jsonb_eachexpands the object into key-value rows. - Use case — unpivoting tables with unknown or wildly varying wide columns; trades type safety for flexibility.
Common sql unpivot use cases — when unpivot in sql is the right reshape.
-
Loading wide CSVs into a long fact table — finance dumps a sheet with one column per month; the warehouse ETL
unpivots it before insertion. - Reshaping for Looker / Tableau / Metabase — these tools strongly prefer tidy long data.
-
Computing aggregates across metrics —
MAX(amount)across all quarters per region is trivial once the data is long.
SQL interview question — unpivot wide quarterly_sales into long
Assume quarterly_sales(region, q1, q2, q3, q4). Return a long-format result with columns (region, quarter, amount) — one row per (region, quarter) cell.
Solution Using UNION ALL (portable)
Detailed explanation. The portable unpivot pattern produces one SELECT per source column, then stacks them with UNION ALL. Use UNION ALL (not UNION) because every row is unique by construction — no dedup pass needed.
Code (PostgreSQL / works on every dialect).
SELECT region, 'Q1' AS quarter, q1 AS amount FROM quarterly_sales
UNION ALL
SELECT region, 'Q2', q2 FROM quarterly_sales
UNION ALL
SELECT region, 'Q3', q3 FROM quarterly_sales
UNION ALL
SELECT region, 'Q4', q4 FROM quarterly_sales
ORDER BY region, quarter;
Equivalent in SQL Server (native UNPIVOT).
SELECT region, quarter, amount
FROM (
SELECT region, q1, q2, q3, q4 FROM quarterly_sales
) src
UNPIVOT (
amount FOR quarter IN (q1, q2, q3, q4)
) AS unpvt
ORDER BY region, quarter;
Equivalent in Postgres using LATERAL VALUES.
SELECT region, quarter, amount
FROM quarterly_sales,
LATERAL (VALUES
('Q1', q1), ('Q2', q2), ('Q3', q3), ('Q4', q4)
) AS t(quarter, amount)
ORDER BY region, quarter;
Step-by-step trace.
| step | relation | outcome |
|---|---|---|
| 1 | Four SELECTs each project (region, 'Qx', qx)
|
four streams of equal shape |
| 2 |
UNION ALL stacks them vertically |
one combined stream of N × 4 rows |
| 3 | Each row carries the source column name and the cell value | tidy long shape |
| 4 | ORDER BY region, quarter |
sorted final output |
Output:
| region | quarter | amount |
|---|---|---|
| EU | Q1 | 800 |
| EU | Q2 | 950 |
| EU | Q3 | 1100 |
| EU | Q4 | 1300 |
| US | Q1 | 1200 |
| US | Q2 | 1500 |
| US | Q3 | 1700 |
| US | Q4 | 2000 |
Why this works — concept by concept:
- UNION ALL stacks the four projections — every row is unique by construction; no dedup needed.
-
String literal in each SELECT —
'Q1' AS quarterinjects the source column name as data; this is the unpivot trick. -
The cell value column comes from the corresponding wide column —
q1,q2,q3,q4each populate the same output column. - Native UNPIVOT is syntactic sugar — produces the same plan on every engine that supports it.
-
LATERAL VALUES is the Postgres-clean variant — one row per source row times the number of
VALUEStuples. -
Cost — four scans of
quarterly_sales(the portable form) or one scan (nativeUNPIVOT/LATERAL); the native forms are cheaper at scale.
SQL
Topic — unpivoting
UNPIVOT SQL drills
SQL
Topic — set-operations
UNION ALL + UNPIVOT patterns
6. Dynamic pivots — when the column list is not known at write time
The dynamic-pivot problem — and why it's a senior interview signal
The portable and native pivot patterns both require the column list to be enumerated at write time: Q1 / Q2 / Q3 / Q4 is fine, but product_a / product_b / … / product_2000_new_skus is not. When the pivot values are unknown at write time, you need either dynamic SQL, a warehouse extension (Snowflake's PIVOT (… IN ANY)), or a downstream reshape in pandas / Spark.
The fundamental constraint.
-
Relational SQL is schema-bound —
SELECTprojects a fixed number of named columns; you cannot project N columns where N depends on data values. -
Native
PIVOT (… IN (?, ?, …))still needs the value list at parse time. -
SUM(CASE WHEN col = 'X' THEN val ELSE 0 END)still needs'X'as a literal at write time.
Three approaches when the value list is dynamic.
Approach 1 — dynamic SQL (the classic workaround).
-
Step 1 —
SELECT DISTINCT pivot_col FROM sourceto discover the value list. -
Step 2 — concatenate the value list into a SQL string with one
SUM(CASE WHEN …)per value. -
Step 3 —
EXECUTE(SQL Server'sEXEC sp_executesql, PostgresEXECUTE, OracleEXECUTE IMMEDIATE). - Cost — two database round-trips; SQL string concatenation is hand-rolled; injection risk if user-supplied data is mixed in unsafely.
-- SQL Server example
DECLARE @cols NVARCHAR(MAX), @sql NVARCHAR(MAX);
SELECT @cols = STRING_AGG(QUOTENAME(quarter), ',')
FROM (SELECT DISTINCT quarter FROM sales) q;
SET @sql = N'
SELECT region, ' + @cols + N'
FROM (SELECT region, quarter, amount FROM sales) src
PIVOT (SUM(amount) FOR quarter IN (' + @cols + N')) AS pvt;';
EXEC sp_executesql @sql;
Approach 2 — warehouse-native dynamic pivot (Snowflake's ANY).
-
Snowflake since 2023 —
PIVOT (SUM(amount) FOR quarter IN (ANY))discovers the value list automatically. -
BigQuery — limited dynamic-pivot support via
EXECUTE IMMEDIATEfor the same DDL-string approach. - Cleanest option when available; no string concatenation.
Approach 3 — reshape downstream in pandas / Spark / dbt.
- Pull long-format data from SQL.
-
Reshape in the application layer — pandas
df.pivot_table(index='region', columns='quarter', values='amount', aggfunc='sum'). - Often the right call for production pipelines — keeps SQL simple and pushes dynamic-schema work to a tool designed for it.
When dynamic pivot is the wrong question.
- The value list is "ever-growing" — pivoting on product SKUs, user IDs, etc. is almost always wrong; you'd produce a table with thousands of columns.
- Re-evaluate — is the question really "give me a wide layout" or is it "give me a long table I can render as a pivot in BI"? — usually the latter.
- Senior signal — push back on the requirement when the long shape is what the consumer actually needs.
The interview answer.
- State the constraint clearly — "SQL needs to know the column list at write time."
- Offer three paths — dynamic SQL, warehouse-native dynamic pivot, or downstream reshape.
- Pick the right one for the context — reach for downstream reshape unless the schema is tightly bounded.
- Avoid the trap — never agree to a "thousand-column pivot"; flag the schema-design problem.
SQL
Topic — conditional-aggregation
Dynamic pivot SQL drills
SQL
Topic — data-aggregation
Data aggregation library
7. PIVOT gotchas — NULL cells, non-unique keys, dynamic SQL hazards
The seven bugs interviewers test most often on PIVOT and UNPIVOT
Reshape operators have a small surface area but a long tail of edge cases that fail candidates. These are the seven gotchas reviewers test most often when pivot table in sql comes up in an interview.
Gotcha 1 — SUM returns NULL for missing cells without ELSE 0.
-
The bug —
SUM(CASE WHEN quarter = 'Q3' THEN amount END)(noELSE) returnsNULLwhen the region has no Q3 orders. -
Symptom — dashboards render blank /
—instead of0; downstream arithmetic propagatesNULL. -
Fix — add
ELSE 0in every cell aggregate, or wrap withCOALESCE(…, 0).
Gotcha 2 — Implicit GROUP BY in native PIVOT includes every projected column.
-
The bug — accidentally projecting
order_idinto the derived table makes every row its own group; the pivot collapses to one row per(region, order_id)instead of per region. - Symptom — wide table with millions of rows of one-row-per-order, each with mostly empty cells.
- Fix — project only the row-key columns, the pivot column, and the metric in the derived table.
Gotcha 3 — Pivot value list must match exactly (case + whitespace).
-
The bug —
FOR quarter IN ('q1', 'q2', 'q3', 'q4')when the data contains'Q1','Q2','Q3','Q4'. -
Symptom — every cell is
NULLbecause no row matches. -
Fix — normalise upstream (
LOWER(quarter)orUPPER(quarter)) or list both casings.
Gotcha 4 — UNPIVOT drops NULL cells silently.
-
The behaviour — native
UNPIVOTexcludes rows where the cell isNULL; this is often desired but sometimes surprising. -
Symptom —
UNPIVOToutput has fewer rows than expected when the wide table has missing values. -
Fix —
UNPIVOT … INCLUDE NULLSif your dialect supports it, or use the portableUNION ALLform with explicitWHEREcontrol.
Gotcha 5 — Non-unique row keys produce unexpected aggregates.
-
The bug — pivoting on a row key that is not unique in the source data (e.g.
regionbut the same region appears multiple times per quarter) silently aggregates. - Symptom — cell values are larger than expected because multiple source rows feed each cell.
- Fix — confirm the source grain before pivoting; pre-aggregate to the row-key × pivot-column grain inside a CTE.
Gotcha 6 — Native PIVOT allows only one aggregate per call.
-
The constraint —
PIVOT (SUM(amount) FOR …)andPIVOT (COUNT(*) FOR …)cannot coexist in a singlePIVOTclause. -
Workaround — run two
PIVOTs and join on the row key, OR use the portableSUM(CASE WHEN …)+COUNT(CASE WHEN …)form in one query. - Performance — the portable form is often faster because it scans the source table once.
Gotcha 7 — Dynamic SQL pivots introduce injection risk.
- The bug — concatenating user-supplied data into the value list string opens a SQL injection vector.
- Symptom — security audit failure; potential data exfiltration.
-
Fix — use the dialect's safe quoting function (
QUOTENAMEin SQL Server,quote_literalin Postgres) to escape values; or restrict the discovery query to a known-safe source.
SQL
Topic — case-when
PIVOT gotcha drills
SQL
Topic — unpivoting
UNPIVOT gotcha drills
Choosing the right reshape (cheat sheet)
A one-screen cheat sheet for using SQL PIVOT and UNPIVOT — pick the direction, then the dialect-appropriate idiom.
| You want to … | Direction | Portable idiom | Native (when available) |
|---|---|---|---|
| Long → Wide with one aggregate | PIVOT | SUM(CASE WHEN col = 'X' THEN v ELSE 0 END) |
PIVOT (SUM(v) FOR col IN ('X', …)) |
| Long → Wide with multiple aggregates | PIVOT | One SUM(CASE WHEN …) per metric per value |
Two native PIVOTs joined on the row key |
| Long → Wide with Postgres clean syntax | PIVOT | SUM(v) FILTER (WHERE col = 'X') |
n/a (Postgres has no native PIVOT) |
| Wide → Long | UNPIVOT |
SELECT … UNION ALL SELECT … per source column |
UNPIVOT (val FOR name IN (col1, col2, …)) |
| Wide → Long in Postgres (clean) | UNPIVOT | LATERAL (VALUES ('X', col1), ('Y', col2)) AS t(name, val) |
n/a |
| Dynamic column list | PIVOT | Dynamic SQL + string-concatenated CASE WHENs |
PIVOT (… IN ANY) (Snowflake 2023+) |
| Excel-style pivot table | PIVOT |
SUM(CASE WHEN …) with a row key and column key |
dialect-native PIVOT |
| Reshape for warehouse load | UNPIVOT |
UNION ALL of per-column projections |
dialect-native UNPIVOT |
| Reshape for Looker / Tableau | UNPIVOT |
UNION ALL to long format |
tidy data favoured by BI tools |
| Count of rows per cell | PIVOT | COUNT(CASE WHEN col = 'X' THEN 1 END) |
PIVOT (COUNT(*) FOR col IN (…)) |
| Both SUM and COUNT per cell | PIVOT | One query with two CASE-aggregate columns per value | Two PIVOTs joined |
Frequently asked questions
What is PIVOT in SQL?
PIVOT is a reshape operator that turns long-format rows into wide-format columns. The input is typically a table with three logical columns — a row key (which becomes the rows in the output), a column key (whose distinct values become the new columns), and a metric (whose aggregate populates each cell). For example, sales(region, quarter, amount) can be pivoted on quarter to produce one row per region with four columns (Q1, Q2, Q3, Q4) holding SUM(amount). SQL Server, Oracle, Snowflake, BigQuery, and Databricks have native PIVOT operators with a FOR … IN (…) syntax; PostgreSQL and MySQL have no native PIVOT and rely on the portable SUM(CASE WHEN col = 'X' THEN val ELSE 0 END) idiom inside a GROUP BY query.
How do I pivot rows to columns in PostgreSQL or MySQL?
PostgreSQL and MySQL have no native PIVOT keyword — use the portable SUM(CASE WHEN …) idiom. The template is SELECT row_key, SUM(CASE WHEN col_key = 'value_1' THEN metric ELSE 0 END) AS alias_1, … FROM source GROUP BY row_key. Add one CASE WHEN column per distinct pivot value, with ELSE 0 to keep missing cells as 0 instead of NULL (Blog72 silent-undercount defence). PostgreSQL 9.4+ also supports the cleaner SUM(amount) FILTER (WHERE quarter = 'Q1') syntax, which produces the same result without the CASE ceremony. The idiom is the only portable shape, predates the PIVOT keyword by decades, and is the right answer for any cross-engine codebase.
What's the difference between PIVOT and UNPIVOT?
PIVOT goes long → wide: one row per (entity, metric, value) triple becomes one row per entity with one column per distinct metric. UNPIVOT goes wide → long: one row per entity with N metric columns becomes N × row_count long rows, each carrying a metric name and its value. The two are inverses: PIVOT is what BI tools and stakeholders want for visual reports; UNPIVOT is what warehouse ETL pipelines want for tidy storage. Native UNPIVOT exists on SQL Server, Oracle, Snowflake, BigQuery, and Databricks; the portable form is a UNION ALL of one SELECT entity, 'metric_name' AS m, metric_value FROM source per source column. PostgreSQL has a particularly clean LATERAL (VALUES …) shape that beats both the native and portable forms for readability.
Can I pivot on a column with unknown values at write time?
Not directly with relational SQL — the PIVOT operator and the SUM(CASE WHEN …) idiom both require the value list to be enumerated at parse time. Three workarounds: (1) dynamic SQL — discover the distinct values, concatenate them into a SQL string, and EXECUTE the resulting query; verbose and injection-risky; (2) warehouse-native dynamic pivot — Snowflake's PIVOT (… IN ANY) discovers the value list at query time; the cleanest option when available; (3) reshape downstream — keep SQL queries simple and long-format, then pivot in pandas / Spark / dbt; usually the right production answer. Be wary of "ever-growing" pivot column lists (product SKUs, user IDs) — they signal a schema-design problem; the long shape is almost always what the consumer actually needs, even if they asked for wide.
Why use SUM(CASE WHEN …) instead of native PIVOT when both are available?
Three reasons. Portability — the same query works on Postgres, MySQL, SQL Server, Snowflake, BigQuery; native PIVOT is a per-dialect translation step. Multiple aggregates per pivot column — native PIVOT typically allows one aggregate per call, so computing both SUM and COUNT per cell requires two PIVOTs and a join; SUM(CASE WHEN …) + COUNT(CASE WHEN …) does it in one query with one scan. Custom predicates per cell — WHEN amount > 100 AND quarter = 'Q1' is trivial in CASE; awkward in native PIVOT's IN (…) list. The portable idiom is preferred even on dialects with native PIVOT for any non-trivial pivot; native PIVOT is best reserved for read-only ad-hoc reports where the syntax brevity matters.
Practice on PipeCode
PipeCode ships 450+ data-engineering interview problems — including PostgreSQL-first SQL practice keyed to PIVOT / UNPIVOT, conditional aggregation with SUM(CASE WHEN …), the FILTER clause, set-operations for UNPIVOT via UNION ALL, and the dialect quirks that fail candidates who memorise only one engine. Whether you're drilling sql interview questions with answers for sql for data engineers loops or grinding through data engineering interview questions end-to-end, the practice library mirrors the same long-vs-wide mental model this guide teaches.
Kick off via Explore practice →; drill the CASE WHEN SQL lane →; fan out into the unpivoting lane →; rehearse conditional aggregation problems →; reinforce aggregation SQL drills →; widen coverage on the full SQL practice library →.





Top comments (0)