DEV Community

Cover image for SQL PIVOT and UNPIVOT: Reshape Rows ↔ Columns for Analytics
Gowtham Potureddi
Gowtham Potureddi

Posted on

SQL PIVOT and UNPIVOT: Reshape Rows ↔ Columns for Analytics

sql pivot turns long-format rows into wide-format columns: one row per category becomes one column per category, with the aggregate value populating each cell. sql unpivot is the inverse — one column per metric becomes one row per metric, restoring the long shape that downstream pipelines and BI tools prefer. These two reshape operators answer the bulk of the sql interview questions in the "make this data wide" / "make this data long" cluster, and getting pivot in sql right — especially the dialect divergence between SQL Server's native PIVOT operator and PostgreSQL's SUM(CASE WHEN …) idiom — separates candidates who pass screening from candidates who don't.

This guide walks through every clause in the sql pivot unpivot family that reviewers love to test in data engineering interview questions: the long-vs-wide mental model, the native PIVOT / UNPIVOT syntax in SQL Server / Oracle / Snowflake, the portable SUM(CASE WHEN …) idiom that works on every dialect (PostgreSQL, MySQL, every flavour without a PIVOT keyword), how to handle dynamic column lists when the pivot values are unknown at query time, the inverse UNPIVOT and its portable UNION ALL rewrite, and the seven gotchas (NULL cells, dynamic SQL, non-unique pivot keys) that fail most candidates. Every section ends as sql interview questions with answers: a runnable PostgreSQL query, a traced execution, an output table, and a concept-by-concept why this works breakdown — the exact shape sql for data engineers rounds reward when pivot table in sql comes up.

PipeCode blog header for a SQL PIVOT and UNPIVOT tutorial — bold white headline 'PIVOT ⇄ UNPIVOT' with subtitle 'reshape rows ↔ columns for analytics' and a minimal long-vs-wide table diagram on a dark gradient with purple, green, and orange accents and a small pipecode.ai attribution.

When you want hands-on reps immediately after reading, browse pivoting SQL practice →, drill the unpivoting lane →, sharpen conditional aggregation problems →, rehearse aggregation SQL drills →, or widen coverage on the full SQL practice library →.


On this page


1. Why PIVOT and UNPIVOT matter in data engineering interviews

Reshape rows ↔ columns — that's the whole interview surface in two sentences

The one-sentence invariant: sql pivot turns long-format rows (one row per (entity, metric) pair) into wide-format columns (one row per entity, one column per metric), with an aggregate populating each cell; sql unpivot does the inverse — one wide column per metric becomes one long row per metric. Once you internalise that, every prompt in the pivot table sql / unpivot in sql family becomes "pick the reshape direction and the aggregate."

The two operators at a glance. Together they answer most pivot table sql and pivot table in sql interview prompts.

  • PIVOT — long → wide. One row per group key, one new column per distinct value of the "pivot column", each cell populated by an aggregate (typically SUM or COUNT).
  • UNPIVOT — wide → long. One wide table with N metric columns becomes a long table with N × row_count rows, each row carrying a metric name and its value. The sql unpivot operator is dialect-native on SQL Server, Oracle, Snowflake, BigQuery, and Databricks; the portable UNION ALL rewrite covers PostgreSQL and MySQL.

The two strategies for PIVOT you must know.

  • Native PIVOT operator — built into SQL Server, Oracle, and Snowflake; concise syntax with a FOR … IN (…) list of values. Not available in PostgreSQL, MySQL, BigQuery (BigQuery added it in 2022 but it differs from SQL Server's syntax).
  • Portable SUM(CASE WHEN col = 'X' THEN value ELSE 0 END) idiom — works on every dialect; ties directly back to the conditional-aggregation pattern from Blog72; this is the default for PostgreSQL / MySQL.

Why interviewers love PIVOT / UNPIVOT.

  • Every dashboard, BI export, and pivot table is a reshape — sales by region × month, headcount by team × quarter, conversions by source × week.
  • Tests two fluencies at onceGROUP BY grain awareness (which column is the row key?) and conditional logic (which column does each row contribute to?).
  • Dialect divergence is substantial — native PIVOT syntax differs across SQL Server / Oracle / Snowflake / BigQuery; the portable idiom works everywhere.
  • The UNPIVOT direction surfaces "long format is the data-engineer's friend" — most pipelines, BI tools, and warehouse models prefer long format because it scales gracefully when new categories appear.

What interviewers listen for.

  • Do you name the row key and the column key before writing the query? — "the row key is region; the column key is quarter; the cell value is SUM(amount)."
  • Do you reach for the portable SUM(CASE WHEN …) idiom when the dialect is PostgreSQL or MySQL? — the answer that travels.
  • Do you mention ELSE 0 discipline to avoid the silent-NULL undercount from Blog72? — bonus points.
  • Do you flag the dynamic-pivot trap when the column list is unknown at write time? — senior signal (the answer involves dynamic SQL or downstream reshape in pandas).

Worked example — pivot quarterly revenue per region

Detailed explanation. When a stakeholder asks for "quarterly revenue per region in a spreadsheet shape", the natural output is one row per region with four columns — Q1, Q2, Q3, Q4. The input is long format (one row per (region, quarter, amount)); the output is wide format (one row per region, four metric columns).

Question. From sales(region, quarter, amount), return one row per region with four columns (q1, q2, q3, q4) holding total revenue per quarter.

Input. sales slice.

region quarter amount
US Q1 1200
US Q2 1500
US Q3 1700
US Q4 2000
EU Q1 800
EU Q2 950
EU Q3 1100
EU Q4 1300

Code (portable, works on every dialect).

SELECT region,
       SUM(CASE WHEN quarter = 'Q1' THEN amount ELSE 0 END) AS q1,
       SUM(CASE WHEN quarter = 'Q2' THEN amount ELSE 0 END) AS q2,
       SUM(CASE WHEN quarter = 'Q3' THEN amount ELSE 0 END) AS q3,
       SUM(CASE WHEN quarter = 'Q4' THEN amount ELSE 0 END) AS q4
FROM sales
GROUP BY region
ORDER BY region;
Enter fullscreen mode Exit fullscreen mode

Step-by-step explanation.

  1. FROM sales reads every detail row.
  2. GROUP BY region collapses to one row per region — the row key.
  3. Each SUM(CASE WHEN quarter = 'Qx' THEN amount ELSE 0 END) adds amount only when the row matches that quarter, otherwise contributes 0.
  4. ELSE 0 (not omitted) keeps SUM returning 0 instead of NULL for regions missing a quarter — the discipline from Blog72.
  5. The output grain is one row per region with four metric columns — the wide shape stakeholders want.

Output.

region q1 q2 q3 q4
EU 800 950 1100 1300
US 1200 1500 1700 2000

Rule of thumb: identify the row key first (region), then the column key (quarter) and its known values (Q1..Q4), then write one SUM(CASE WHEN …) per known value — that's the universal pivot template.

SQL
Topic — case-when
SUM(CASE WHEN …) pivot drills

Practice →

SQL
Topic — conditional-aggregation
Conditional aggregation problems

Practice →


2. Long format vs wide format — the reshape mental model

Diagram of long format vs wide format — left card shows a long table with three columns (region, quarter, amount) and eight rows; a purple PIVOT arrow points to the right card which shows a wide table with five columns (region, q1, q2, q3, q4) and two rows; a green UNPIVOT arrow points back from wide to long, on a light PipeCode card.

Two shapes for the same data — pick the one that matches your downstream

Before any pivot syntax, the mental model: the same dataset can be expressed as long format (narrow, many rows, fixed columns) or wide format (short, fewer rows, more columns). Knowing which shape you want — and which shape your downstream consumer wants — is half the battle in any reshape interview question.

Long format — one row per (entity, metric, value).

  • Shape — typically three or four columns: entity_id, metric_name, metric_value, optional timestamp.
  • Row countentities × metrics; grows linearly when new metrics or entities arrive.
  • Use it for — warehouse storage, ETL staging, event logs, dashboards built on Looker / Tableau / Metabase (which expect tidy long data).
  • Why data engineers prefer it — adding a new metric does not require schema changes; the new metric just becomes new rows with a new metric_name value.
  • The default shape in OLTP and most warehouse fact tables.

Wide format — one row per entity, one column per metric.

  • Shapeentity_id plus N metric columns; each column holds one specific metric's value.
  • Row count — equal to the number of distinct entities; columns equal the number of metrics.
  • Use it for — spreadsheet exports, BI dashboards that expect a "cube" shape, reports formatted for non-technical readers.
  • Why analysts and stakeholders prefer it — visually compact; "looks like a pivot table"; trivially cross-readable left-to-right.
  • The output shape that PIVOT produces and that UNPIVOT consumes.

When to pivot (long → wide).

  • The downstream consumer (Excel, a sheet, a static report) expects a pivot-table shape.
  • The metric list is small, stable, and known in advance (Q1 / Q2 / Q3 / Q4; not "all sub-products since 2018").
  • The number of pivot columns is bounded — wide tables with thousands of columns are anti-patterns.

When to unpivot (wide → long).

  • Upstream system emits wide data (a CSV from finance with one column per month) and you need to load it into a long fact table.
  • A model expects tidy data and the source is wide.
  • Joining two wide tables on a (entity, metric) grain — unpivot to long, join, then pivot back if needed.

Why long is the canonical storage shape for data engineering.

  • Adding a new metric is INSERT … VALUES (id, 'new_metric', val) — no schema change.
  • Adding a new entity is INSERT — no schema change.
  • Schema stability — long tables typically have 3–6 columns and never change; wide tables grow unbounded.
  • Aggregations and joins are cleaner on long tables — GROUP BY metric_name partitions naturally.

The reshape arrows.

  • Long → Wide = PIVOT (native operator) OR SUM(CASE WHEN …) (portable idiom).
  • Wide → Long = UNPIVOT (native operator) OR UNION ALL of SELECT entity, 'm1', m1 FROM t UNION ALL SELECT entity, 'm2', m2 FROM t … (portable idiom).

SQL
Topic — unpivoting
Unpivoting SQL drills

Practice →

SQL
Topic — data-aggregation
Data aggregation library

Practice →


3. The portable SUM(CASE WHEN …) pivot — works on every dialect

Diagram of the SUM(CASE WHEN …) pivot anatomy — a single long-format row with columns (region, quarter, amount) on the left, a purple branching arrow into four CASE WHEN expressions in the middle, and the wide-format row on the right with four metric columns (q1, q2, q3, q4) filled in, on a light PipeCode card.

sql pivot without pivot — the universal pivot idiom every data engineer should know

The portable pivot pattern works on every SQL dialect ever shipped, predates the native PIVOT keyword by decades, and is the only option in PostgreSQL, MySQL, SQLite, and most cloud warehouses without a dedicated operator. It is the sql pivot example every interviewer expects you to know.

The anatomy of the portable pivot.

  • GROUP BY <row_key> — defines the output grain; one row per distinct value of the row key.
  • One SUM(CASE WHEN <col_key> = '<value>' THEN <metric> ELSE 0 END) per known column value — each populates one wide-format column.
  • ELSE 0 keeps the aggregate at 0 instead of NULL for missing combinations.
  • COUNT(CASE WHEN cond THEN 1 END) for count-pivots (ELSE can be omitted because COUNT skips NULL).

The template.

SELECT <row_key>,
       SUM(CASE WHEN <col_key> = '<value_1>' THEN <metric> ELSE 0 END) AS <alias_1>,
       SUM(CASE WHEN <col_key> = '<value_2>' THEN <metric> ELSE 0 END) AS <alias_2>,
       
FROM <table>
GROUP BY <row_key>
ORDER BY <row_key>;
Enter fullscreen mode Exit fullscreen mode

Why the portable idiom is preferred even when PIVOT exists.

  • Portability — the same query works on Postgres, MySQL, SQL Server, Snowflake, BigQuery; native PIVOT is a translation step.
  • Multiple aggregates per pivot column — native PIVOT typically allows one aggregate per call; SUM(CASE WHEN …) lets you compute SUM and COUNT in the same query.
  • Custom predicates per cellWHEN amount > 100 AND quarter = 'Q1' is trivial in CASE; awkward in native PIVOT.
  • No IN (…) list constraint — native PIVOT requires the value list inline; CASE WHEN accepts any expression.

ELSE 0 discipline — the Blog72 callback.

  • Without ELSE 0SUM returns NULL for groups where no row matches the WHEN condition; downstream arithmetic propagates the NULL and undercounts.
  • With ELSE 0 — the missing-data case is explicitly zero; pivoted dashboards render 0, not blank.
  • Senior signal — stating "I use ELSE 0 to avoid the silent-NULL undercount" is the pattern from Blog72 applied to pivot cells.

Multiple metrics in one pivot query.

  • Combine SUM and COUNT in the same row by adding more CASE WHEN columns.
  • Example — per region, both total revenue per quarter AND order count per quarter:
SELECT region,
       SUM(CASE WHEN quarter = 'Q1' THEN amount ELSE 0 END) AS q1_rev,
       SUM(CASE WHEN quarter = 'Q2' THEN amount ELSE 0 END) AS q2_rev,
       COUNT(CASE WHEN quarter = 'Q1' THEN 1 END)           AS q1_orders,
       COUNT(CASE WHEN quarter = 'Q2' THEN 1 END)           AS q2_orders
FROM sales
GROUP BY region;
Enter fullscreen mode Exit fullscreen mode

Filtering inside the cell with FILTER (Postgres 9.4+).

  • Postgres-specific cleaner syntaxSUM(amount) FILTER (WHERE quarter = 'Q1') AS q1_rev.
  • Equivalent to SUM(CASE WHEN quarter = 'Q1' THEN amount END).
  • More readable for one-off pivots; not portable to MySQL / SQL Server.

SQL interview question — paid vs refunded revenue per region in a wide layout

Assume orders(region, status, amount) with status ∈ {'paid', 'refund', 'pending'}. Return one row per region with three columns — paid_rev, refund_amount, pending_amount — totalled across all orders.

Solution Using SUM(CASE WHEN …)

Detailed explanation. This is a textbook conditional-aggregation pivot from Blog72 applied to a region × status grain. The row key is region; the column key is status with three known values; the cell metric is SUM(amount). The ELSE 0 keeps regions with no refunds rendering 0, not NULL.

Code (PostgreSQL / works on every dialect).

SELECT region,
       SUM(CASE WHEN status = 'paid'    THEN amount ELSE 0 END) AS paid_rev,
       SUM(CASE WHEN status = 'refund'  THEN amount ELSE 0 END) AS refund_amount,
       SUM(CASE WHEN status = 'pending' THEN amount ELSE 0 END) AS pending_amount
FROM orders
GROUP BY region
ORDER BY paid_rev DESC;
Enter fullscreen mode Exit fullscreen mode

Equivalent using Postgres FILTER clause.

SELECT region,
       SUM(amount) FILTER (WHERE status = 'paid')    AS paid_rev,
       SUM(amount) FILTER (WHERE status = 'refund')  AS refund_amount,
       SUM(amount) FILTER (WHERE status = 'pending') AS pending_amount
FROM orders
GROUP BY region
ORDER BY paid_rev DESC;
Enter fullscreen mode Exit fullscreen mode

Step-by-step trace.

step relation outcome
1 Scan orders every order row read
2 GROUP BY region hash aggregate; one row per region
3 Three SUM(CASE WHEN …) per group three cell totals per row
4 ELSE 0 ensures missing-status cells become 0 no NULL holes
5 ORDER BY paid_rev DESC top-paying regions first

Output:

region paid_rev refund_amount pending_amount
US 11,200 1,300 0
EU 7,900 850 200
APAC 5,100 300 0

Why this works — concept by concept:

  • GROUP BY region defines the grain — one row per region in the output.
  • Three CASE WHEN columns map the three statuses — each populates one wide-format column.
  • ELSE 0 keeps cells zero, not NULL — the Blog72 silent-undercount defence applied to pivot cells.
  • FILTER syntax is the Postgres-clean equivalent — drops the CASE ceremony for the same semantics.
  • One pass through the data — the planner reads orders once and evaluates all three aggregates in the same hash node.
  • CostΘ(n) scan + Θ(n) hash aggregate keyed on region; the three aggregate calls are a constant per-row factor.

SQL
Topic — conditional-aggregation
Pivot via conditional aggregation

Practice →

SQL
Topic — case-when
CASE WHEN SQL library

Practice →


4. Native PIVOT syntax — SQL Server, Oracle, Snowflake

Diagram of native PIVOT syntax across three dialects — SQL Server, Oracle, and Snowflake — shown as three side-by-side code cards highlighting the FOR … IN (…) clause and the differing argument shapes, on a light PipeCode card.

pivot in sql — three dialects with a dedicated keyword

SQL Server, Oracle, and Snowflake ship a native PIVOT operator. It is syntactic sugar over the same SUM(CASE WHEN …) semantics — the planner often produces identical execution plans — but the syntax is concise enough that interviewers like to test whether you can read it. PostgreSQL and MySQL do not have native PIVOT; the portable idiom from §3 is mandatory there.

SQL Server PIVOT — the canonical shape.

SELECT region, [Q1], [Q2], [Q3], [Q4]
FROM (
    SELECT region, quarter, amount FROM sales
) src
PIVOT (
    SUM(amount)
    FOR quarter IN ([Q1], [Q2], [Q3], [Q4])
) AS pvt
ORDER BY region;
Enter fullscreen mode Exit fullscreen mode
  • PIVOT (aggregate FOR column IN (value_list)) — one aggregate, one column to pivot, an explicit list of values.
  • Bracketed identifiers[Q1] is the SQL Server way to quote column names with special characters or reserved words.
  • The derived table src selects only the columns participatingregion (row key), quarter (pivot column), amount (cell value). Any extra columns become extra row-key components.

Oracle PIVOT — same idea, slight syntax tweak.

SELECT * FROM (
    SELECT region, quarter, amount FROM sales
)
PIVOT (
    SUM(amount)
    FOR quarter IN ('Q1' AS q1, 'Q2' AS q2, 'Q3' AS q3, 'Q4' AS q4)
)
ORDER BY region;
Enter fullscreen mode Exit fullscreen mode
  • AS alias clauses produce nicer column names than 'Q1'.
  • String literals for the value list (Oracle uses single quotes).
  • Same FOR … IN (…) shape; same one-aggregate-per-call rule.

Snowflake PIVOT — close to SQL Server.

SELECT * FROM (
    SELECT region, quarter, amount FROM sales
) src
PIVOT (
    SUM(amount)
    FOR quarter IN ('Q1', 'Q2', 'Q3', 'Q4')
) AS pvt
ORDER BY region;
Enter fullscreen mode Exit fullscreen mode
  • String literals for the value list (no brackets).
  • PIVOT (... FOR ... IN (... ANY)) since 2023 — dynamic value list (computed at query time).
  • One aggregate per PIVOT — same constraint as SQL Server / Oracle.

BigQuery PIVOT — closest to the standard.

SELECT * FROM (
    SELECT region, quarter, amount FROM sales
)
PIVOT (
    SUM(amount)
    FOR quarter IN ('Q1', 'Q2', 'Q3', 'Q4')
);
Enter fullscreen mode Exit fullscreen mode
  • No alias required after the PIVOT (...) clause.
  • String literals, same FOR / IN shape.

The constraints native PIVOT shares across dialects.

  • One aggregate per PIVOT call — to compute both SUM and COUNT, run two queries and join them, or use the portable SUM(CASE WHEN …) idiom.
  • Values must be enumerated at write time — dynamic value lists require dynamic SQL or warehouse-specific extensions (Snowflake's ANY).
  • Implicit GROUP BY — every column from the derived table that is not the pivot column or the aggregated metric becomes a row-key column.
  • The implicit-grouping trap — if your derived table accidentally projects an extra column (e.g. order_id), each row of order_id becomes its own row key and the pivot collapses to one row per (region, order_id) instead of per region.

Dialect support matrix.

Dialect Native PIVOT Native UNPIVOT Portable SUM(CASE WHEN …)
PostgreSQL No No Yes — SUM(amount) FILTER (WHERE …) is the clean form
MySQL No No Yes
SQL Server Yes (since 2005) Yes Yes
Oracle Yes (since 11g) Yes Yes
Snowflake Yes Yes Yes
BigQuery Yes (since 2022) Yes Yes
Databricks Yes Yes Yes

SQL
Topic — case-when
PIVOT syntax SQL drills

Practice →

SQL
Topic — aggregation
Aggregation + PIVOT patterns

Practice →


5. UNPIVOT and the inverse reshape

Diagram of UNPIVOT — left card shows a wide table with columns (region, q1, q2, q3, q4) and two rows; a green UNPIVOT arrow expands it into the right card with columns (region, quarter, amount) and eight rows; small annotations highlight the column-name column and the cell-value column produced by UNPIVOT, on a light PipeCode card.

unpivot in sql — wide back to long for warehouse loading

sql unpivot is the inverse of sql pivot: a wide table with N metric columns becomes a long table with N × row_count rows. Every cell becomes a row, with the source column name as a new column and the cell value as another. The shape is exactly what warehouse fact tables and tidy-data dashboards consume — and unpivot in sql shows up in interviews as the "wide-to-long" reshape twin of the pivot prompt.

Native UNPIVOT — SQL Server, Oracle, Snowflake.

SELECT region, quarter, amount
FROM (
    SELECT region, q1, q2, q3, q4 FROM quarterly_sales
) src
UNPIVOT (
    amount FOR quarter IN (q1, q2, q3, q4)
) AS unpvt;
Enter fullscreen mode Exit fullscreen mode
  • UNPIVOT (value_col FOR name_col IN (col_list))value_col holds the cell values; name_col holds the source column names.
  • One row per source cell — four source columns × two regions = eight rows.
  • NULL cells are dropped by default — if q3 IS NULL for some region, that row does not appear. UNPIVOT … INCLUDE NULLS (some dialects) keeps them.

Portable UNPIVOT via UNION ALL — every dialect.

SELECT region, 'Q1' AS quarter, q1 AS amount FROM quarterly_sales
UNION ALL
SELECT region, 'Q2',           q2          FROM quarterly_sales
UNION ALL
SELECT region, 'Q3',           q3          FROM quarterly_sales
UNION ALL
SELECT region, 'Q4',           q4          FROM quarterly_sales;
Enter fullscreen mode Exit fullscreen mode
  • One SELECT per source column — each projects the same shape (region, quarter_literal, value_column).
  • UNION ALL stitches them vertically; never UNION (don't pay the dedup cost).
  • NULL handling is explicit — add WHERE q1 IS NOT NULL per leg if you want to drop NULL cells, or omit to keep them.

PostgreSQL LATERAL + VALUES — the elegant Postgres pattern.

SELECT region, quarter, amount
FROM quarterly_sales,
     LATERAL (VALUES
         ('Q1', q1),
         ('Q2', q2),
         ('Q3', q3),
         ('Q4', q4)
     ) AS t(quarter, amount);
Enter fullscreen mode Exit fullscreen mode
  • LATERAL lets the inline VALUES reference outer columns (q1, q2, etc.).
  • More compact than four UNION ALL legs.
  • PostgreSQL-only — not portable to MySQL or SQL Server, but very clean when available.

jsonb_each + to_jsonb — when columns are dynamic in Postgres.

SELECT region, kv.key AS quarter, (kv.value)::numeric AS amount
FROM quarterly_sales,
     LATERAL jsonb_each(to_jsonb(quarterly_sales) - 'region') kv;
Enter fullscreen mode Exit fullscreen mode
  • to_jsonb(row) - 'region' turns the row into a JSON object and drops the row key.
  • jsonb_each expands the object into key-value rows.
  • Use case — unpivoting tables with unknown or wildly varying wide columns; trades type safety for flexibility.

Common sql unpivot use cases — when unpivot in sql is the right reshape.

  • Loading wide CSVs into a long fact table — finance dumps a sheet with one column per month; the warehouse ETL unpivots it before insertion.
  • Reshaping for Looker / Tableau / Metabase — these tools strongly prefer tidy long data.
  • Computing aggregates across metricsMAX(amount) across all quarters per region is trivial once the data is long.

SQL interview question — unpivot wide quarterly_sales into long

Assume quarterly_sales(region, q1, q2, q3, q4). Return a long-format result with columns (region, quarter, amount) — one row per (region, quarter) cell.

Solution Using UNION ALL (portable)

Detailed explanation. The portable unpivot pattern produces one SELECT per source column, then stacks them with UNION ALL. Use UNION ALL (not UNION) because every row is unique by construction — no dedup pass needed.

Code (PostgreSQL / works on every dialect).

SELECT region, 'Q1' AS quarter, q1 AS amount FROM quarterly_sales
UNION ALL
SELECT region, 'Q2',           q2          FROM quarterly_sales
UNION ALL
SELECT region, 'Q3',           q3          FROM quarterly_sales
UNION ALL
SELECT region, 'Q4',           q4          FROM quarterly_sales
ORDER BY region, quarter;
Enter fullscreen mode Exit fullscreen mode

Equivalent in SQL Server (native UNPIVOT).

SELECT region, quarter, amount
FROM (
    SELECT region, q1, q2, q3, q4 FROM quarterly_sales
) src
UNPIVOT (
    amount FOR quarter IN (q1, q2, q3, q4)
) AS unpvt
ORDER BY region, quarter;
Enter fullscreen mode Exit fullscreen mode

Equivalent in Postgres using LATERAL VALUES.

SELECT region, quarter, amount
FROM quarterly_sales,
     LATERAL (VALUES
         ('Q1', q1), ('Q2', q2), ('Q3', q3), ('Q4', q4)
     ) AS t(quarter, amount)
ORDER BY region, quarter;
Enter fullscreen mode Exit fullscreen mode

Step-by-step trace.

step relation outcome
1 Four SELECTs each project (region, 'Qx', qx) four streams of equal shape
2 UNION ALL stacks them vertically one combined stream of N × 4 rows
3 Each row carries the source column name and the cell value tidy long shape
4 ORDER BY region, quarter sorted final output

Output:

region quarter amount
EU Q1 800
EU Q2 950
EU Q3 1100
EU Q4 1300
US Q1 1200
US Q2 1500
US Q3 1700
US Q4 2000

Why this works — concept by concept:

  • UNION ALL stacks the four projections — every row is unique by construction; no dedup needed.
  • String literal in each SELECT'Q1' AS quarter injects the source column name as data; this is the unpivot trick.
  • The cell value column comes from the corresponding wide columnq1, q2, q3, q4 each populate the same output column.
  • Native UNPIVOT is syntactic sugar — produces the same plan on every engine that supports it.
  • LATERAL VALUES is the Postgres-clean variant — one row per source row times the number of VALUES tuples.
  • Cost — four scans of quarterly_sales (the portable form) or one scan (native UNPIVOT / LATERAL); the native forms are cheaper at scale.

SQL
Topic — unpivoting
UNPIVOT SQL drills

Practice →

SQL
Topic — set-operations
UNION ALL + UNPIVOT patterns

Practice →


6. Dynamic pivots — when the column list is not known at write time

The dynamic-pivot problem — and why it's a senior interview signal

The portable and native pivot patterns both require the column list to be enumerated at write time: Q1 / Q2 / Q3 / Q4 is fine, but product_a / product_b / … / product_2000_new_skus is not. When the pivot values are unknown at write time, you need either dynamic SQL, a warehouse extension (Snowflake's PIVOT (… IN ANY)), or a downstream reshape in pandas / Spark.

The fundamental constraint.

  • Relational SQL is schema-boundSELECT projects a fixed number of named columns; you cannot project N columns where N depends on data values.
  • Native PIVOT (… IN (?, ?, …)) still needs the value list at parse time.
  • SUM(CASE WHEN col = 'X' THEN val ELSE 0 END) still needs 'X' as a literal at write time.

Three approaches when the value list is dynamic.

Approach 1 — dynamic SQL (the classic workaround).

  • Step 1SELECT DISTINCT pivot_col FROM source to discover the value list.
  • Step 2 — concatenate the value list into a SQL string with one SUM(CASE WHEN …) per value.
  • Step 3EXECUTE (SQL Server's EXEC sp_executesql, Postgres EXECUTE, Oracle EXECUTE IMMEDIATE).
  • Cost — two database round-trips; SQL string concatenation is hand-rolled; injection risk if user-supplied data is mixed in unsafely.
-- SQL Server example
DECLARE @cols NVARCHAR(MAX), @sql NVARCHAR(MAX);
SELECT @cols = STRING_AGG(QUOTENAME(quarter), ',')
FROM (SELECT DISTINCT quarter FROM sales) q;

SET @sql = N'
SELECT region, ' + @cols + N'
FROM (SELECT region, quarter, amount FROM sales) src
PIVOT (SUM(amount) FOR quarter IN (' + @cols + N')) AS pvt;';

EXEC sp_executesql @sql;
Enter fullscreen mode Exit fullscreen mode

Approach 2 — warehouse-native dynamic pivot (Snowflake's ANY).

  • Snowflake since 2023 — PIVOT (SUM(amount) FOR quarter IN (ANY)) discovers the value list automatically.
  • BigQuery — limited dynamic-pivot support via EXECUTE IMMEDIATE for the same DDL-string approach.
  • Cleanest option when available; no string concatenation.

Approach 3 — reshape downstream in pandas / Spark / dbt.

  • Pull long-format data from SQL.
  • Reshape in the application layer — pandas df.pivot_table(index='region', columns='quarter', values='amount', aggfunc='sum').
  • Often the right call for production pipelines — keeps SQL simple and pushes dynamic-schema work to a tool designed for it.

When dynamic pivot is the wrong question.

  • The value list is "ever-growing" — pivoting on product SKUs, user IDs, etc. is almost always wrong; you'd produce a table with thousands of columns.
  • Re-evaluate — is the question really "give me a wide layout" or is it "give me a long table I can render as a pivot in BI"? — usually the latter.
  • Senior signal — push back on the requirement when the long shape is what the consumer actually needs.

The interview answer.

  • State the constraint clearly — "SQL needs to know the column list at write time."
  • Offer three paths — dynamic SQL, warehouse-native dynamic pivot, or downstream reshape.
  • Pick the right one for the context — reach for downstream reshape unless the schema is tightly bounded.
  • Avoid the trap — never agree to a "thousand-column pivot"; flag the schema-design problem.

SQL
Topic — conditional-aggregation
Dynamic pivot SQL drills

Practice →

SQL
Topic — data-aggregation
Data aggregation library

Practice →


7. PIVOT gotchas — NULL cells, non-unique keys, dynamic SQL hazards

The seven bugs interviewers test most often on PIVOT and UNPIVOT

Reshape operators have a small surface area but a long tail of edge cases that fail candidates. These are the seven gotchas reviewers test most often when pivot table in sql comes up in an interview.

Gotcha 1 — SUM returns NULL for missing cells without ELSE 0.

  • The bugSUM(CASE WHEN quarter = 'Q3' THEN amount END) (no ELSE) returns NULL when the region has no Q3 orders.
  • Symptom — dashboards render blank / instead of 0; downstream arithmetic propagates NULL.
  • Fix — add ELSE 0 in every cell aggregate, or wrap with COALESCE(…, 0).

Gotcha 2 — Implicit GROUP BY in native PIVOT includes every projected column.

  • The bug — accidentally projecting order_id into the derived table makes every row its own group; the pivot collapses to one row per (region, order_id) instead of per region.
  • Symptom — wide table with millions of rows of one-row-per-order, each with mostly empty cells.
  • Fix — project only the row-key columns, the pivot column, and the metric in the derived table.

Gotcha 3 — Pivot value list must match exactly (case + whitespace).

  • The bugFOR quarter IN ('q1', 'q2', 'q3', 'q4') when the data contains 'Q1', 'Q2', 'Q3', 'Q4'.
  • Symptom — every cell is NULL because no row matches.
  • Fix — normalise upstream (LOWER(quarter) or UPPER(quarter)) or list both casings.

Gotcha 4 — UNPIVOT drops NULL cells silently.

  • The behaviour — native UNPIVOT excludes rows where the cell is NULL; this is often desired but sometimes surprising.
  • SymptomUNPIVOT output has fewer rows than expected when the wide table has missing values.
  • FixUNPIVOT … INCLUDE NULLS if your dialect supports it, or use the portable UNION ALL form with explicit WHERE control.

Gotcha 5 — Non-unique row keys produce unexpected aggregates.

  • The bug — pivoting on a row key that is not unique in the source data (e.g. region but the same region appears multiple times per quarter) silently aggregates.
  • Symptom — cell values are larger than expected because multiple source rows feed each cell.
  • Fix — confirm the source grain before pivoting; pre-aggregate to the row-key × pivot-column grain inside a CTE.

Gotcha 6 — Native PIVOT allows only one aggregate per call.

  • The constraintPIVOT (SUM(amount) FOR …) and PIVOT (COUNT(*) FOR …) cannot coexist in a single PIVOT clause.
  • Workaround — run two PIVOTs and join on the row key, OR use the portable SUM(CASE WHEN …) + COUNT(CASE WHEN …) form in one query.
  • Performance — the portable form is often faster because it scans the source table once.

Gotcha 7 — Dynamic SQL pivots introduce injection risk.

  • The bug — concatenating user-supplied data into the value list string opens a SQL injection vector.
  • Symptom — security audit failure; potential data exfiltration.
  • Fix — use the dialect's safe quoting function (QUOTENAME in SQL Server, quote_literal in Postgres) to escape values; or restrict the discovery query to a known-safe source.

SQL
Topic — case-when
PIVOT gotcha drills

Practice →

SQL
Topic — unpivoting
UNPIVOT gotcha drills

Practice →


Choosing the right reshape (cheat sheet)

A one-screen cheat sheet for using SQL PIVOT and UNPIVOT — pick the direction, then the dialect-appropriate idiom.

You want to … Direction Portable idiom Native (when available)
Long → Wide with one aggregate PIVOT SUM(CASE WHEN col = 'X' THEN v ELSE 0 END) PIVOT (SUM(v) FOR col IN ('X', …))
Long → Wide with multiple aggregates PIVOT One SUM(CASE WHEN …) per metric per value Two native PIVOTs joined on the row key
Long → Wide with Postgres clean syntax PIVOT SUM(v) FILTER (WHERE col = 'X') n/a (Postgres has no native PIVOT)
Wide → Long UNPIVOT SELECT … UNION ALL SELECT … per source column UNPIVOT (val FOR name IN (col1, col2, …))
Wide → Long in Postgres (clean) UNPIVOT LATERAL (VALUES ('X', col1), ('Y', col2)) AS t(name, val) n/a
Dynamic column list PIVOT Dynamic SQL + string-concatenated CASE WHENs PIVOT (… IN ANY) (Snowflake 2023+)
Excel-style pivot table PIVOT SUM(CASE WHEN …) with a row key and column key dialect-native PIVOT
Reshape for warehouse load UNPIVOT UNION ALL of per-column projections dialect-native UNPIVOT
Reshape for Looker / Tableau UNPIVOT UNION ALL to long format tidy data favoured by BI tools
Count of rows per cell PIVOT COUNT(CASE WHEN col = 'X' THEN 1 END) PIVOT (COUNT(*) FOR col IN (…))
Both SUM and COUNT per cell PIVOT One query with two CASE-aggregate columns per value Two PIVOTs joined

Frequently asked questions

What is PIVOT in SQL?

PIVOT is a reshape operator that turns long-format rows into wide-format columns. The input is typically a table with three logical columns — a row key (which becomes the rows in the output), a column key (whose distinct values become the new columns), and a metric (whose aggregate populates each cell). For example, sales(region, quarter, amount) can be pivoted on quarter to produce one row per region with four columns (Q1, Q2, Q3, Q4) holding SUM(amount). SQL Server, Oracle, Snowflake, BigQuery, and Databricks have native PIVOT operators with a FOR … IN (…) syntax; PostgreSQL and MySQL have no native PIVOT and rely on the portable SUM(CASE WHEN col = 'X' THEN val ELSE 0 END) idiom inside a GROUP BY query.

How do I pivot rows to columns in PostgreSQL or MySQL?

PostgreSQL and MySQL have no native PIVOT keyword — use the portable SUM(CASE WHEN …) idiom. The template is SELECT row_key, SUM(CASE WHEN col_key = 'value_1' THEN metric ELSE 0 END) AS alias_1, … FROM source GROUP BY row_key. Add one CASE WHEN column per distinct pivot value, with ELSE 0 to keep missing cells as 0 instead of NULL (Blog72 silent-undercount defence). PostgreSQL 9.4+ also supports the cleaner SUM(amount) FILTER (WHERE quarter = 'Q1') syntax, which produces the same result without the CASE ceremony. The idiom is the only portable shape, predates the PIVOT keyword by decades, and is the right answer for any cross-engine codebase.

What's the difference between PIVOT and UNPIVOT?

PIVOT goes long → wide: one row per (entity, metric, value) triple becomes one row per entity with one column per distinct metric. UNPIVOT goes wide → long: one row per entity with N metric columns becomes N × row_count long rows, each carrying a metric name and its value. The two are inverses: PIVOT is what BI tools and stakeholders want for visual reports; UNPIVOT is what warehouse ETL pipelines want for tidy storage. Native UNPIVOT exists on SQL Server, Oracle, Snowflake, BigQuery, and Databricks; the portable form is a UNION ALL of one SELECT entity, 'metric_name' AS m, metric_value FROM source per source column. PostgreSQL has a particularly clean LATERAL (VALUES …) shape that beats both the native and portable forms for readability.

Can I pivot on a column with unknown values at write time?

Not directly with relational SQL — the PIVOT operator and the SUM(CASE WHEN …) idiom both require the value list to be enumerated at parse time. Three workarounds: (1) dynamic SQL — discover the distinct values, concatenate them into a SQL string, and EXECUTE the resulting query; verbose and injection-risky; (2) warehouse-native dynamic pivot — Snowflake's PIVOT (… IN ANY) discovers the value list at query time; the cleanest option when available; (3) reshape downstream — keep SQL queries simple and long-format, then pivot in pandas / Spark / dbt; usually the right production answer. Be wary of "ever-growing" pivot column lists (product SKUs, user IDs) — they signal a schema-design problem; the long shape is almost always what the consumer actually needs, even if they asked for wide.

Why use SUM(CASE WHEN …) instead of native PIVOT when both are available?

Three reasons. Portability — the same query works on Postgres, MySQL, SQL Server, Snowflake, BigQuery; native PIVOT is a per-dialect translation step. Multiple aggregates per pivot column — native PIVOT typically allows one aggregate per call, so computing both SUM and COUNT per cell requires two PIVOTs and a join; SUM(CASE WHEN …) + COUNT(CASE WHEN …) does it in one query with one scan. Custom predicates per cellWHEN amount > 100 AND quarter = 'Q1' is trivial in CASE; awkward in native PIVOT's IN (…) list. The portable idiom is preferred even on dialects with native PIVOT for any non-trivial pivot; native PIVOT is best reserved for read-only ad-hoc reports where the syntax brevity matters.


Practice on PipeCode

PipeCode ships 450+ data-engineering interview problems — including PostgreSQL-first SQL practice keyed to PIVOT / UNPIVOT, conditional aggregation with SUM(CASE WHEN …), the FILTER clause, set-operations for UNPIVOT via UNION ALL, and the dialect quirks that fail candidates who memorise only one engine. Whether you're drilling sql interview questions with answers for sql for data engineers loops or grinding through data engineering interview questions end-to-end, the practice library mirrors the same long-vs-wide mental model this guide teaches.

Kick off via Explore practice →; drill the CASE WHEN SQL lane →; fan out into the unpivoting lane →; rehearse conditional aggregation problems →; reinforce aggregation SQL drills →; widen coverage on the full SQL practice library →.

Top comments (0)