Gowtham Potureddi

Posted on May 3

COALESCE in SQL — First Non-NULL, LEFT JOIN Defaults, and Interview Patterns

#python #sql #interview #dataengineering

COALESCE in SQL is the single most-asked NULL-handling primitive at every data-engineering interview that touches production analytics. The mental model: COALESCE(expr1, expr2, ..., exprN) returns the first argument that is not NULL, evaluated left-to-right; if every argument is NULL, the result is NULL. That one rule covers a vast surface — fallback chains for "best available column," default values to neutralize LEFT JOIN misses, sentinel labels like 'NONE' for missing dimensions, and zero-substitution before downstream math. Four sub-primitives carry the loop: left-to-right evaluation with short-circuit semantics, the COALESCE(left_join_col, default) pattern that turns outer-join NULLs into reportable defaults, the dialect-portability matrix versus CASE / ISNULL / NVL / IFNULL, and the pitfall set around NULL ≠ 0 semantics, type coercion, empty strings, and the COALESCE(NULLIF(col, ''), 'default') composition.

This guide walks four concept clusters end-to-end, each with a detailed topic explanation, per-sub-topic explanation with a worked example and its solution, and an interview-style worked problem with a full solution that explains why it works. The mix matches the actual surface area COALESCE covers in production SQL — analytics dashboards, ETL transforms, BI reports, and whiteboard interview questions. Examples use PostgreSQL-friendly syntax; engine differences are called out where they bite (ISNULL is SQL Server, NVL is Oracle, IFNULL is MySQL — all of them are two-argument cousins of the standard-SQL COALESCE).

Top COALESCE concepts and SQL patterns

The four numbered sections below follow this concept map (one row per H2):

#	Topic (sections 1–4)	Why it matters in SQL data engineering
1	`COALESCE` evaluation order and basic fallback patterns	`COALESCE(a, b, c)` is left-to-right, short-circuits on first non-`NULL`, and follows engine-specific type-precedence rules; the bedrock primitive every other pattern composes on.
2	`COALESCE` with `LEFT JOIN` for default values in analytics and BI	Outer joins to lookup tables (FX rates, dimensions, slowly-changing-dim surrogates) produce `NULL` on misses; `COALESCE(fx.rate, 1)` and `COALESCE(p.promo_code, 'NONE')` turn that `NULL` into a default that downstream metrics can consume.
3	`COALESCE` vs `CASE`, `ISNULL`, and `NVL` — portability and when to pick which	`COALESCE` is standard SQL and portable across PostgreSQL / MySQL / SQL Server / Oracle / Snowflake / BigQuery; `ISNULL` (SQL Server, 2-arg), `NVL` (Oracle, 2-arg), `IFNULL` (MySQL, 2-arg) are dialect cousins; `CASE` is the right tool when the logic is not "first non-`NULL`."
4	`COALESCE` pitfalls — NULL semantics, type coercion, empty strings, and `NULLIF`	`NULL` ≠ `0`; replacing unknown with zero changes the metric meaning. Empty string `''` is not `NULL` in PostgreSQL; combine with `NULLIF` for "empty-or-null." Mixing types without `CAST` raises errors or silently coerces.

Concept-based framing rule: COALESCE is the primitive; the four sections are the operating contexts where it shows up. Master the evaluation rule first (§1), then the LEFT JOIN default pattern (§2), then the dialect comparison (§3), then the pitfalls (§4). State the left-to-right-first-non-NULL rule out loud in any interview answer that touches null handling — interviewers grade the explicit verbalization, not just the correct query.

1. `COALESCE` Evaluation Order and Basic Fallback Patterns

Left-to-right first-non-NULL evaluation in SQL for data engineering

"Return the first non-NULL value from a list of expressions" is the canonical COALESCE semantics. The mental model: the engine evaluates arguments left-to-right; the first one whose value is not NULL becomes the result; remaining arguments may not be evaluated at all (most engines short-circuit); if every argument is NULL, the result itself is NULL. Same primitive powers any "priority-ordered fallback" pattern — pick the user's work_email, else personal_email, else a literal 'no-email@example.com'; pick the trade's settled_price, else mark_price, else last_close; pick the column you trust, else the column you mostly trust, else a sentinel.

Pro tip: Even when the engine documents short-circuit evaluation, never put side-effecting expressions in later COALESCE arguments — volatile functions, RAISE statements, or sequence calls. State this explicitly in interviews; it signals production fluency. The reader-friendly contract is "every argument is a pure expression; the engine picks the first non-NULL."

Syntax: `COALESCE(expr1, expr2, ..., exprN)`

The syntax invariant: COALESCE takes 1 to N arguments (most engines require ≥ 2) and returns a single value of the unified result type. The unified type is determined by precedence rules across all argument types — mixing INT and BIGINT is fine and returns BIGINT; mixing INT and TEXT without an explicit CAST raises a type error in PostgreSQL.

COALESCE(a, b) — minimum form; first argument a if non-NULL, else b.
COALESCE(a, b, c, d, e) — N-ary form; ordered fallback chain.
COALESCE(NULL) — single-argument form; rejected by PostgreSQL parser.
Type unification — all arguments must coerce to a common type; explicit CAST(... AS ...) keeps intent obvious.

Worked example. Mixed literal arguments; the engine returns 'third' because the first two are NULL.

step	value
arg 1	`NULL`
arg 2	`NULL`
arg 3	`'third'` (first non-`NULL`)
result	`'third'`

Worked-example solution.

SELECT COALESCE(NULL, NULL, 'third', 'fourth') AS first_non_null;
-- first_non_null = 'third'

Rule of thumb: COALESCE reads as "try this, else try this, else try this" — the order of arguments encodes business priority directly in SQL.

Short-circuit evaluation and side-effect safety

The short-circuit invariant: most engines stop evaluating arguments once they find a non-NULL; PostgreSQL, SQL Server, Oracle, and MySQL all document this behavior, but the SQL standard does not require it. Treat short-circuit as a performance optimization, not a contract — never rely on it for correctness when later arguments have side effects.

COALESCE(safe_col, expensive_lookup()) — expensive_lookup() only runs when safe_col is NULL.
COALESCE(col, RAISE_NOTICE('...')) — never rely on the side-effect ordering; the optimizer may rewrite.
Volatile functions — random(), now(), sequence calls can be evaluated even when later in the list.
Pure functions only — write the contract as "every argument is a pure expression."

Worked example. Cheap column first, expensive subquery second.

input	expensive_lookup called?
`cached_value = 100`	no
`cached_value = NULL`	yes

Worked-example solution.

SELECT user_id,
       COALESCE(
         cached_email,                           -- cheap, indexed
         (SELECT email FROM email_lookup
          WHERE user_id = u.user_id LIMIT 1)    -- expensive, only runs on NULL cached
       ) AS effective_email
FROM users u;

Rule of thumb: put the cheapest reliably-non-NULL column first; let short-circuit save the expensive lookups for rows that need them.

Type precedence and `CAST` for unambiguous types

The type-precedence invariant: COALESCE returns a single unified type computed from the precedence rules of all argument types; mixing incompatible types (e.g. INTEGER with TEXT) is a parse error in PostgreSQL and a silent coercion in MySQL. Use CAST(... AS ...) to make intent explicit.

COALESCE(int_col, 0) — both INTEGER; result INTEGER. ✓
COALESCE(numeric_col, 0) — NUMERIC and INTEGER unify to NUMERIC. ✓
COALESCE(int_col, 'unknown') — type error in PostgreSQL.
COALESCE(CAST(int_col AS TEXT), 'unknown') — explicit cast resolves the conflict.

Worked example. Avoid the type error by casting INTEGER to TEXT before mixing with a string default.

expression	result type	safe?
`COALESCE(qty, 'N/A')`	error	✗
`COALESCE(CAST(qty AS TEXT), 'N/A')`	TEXT	✓

Worked-example solution.

SELECT order_id,
       COALESCE(CAST(qty AS TEXT), 'N/A') AS qty_label
FROM orders;

Rule of thumb: when you need a string sentinel for a numeric column, cast the numeric first; never trust implicit coercion across families.

Common beginner mistakes

Calling COALESCE with one argument — most engines reject; minimum is 2.
Putting volatile / side-effecting expressions in later positions — non-deterministic results.
Mixing types without CAST — silent coercion in MySQL, parse error in PostgreSQL.
Assuming all engines short-circuit — relying on it is a portability bug.
Returning NULL and surprising downstream — if every argument is NULL, so is the result; add a guaranteed-non-NULL literal at the end if the metric needs a default.

Worked Problem on COALESCE Evaluation Order

Given a contacts table with three nullable email columns — work_email, personal_email, legacy_email — return one column effective_email per row that is the first non-NULL of the three, falling back to the literal 'no-email@example.com' when every column is NULL.

Solution Using `COALESCE` with a literal final fallback

SELECT
    contact_id,
    COALESCE(
        work_email,
        personal_email,
        legacy_email,
        'no-email@example.com'
    ) AS effective_email
FROM contacts;

Why this works: the four-argument COALESCE evaluates each column left-to-right; on the first non-NULL it short-circuits and returns that value; the trailing literal 'no-email@example.com' guarantees a non-NULL result for rows where all three columns are NULL. One pass, no CASE, no joins, fully PostgreSQL-portable.

Step-by-step trace for four sample rows:

contact_id	work_email	personal_email	legacy_email	effective_email
1	`alice@work.com`	`alice@home.com`	`NULL`	`alice@work.com`
2	`NULL`	`bob@home.com`	`bob@old.net`	`bob@home.com`
3	`NULL`	`NULL`	`carol@old.net`	`carol@old.net`
4	`NULL`	`NULL`	`NULL`	`no-email@example.com`

Row 1 — work_email is non-NULL → return immediately.
Row 2 — work_email is NULL; personal_email is non-NULL → return.
Row 3 — first two are NULL; legacy_email is non-NULL → return.
Row 4 — all three are NULL; literal 'no-email@example.com' returned.

Output:

contact_id	effective_email
1	alice@work.com
2	bob@home.com
3	carol@old.net
4	no-email@example.com

Why this works — concept by concept:

Left-to-right evaluation — COALESCE walks arguments in source order; the priority encoded in the SQL exactly matches the business rule "prefer work, then home, then legacy."
Short-circuit on first non-NULL — the engine stops as soon as it has a value; rows with a populated work_email never touch personal_email or legacy_email.
Literal final fallback — the trailing string literal guarantees effective_email is never NULL; downstream consumers (reports, joins, filters) can treat the column as NOT NULL.
Single-pass aggregation — no CASE, no JOIN, no subquery; the SELECT scans contacts once and emits one output row per input row.
O(N) time / O(1) space — N rows scanned, no auxiliary structures; the per-row work is constant 4-argument evaluation.

Inline CTA: Drill the SQL null-handling practice page for the full curated set of COALESCE-style problems.

SQL
Topic — null handling
SQL null-handling problems

Practice →

SQL
Topic — conditional logic
SQL conditional-logic problems

Practice →

SQL
Topic — filtering
SQL filtering problems

Practice →

2. `COALESCE` with `LEFT JOIN` for Default Values in Analytics and BI

COALESCE for outer-join NULL handling in SQL for data engineering

"Replace NULL from a LEFT JOIN miss with a sensible default" is the most common production use of COALESCE. The mental model: LEFT JOIN keeps every row from the left table; rows that have no match in the right table get NULL for every right-table column; COALESCE(right_col, default) neutralizes those NULLs into business-meaningful defaults — 1 for missing FX rates (treat as USD), 'NONE' for missing promo codes, 0 for missing aggregate counts. Same primitive powers any "fact + lookup" pipeline — sales fact joined to currency dim, click event joined to user dim, transaction joined to merchant category.

Pro tip: Choose the default carefully. COALESCE(fx.rate, 1) is correct only when the source amount is already in USD; if amounts are in unknown local currencies, defaulting to 1 silently fabricates dollars. State the assumption in the SQL comment or the surrounding documentation.

`LEFT JOIN` + `COALESCE(right_col, default)` for fact-dim joins

The fact-dim invariant: a fact table (transactions, events, orders) is LEFT JOIN-ed to a dim table (currency rates, customer reference, product catalog); rows with no dim match return NULL for every dim column; COALESCE substitutes a default that downstream math can consume.

COALESCE(dim_col, default) — the canonical wrapping pattern.
LEFT JOIN — keep all fact rows; never drop unmatched.
Right-side filtering — putting WHERE dim.col = X after a LEFT JOIN silently turns it into INNER JOIN (the WHERE filters out the NULL rows); use AND dim.col = X inside the ON clause if you need to keep all fact rows.
Default semantics — pick a value that means "no match" without poisoning the metric.

Worked example. Three transactions in two currencies; FX table has rates only for EUR.

transaction	currency	fx.rate (after LEFT JOIN)	coalesced rate
t1	USD	`NULL`	1
t2	EUR	`1.08`	1.08
t3	USD	`NULL`	1

Worked-example solution.

SELECT t.transaction_id,
       t.amount,
       t.currency,
       COALESCE(fx.rate, 1) * t.amount AS amount_usd
FROM transactions t
LEFT JOIN fx_rates fx
    ON fx.currency = t.currency;

Rule of thumb: LEFT JOIN ... COALESCE(right_col, default) is the canonical "fact + dim with safe defaults" pattern; it appears in nearly every BI dashboard SQL.

"First non-blank across sources" via chained `COALESCE`

The chained-coalesce invariant: multiple LEFT JOINs to supplemental tables (primary, secondary, fallback) return per-table columns that may all be NULL; COALESCE(a.col, b.col, c.col, default) picks the first source that actually has data.

Primary source — most reliable column (e.g., production CRM).
Secondary source — supplemental (e.g., marketing CRM).
Tertiary source — historical (e.g., legacy import).
Default literal — guaranteed non-NULL final fallback.

Worked example. User name from CRM, fall back to billing system, fall back to login email.

crm.name	billing.name	login.email	display_name
`Alice Lee`	`Alice L.`	`alice@x.com`	`Alice Lee`
`NULL`	`Bob B.`	`bob@x.com`	`Bob B.`
`NULL`	`NULL`	`carol@x.com`	`carol@x.com`
`NULL`	`NULL`	`NULL`	`Unknown`

Worked-example solution.

SELECT u.user_id,
       COALESCE(crm.full_name, billing.display_name, login.email, 'Unknown') AS display_name
FROM users u
LEFT JOIN crm     ON crm.user_id     = u.user_id
LEFT JOIN billing ON billing.user_id = u.user_id
LEFT JOIN login   ON login.user_id   = u.user_id;

Rule of thumb: chain LEFT JOINs for multiple sources, then COALESCE across them in priority order; the SQL reads top-to-bottom exactly like the business rule.

`COALESCE(SUM(x), 0)` for grouped aggregates with empty groups

The aggregate-coalesce invariant: SUM, COUNT, AVG over an empty group returns NULL; wrapping the aggregate in COALESCE(SUM(x), 0) returns a numeric default so the report row still appears with a zero metric instead of disappearing.

SUM(x) over empty group — returns NULL.
COUNT(*) — returns 0, not NULL (the exception).
COALESCE(SUM(x), 0) — 0 for empty groups; report row remains.
COALESCE(AVG(x), 0) — 0 for empty groups; semantic consideration — may mislead.

Worked example. Three regions; only two have orders; the third should show 0 not be omitted.

region	SUM(amount)	COALESCE
North	1500	1500
South	800	800
West	NULL	0

Worked-example solution.

SELECT r.region,
       COALESCE(SUM(o.amount), 0) AS total_amount
FROM regions r
LEFT JOIN orders o ON o.region = r.region
GROUP BY r.region;

Rule of thumb: always wrap SUM / AVG in COALESCE(..., 0) when joining a dimension that should produce a row even with zero matching facts; without it, empty groups silently disappear.

Common beginner mistakes

Filtering WHERE dim.col = X after a LEFT JOIN — silently downgrades to INNER JOIN; move to AND dim.col = X in the ON clause.
Defaulting FX rate to 1 when the source amount is in unknown currency — fabricates dollars.
Forgetting COALESCE(SUM(x), 0) for aggregate metrics — empty groups disappear from reports.
Using COALESCE to skip rows — that's WHERE's job, not COALESCE's.
Mismatched join keys — COALESCE cannot fix a missing or wrong ON predicate.

Worked Problem on LEFT JOIN with COALESCE Default

Given an orders table and an optional promos table, return a single column promo_label per order that shows the promo code when matched and the literal 'NONE' when there is no matching promo.

Solution Using `LEFT JOIN` + `COALESCE(p.promo_code, 'NONE')`

SELECT
    o.order_id,
    COALESCE(p.promo_code, 'NONE') AS promo_label
FROM orders o
LEFT JOIN promos p ON p.promo_id = o.promo_id;

Why this works: LEFT JOIN keeps every order, even those with no promo_id or whose promo_id doesn't match any row in promos; the unmatched rows get NULL for p.promo_code; COALESCE substitutes the literal 'NONE' so the output column is non-NULL and dashboard-safe.

Step-by-step trace for sample data:

order_id	promo_id	matched promo_code	COALESCE result
1001	P1	`SUMMER20`	`SUMMER20`
1002	NULL	(no row)	`NONE`
1003	P9	(no row)	`NONE`
1004	P2	`WELCOME10`	`WELCOME10`

LEFT JOIN — every order row survives.
Match — order 1001 → P1 → SUMMER20; order 1004 → P2 → WELCOME10.
No-match — orders 1002 (NULL key) and 1003 (key not in promos) get NULL for p.promo_code.
COALESCE — NULL rows become 'NONE'; matched rows pass through.

Output:

order_id	promo_label
1001	SUMMER20
1002	NONE
1003	NONE
1004	WELCOME10

Why this works — concept by concept:

LEFT JOIN preserves left-side rows — every order stays in the result; no order silently disappears because of a missing or invalid promo_id.
Right-side NULL on miss — orders with no matching promo see p.promo_code = NULL; this is the LEFT JOIN contract, not a bug.
COALESCE(p.promo_code, 'NONE') — sentinel substitution; the output column becomes non-NULL and downstream code can treat it as a finite enum (SUMMER20, WELCOME10, ..., NONE).
Sentinel, not silent zero — 'NONE' is a string sentinel that flags "no promo applied"; it is never confused with a real code.
O(|orders| + |promos|) time — hash-join cost; O(|orders|) output rows.

Inline CTA: Practice SQL join problems to drill the LEFT JOIN + COALESCE muscle.

SQL
Topic — joins
SQL join problems

Practice →

SQL
Topic — null handling
SQL null-handling problems

Practice →

SQL
Topic — aggregation
SQL aggregation problems

Practice →

3. `COALESCE` vs `CASE`, `ISNULL`, and `NVL` — Portability and When to Pick Which

Standard SQL portability versus dialect-specific NULL helpers in SQL for data engineering

"Which NULL-handling construct should I use in this engine?" is a question every SQL data engineer answers daily. The mental model: COALESCE is the standard-SQL N-argument primitive supported on every major engine; ISNULL, NVL, and IFNULL are dialect-specific 2-argument cousins that exist for historical reasons; CASE WHEN ... IS NOT NULL THEN ... ELSE ... END is the general-purpose conditional that handles any logic, not just NULL fallback. Reach for COALESCE by default; reach for CASE when the branches are not "first non-NULL."

Pro tip: In multi-engine analytics codebases (e.g., dbt models targeting both Snowflake and BigQuery, or migrations from legacy SQL Server to PostgreSQL), COALESCE is the only portable choice. Hard-coding ISNULL or NVL in shared code is a portability bug waiting for the migration ticket.

`COALESCE` (standard SQL) — the portable default

The portability invariant: COALESCE is part of the SQL:1992 standard and is supported by PostgreSQL, MySQL, SQL Server, Oracle, Snowflake, BigQuery, Redshift, Databricks SQL, DuckDB, SQLite, and every other major engine you'll encounter. It accepts 2 to N arguments.

N-argument support — COALESCE(a, b, c, d, ...) works everywhere.
Standard-SQL identifier — no version-specific syntax.
Recommended default — choose COALESCE unless you have a dialect-specific reason.
Most engines short-circuit — performance-equivalent to dialect cousins.

Worked example. Same query runs unmodified on PostgreSQL, MySQL, SQL Server, Oracle, BigQuery.

dialect	query	works?
PostgreSQL	`SELECT COALESCE(a, b, c) FROM t;`	✓
MySQL	same	✓
SQL Server	same	✓
Oracle	same	✓
BigQuery	same	✓

Worked-example solution.

SELECT user_id,
       COALESCE(work_email, personal_email, 'no-email@example.com') AS contact
FROM users;
-- runs identically across PostgreSQL / MySQL / SQL Server / Oracle / BigQuery / Snowflake

Rule of thumb: if portability matters at all, COALESCE is the only correct choice; it never gets you locked into a single dialect.

`ISNULL` (SQL Server), `NVL` (Oracle), `IFNULL` (MySQL) — dialect 2-arg cousins

The dialect-cousin invariant: all three are 2-argument NULL-replacement functions; they exist for historical reasons predating standard COALESCE adoption; they have minor type-coercion quirks vs COALESCE that occasionally bite (e.g., SQL Server ISNULL truncates to the type of the first argument, while COALESCE uses precedence).

function	engine	arity	type rule
`ISNULL(a, b)`	SQL Server	2	result type = type of `a` (truncation risk)
`NVL(a, b)`	Oracle	2	result type = type of `a`
`IFNULL(a, b)`	MySQL, BigQuery	2	result type = standard precedence

Worked example. SQL Server ISNULL truncation gotcha — ISNULL(varchar(3), 'longer string') truncates the second argument to 3 characters.

query	result	gotcha
`ISNULL(CAST('hi' AS VARCHAR(3)), 'goodbye')`	`'goo'`	truncated to len-3
`COALESCE(CAST('hi' AS VARCHAR(3)), 'goodbye')`	`'goodbye'`	no truncation

Worked-example solution.

-- SQL Server-only (avoid in portable code)
SELECT ISNULL(work_email, personal_email) FROM users;

-- Oracle-only
SELECT NVL(work_email, personal_email) FROM users;

-- MySQL / BigQuery
SELECT IFNULL(work_email, personal_email) FROM users;

-- Standard SQL — runs everywhere
SELECT COALESCE(work_email, personal_email) FROM users;

Rule of thumb: unless you are maintaining an existing single-dialect codebase that already uses the cousin, prefer COALESCE; the cousins offer no benefit and lock you into one engine.

`CASE WHEN ... IS NOT NULL THEN ... ELSE ... END` — when logic is not "first non-NULL"

The CASE invariant: CASE is the general-purpose conditional in SQL; it handles any boolean predicate, not just IS NOT NULL; reach for CASE when the branches involve ranges, flags, transformations, or any logic richer than "pick the first non-NULL".

Two-argument CASE equivalent to COALESCE(a, b) — verbose but explicit.
Range branches — CASE WHEN amount > 1000 THEN 'large' WHEN amount > 100 THEN 'medium' ELSE 'small' END.
Multiple-condition logic — combine IS NULL, comparisons, and string predicates.
Per-branch transformations — different SQL functions per branch.

Worked example. Categorize order amounts; COALESCE doesn't apply because the logic is range-based.

amount	category
1500	large
250	medium
50	small
NULL	unknown

Worked-example solution.

SELECT order_id,
       CASE
         WHEN amount IS NULL THEN 'unknown'
         WHEN amount > 1000 THEN 'large'
         WHEN amount > 100  THEN 'medium'
         ELSE 'small'
       END AS amount_category
FROM orders;
-- COALESCE cannot express ranges; CASE is the right tool

Rule of thumb: if your logic is "first non-NULL," use COALESCE; if your logic is anything else, use CASE; if you find yourself nesting COALESCE with CASE, the CASE alone is usually clearer.

Common beginner mistakes

Using ISNULL in PostgreSQL — ISNULL(a, b) is not a function in PostgreSQL; it's a keyword for IS NULL. Parse error.
Using NVL outside Oracle — non-portable; COALESCE instead.
Hitting the SQL Server ISNULL truncation gotcha — use COALESCE for safe type unification.
Reaching for CASE when COALESCE is shorter — verbose and harder to review.
Reaching for COALESCE when the logic is not first-non-NULL — pick CASE.

Worked Problem on Picking the Right NULL Construct

Given a users table with three nullable columns — work_email, personal_email, and phone — return a column effective_contact that is the first non-NULL of the three, prefixed by 'email: ' for emails and 'phone: ' for phone numbers, falling back to 'no contact'.

Solution Using `COALESCE` inside a `CASE`

SELECT
    user_id,
    CASE
        WHEN work_email IS NOT NULL OR personal_email IS NOT NULL
            THEN 'email: ' || COALESCE(work_email, personal_email)
        WHEN phone IS NOT NULL
            THEN 'phone: ' || phone
        ELSE 'no contact'
    END AS effective_contact
FROM users;

Why this works: the outer CASE handles the per-channel transformation (different prefix for email vs phone), which COALESCE alone cannot express; the inner COALESCE(work_email, personal_email) collapses the two email candidates with first-non-NULL priority, then the prefix concatenation runs only when at least one email is present; the WHEN phone IS NOT NULL branch handles the secondary channel; the ELSE 'no contact' is the guaranteed fallback. This is the canonical "use COALESCE for null pickers, CASE for everything else" idiom.

Step-by-step trace for four sample rows:

user_id	work_email	personal_email	phone	effective_contact
1	`a@w.com`	`a@h.com`	`5551111`	`email: a@w.com`
2	`NULL`	`b@h.com`	`5552222`	`email: b@h.com`
3	`NULL`	`NULL`	`5553333`	`phone: 5553333`
4	`NULL`	`NULL`	`NULL`	`no contact`

Row 1 — work_email non-null → first WHEN matches → email: a@w.com.
Row 2 — work_email null but personal_email non-null → first WHEN matches → COALESCE returns b@h.com → email: b@h.com.
Row 3 — both emails null → second WHEN matches → phone: 5553333.
Row 4 — all three null → ELSE → no contact.

Output:

user_id	effective_contact
1	email: a@w.com
2	email: b@h.com
3	phone: 5553333
4	no contact

Why this works — concept by concept:

COALESCE for null-picker — collapses work_email/personal_email into one value with first-non-NULL priority; clean, standard SQL.
CASE for per-branch transformation — different string prefixes per channel; COALESCE cannot express this because every branch needs a different formatting rule.
Composition — COALESCE nested inside CASE is idiomatic; the outer CASE switches behavior, the inner COALESCE resolves null preference.
Guaranteed-non-NULL ELSE — the ELSE 'no contact' ensures the output column is never NULL regardless of input.
O(N) time / O(1) space — single linear pass; per-row constant work for both the CASE and the COALESCE.

Inline CTA: More SQL CASE-when problems and conditional-logic problems for the dialect-portability muscle.

SQL
Topic — case when
SQL CASE-when problems

Practice →

SQL
Topic — conditional logic
SQL conditional-logic problems

Practice →

SQL
Topic — null handling
SQL null-handling problems

Practice →

4. `COALESCE` Pitfalls — NULL Semantics, Type Coercion, Empty Strings, and `NULLIF`

Avoiding silent semantic bugs in COALESCE-heavy SQL for data engineering

"Where does COALESCE quietly produce wrong answers?" is the question senior interviewers probe to separate fluent candidates from rote memorizers. The mental model: COALESCE is purely mechanical — first non-NULL — but the meaning of NULL is not; defaulting NULL to 0 for a financial KPI changes the metric semantics; empty string '' is not NULL so COALESCE skips it; type mixing without CAST is a coercion landmine; combining COALESCE with NULLIF produces the "empty-or-null → default" idiom. Drill the four pitfalls below and you'll never ship a COALESCE bug to production.

Pro tip: State the NULL semantics out loud before writing the SQL. If NULL means "unknown," replacing it with 0 may turn a missing data point into a real-looking zero. The interviewer wants to hear "I'm coalescing to 0 because the metric definition says missing rentals count as zero rentals" — not "I added COALESCE to make the dashboard not show NULL."

`NULL` ≠ `0` — semantic awareness for KPIs and metrics

The semantic invariant: NULL typically means "unknown"; 0 is a concrete numeric fact; replacing NULL with 0 in display code is usually fine but in calculation code can fabricate facts. Treat the substitution as a documented business decision, not a code-style preference.

Display — COALESCE(amount, 0) is fine in dashboards (the user sees 0, knows it's a zero).
Aggregate — AVG(COALESCE(amount, 0)) changes the average semantically; AVG ignores NULL natively, so coalescing first lowers the average.
Math — COALESCE(qty, 0) * price may be wrong if qty IS NULL means "we don't know how much was ordered."
Documented sentinel — for metrics, prefer carrying the NULL and handling it at the report layer.

Worked example. Average of [100, 200, NULL].

approach	computation	average
`AVG(amount)`	`(100+200)/2`	150
`AVG(COALESCE(amount, 0))`	`(100+200+0)/3`	100

Worked-example solution.

-- Choose intentionally; both are valid for different metric definitions
SELECT AVG(amount)                       AS avg_known_only,
       AVG(COALESCE(amount, 0))          AS avg_with_nulls_as_zero,
       COUNT(*)                          AS total_rows,
       COUNT(amount)                     AS non_null_rows
FROM orders;

Rule of thumb: the NULL-vs-0 decision is a metric-definition decision, not a coding style; document it inline.

Empty strings are not `NULL` — combine with `NULLIF`

The empty-string invariant: in PostgreSQL and most engines, '' and NULL are different; COALESCE(col, 'default') returns '' when col = '' (because '' is non-NULL); to treat empty-as-null, wrap with NULLIF(col, '') first.

NULLIF(a, b) — returns NULL when a = b, else returns a.
COALESCE(NULLIF(col, ''), 'default') — the canonical "empty-or-null → default" idiom.
Oracle quirk — Oracle's varchar treats '' AS NULL; not portable.
MySQL behavior — '' is not NULL; same as PostgreSQL.

Worked example. Three rows: real value, empty string, real NULL.

input	`COALESCE(col, 'X')`	`COALESCE(NULLIF(col, ''), 'X')`
`'hello'`	`'hello'`	`'hello'`
`''`	`''`	`'X'`
`NULL`	`'X'`	`'X'`

Worked-example solution.

SELECT raw_value,
       COALESCE(raw_value, 'X')                AS naive_default,
       COALESCE(NULLIF(raw_value, ''), 'X')    AS empty_or_null_default
FROM samples;

Rule of thumb: whenever a string column might be empty AND COALESCE is the right tool, wrap with NULLIF(col, '') to handle both cases.

Type-coercion landmines and explicit `CAST`

The type-coercion invariant: COALESCE returns a single unified type; mixing incompatible types is a parse error in strict engines (PostgreSQL) and a silent coercion in lenient ones (MySQL); CAST keeps intent explicit and review-friendly.

COALESCE(int_col, 'unknown') — error in PostgreSQL.
COALESCE(CAST(int_col AS TEXT), 'unknown') — explicit, portable.
COALESCE(numeric_col, 0) — NUMERIC and INTEGER unify cleanly.
COALESCE(date_col, '1900-01-01'::DATE) — explicit ::DATE cast on the literal.

Worked example. Mixing INTEGER and TEXT cleanly via CAST.

input	naive	with CAST
`qty = 5`	error	`'5'`
`qty = NULL`	error	`'unknown'`

Worked-example solution.

SELECT order_id,
       COALESCE(CAST(qty AS TEXT), 'unknown') AS qty_label
FROM orders;

Rule of thumb: whenever COALESCE mixes families (numeric + string, date + string), insert an explicit CAST on whichever argument needs it; never trust implicit coercion to do the right thing.

Common beginner mistakes

Defaulting NULL to 0 without considering the metric semantics — fabricates zeros into "unknown" rows.
Using COALESCE(col, 'default') for empty strings — '' is not NULL; wrap with NULLIF first.
Mixing INTEGER and TEXT without CAST — parse error in PostgreSQL.
Using COALESCE for string concatenation — that's STRING_AGG / LISTAGG's job.
Forgetting that AVG ignores NULL natively — coalescing first changes the average.

Worked Problem on Empty-or-NULL Default with NULLIF

Given a customers table with a display_name column that is sometimes NULL, sometimes '' (empty string), and sometimes a real value, return a column effective_name that is the real value when present, the literal 'Anonymous' when the input is NULL or empty.

Solution Using `COALESCE(NULLIF(display_name, ''), 'Anonymous')`

SELECT
    customer_id,
    COALESCE(NULLIF(display_name, ''), 'Anonymous') AS effective_name
FROM customers;

Why this works: NULLIF(display_name, '') returns NULL when display_name equals '', else returns display_name unchanged; COALESCE(..., 'Anonymous') then substitutes the literal 'Anonymous' for NULL (which now covers both the original NULL and the converted ''). The composition handles both edge cases in one expression.

Step-by-step trace for four sample rows:

customer_id	display_name	NULLIF(display_name, '')	COALESCE	effective_name
1	`'Alice Lee'`	`'Alice Lee'`	`'Alice Lee'`	`Alice Lee`
2	`''`	`NULL`	`'Anonymous'`	`Anonymous`
3	`NULL`	`NULL`	`'Anonymous'`	`Anonymous`
4	`'Bob B.'`	`'Bob B.'`	`'Bob B.'`	`Bob B.`

Row 1 — real value passes through NULLIF unchanged → COALESCE short-circuits.
Row 2 — empty string converted to NULL by NULLIF → COALESCE substitutes 'Anonymous'.
Row 3 — original NULL passes through NULLIF (NULL = '' evaluates to NULL, not true) → COALESCE substitutes.
Row 4 — real value passes through unchanged.

Output:

customer_id	effective_name
1	Alice Lee
2	Anonymous
3	Anonymous
4	Bob B.

Why this works — concept by concept:

NULLIF(a, b) semantics — returns NULL when a = b, else returns a; converts '' to NULL so COALESCE can treat both as "missing."
Composition with COALESCE — the inner NULLIF normalizes the input, the outer COALESCE applies the default; reads exactly like the business rule "empty-or-null → 'Anonymous'."
No CASE needed — CASE WHEN display_name IS NULL OR display_name = '' THEN 'Anonymous' ELSE display_name END is verbose; the two-function composition is cleaner and equivalent.
Single-column output — effective_name is non-NULL and non-empty by construction; downstream code can treat it as a clean string.
O(N) time / O(1) space — one linear scan; per-row constant work for the NULLIF + COALESCE composition.

Inline CTA: Drill more SQL null-handling problems on PipeCode.

SQL
Topic — null handling
SQL null-handling problems

Practice →

SQL
Topic — conditional logic
SQL conditional-logic problems

Practice →

SQL
Topic — filtering
SQL filtering problems

Practice →

Tips to master COALESCE in SQL

State the rule out loud before writing the query

Every interview answer that touches NULL should open with "left-to-right, first non-NULL, returns NULL only if every argument is NULL." That single sentence demonstrates fluency. Candidates who write COALESCE in the query without naming the rule are graded as memorizers; candidates who name the rule and then write the query are graded as fluent. The rule takes 7 seconds to say; it's worth a level on the interviewer's rubric.

Drill the four primitives

The four primitives in this guide map directly to the surface area COALESCE covers in production: left-to-right evaluation with short-circuit semantics (the rule itself plus the side-effect-safety contract), LEFT JOIN + COALESCE(right_col, default) (the canonical "fact + dim with safe defaults" pattern that drives every BI dashboard), COALESCE vs CASE / ISNULL / NVL / IFNULL (the dialect portability matrix), and the pitfall set (NULL ≠ 0, empty string ≠ NULL, type coercion, NULLIF composition).

Pick PostgreSQL-flavored answers in interviews

Most coding-environment interviews — DataLemur, StrataScratch, LeetCode SQL, CoderPad live coding — default to PostgreSQL. Drill PostgreSQL syntax: COALESCE (not ISNULL), EXTRACT(MONTH FROM ts) (not MONTH(ts)), INTERVAL '30 days' (not DATEADD), || for string concatenation (not +), ::TYPE for casts (not CONVERT(TYPE, ...)).

Combine `COALESCE` with `NULLIF` for empty-or-null

The COALESCE(NULLIF(col, ''), 'default') idiom is the canonical "treat empty string as null and fall back to a default" pattern. Memorize it. PostgreSQL keeps '' and NULL distinct; without the inner NULLIF, COALESCE returns the empty string unchanged. Naming this composition unprompted in interviews is a strong product-fluency signal.

Choose `CASE` when the logic is not "first non-NULL"

COALESCE is for ordered fallbacks. CASE is for everything else — ranges, flags, transformations, branch-specific formatting. If you find yourself writing nested COALESCE calls or COALESCE inside CASE to express conditional logic, the pure CASE is usually clearer. The decision tree: "first non-NULL?" → COALESCE; "anything richer?" → CASE.

Default thoughtfully — `NULL` carries information

A NULL in a financial KPI is information ("we don't know"); replacing it with 0 is a documented business decision. State the decision in the SQL comment or in the surrounding pull-request description. A senior interviewer wants to hear "I'm coalescing to 0 because the metric definition says missing rentals count as zero" — not "I added COALESCE so the dashboard wouldn't show NULL."

Where to practice on PipeCode

Start with the SQL null-handling practice page for the curated set of COALESCE-style problems. After that, drill the matching topic pages: conditional logic, CASE when, joins, aggregation, filtering, window functions, CTE. The SQL for data engineering interviews course bundles structured curricula. For broader coverage, browse all practice problems or pivot to peer guides — the Airbnb DE interview guide, the top DE interview questions 2026 blog, and the SQL data types Postgres guide.

Communication and approach under time pressure

Talk through the rule first ("left-to-right, first non-NULL"), the obvious application second ("here that means prefer work_email then personal_email then a literal default"), and the edge cases third ("if every column is NULL, the literal kicks in; if NULL could mean 0 semantically, I'd think twice about defaulting to 0"). Interviewers grade process as much as the final query. Leave 30 seconds for a sweep: empty string vs NULL, type mixing, side-effect safety on later arguments, the LEFT JOIN-becomes-INNER trap if a WHERE filters the right table.

Frequently Asked Questions

Is `COALESCE` the same as `IFNULL`?

Not exactly. IFNULL (MySQL, BigQuery) and ISNULL (SQL Server) are usually two-argument forms. COALESCE accepts many arguments and is part of the SQL:1992 standard, which is why teams prefer it in portable analytics code. PipeCode's SQL practice is PostgreSQL-oriented, so canonical solutions use COALESCE everywhere. If you're working in a MySQL-only codebase that already uses IFNULL, you can keep it; if you're writing portable code or migrating between engines, switch to COALESCE.

Does `COALESCE` skip evaluating later arguments?

Most major databases short-circuit once they find a non-NULL value — PostgreSQL, MySQL, SQL Server, Oracle, and Snowflake all document this behavior. That's useful for performance, but the SQL standard does not require short-circuit, and you should not rely on side effects in later arguments (volatile functions, RAISE statements, sequence calls). In interviews, saying "typically short-circuits, but I would not depend on side effects" is a strong answer.

Can I nest `COALESCE`?

Yes — COALESCE(a, b, COALESCE(c, d)) works — but a flat list is easier to read and review. Prefer COALESCE(a, b, c, d) when all fallbacks are simple columns or literals. The only legitimate reason to nest is when one fallback itself depends on another expression that needs its own fallback chain.

When should I use `CASE` instead of `COALESCE`?

Use CASE when the logic is not "first non-NULL" — for example, ranges (amount > 1000 → 'large'), flags (is_premium → premium pricing), or different transformations per branch. COALESCE stays the right default when you only need ordered fallbacks. For more conditional patterns, browse conditional-logic SQL on PipeCode and CASE-when problems.

How do I handle empty strings vs `NULL` together?

Use the COALESCE(NULLIF(col, ''), 'default') composition. NULLIF(col, '') returns NULL when col is empty (and returns col unchanged otherwise); the outer COALESCE then treats both the original NULL and the converted-empty as missing and applies the default. This is the canonical PostgreSQL idiom for "missing or blank → default" — drill it; it shows up at every interview that touches dirty user-input data.

What's the right default value for `COALESCE`?

The right default depends on the metric definition, not on coding style. For displays and dashboards, 0 for numerics and 'NONE' / 'Unknown' for strings are common. For aggregates, COALESCE(SUM(x), 0) keeps empty groups visible in reports. For financial or scientific metrics where NULL means "unknown," carry the NULL and handle it at the report layer instead of replacing it. Document the decision inline so the next reviewer knows it was intentional.

Top COALESCE concepts and SQL patterns

1. COALESCE Evaluation Order and Basic Fallback Patterns

Left-to-right first-non-NULL evaluation in SQL for data engineering

Syntax: COALESCE(expr1, expr2, ..., exprN)

Short-circuit evaluation and side-effect safety

Type precedence and CAST for unambiguous types

Common beginner mistakes

Worked Problem on COALESCE Evaluation Order

Solution Using COALESCE with a literal final fallback

2. COALESCE with LEFT JOIN for Default Values in Analytics and BI

COALESCE for outer-join NULL handling in SQL for data engineering

LEFT JOIN + COALESCE(right_col, default) for fact-dim joins

"First non-blank across sources" via chained COALESCE

COALESCE(SUM(x), 0) for grouped aggregates with empty groups

Common beginner mistakes

Worked Problem on LEFT JOIN with COALESCE Default

Solution Using LEFT JOIN + COALESCE(p.promo_code, 'NONE')

3. COALESCE vs CASE, ISNULL, and NVL — Portability and When to Pick Which

Standard SQL portability versus dialect-specific NULL helpers in SQL for data engineering

COALESCE (standard SQL) — the portable default

ISNULL (SQL Server), NVL (Oracle), IFNULL (MySQL) — dialect 2-arg cousins

CASE WHEN ... IS NOT NULL THEN ... ELSE ... END — when logic is not "first non-NULL"

Common beginner mistakes

Worked Problem on Picking the Right NULL Construct

Solution Using COALESCE inside a CASE

4. COALESCE Pitfalls — NULL Semantics, Type Coercion, Empty Strings, and NULLIF

Avoiding silent semantic bugs in COALESCE-heavy SQL for data engineering

NULL ≠ 0 — semantic awareness for KPIs and metrics

Empty strings are not NULL — combine with NULLIF

Type-coercion landmines and explicit CAST

Common beginner mistakes

Worked Problem on Empty-or-NULL Default with NULLIF

Solution Using COALESCE(NULLIF(display_name, ''), 'Anonymous')

Tips to master COALESCE in SQL

State the rule out loud before writing the query

Drill the four primitives

Pick PostgreSQL-flavored answers in interviews

Combine COALESCE with NULLIF for empty-or-null

Choose CASE when the logic is not "first non-NULL"

Default thoughtfully — NULL carries information

Where to practice on PipeCode

Communication and approach under time pressure

Frequently Asked Questions

Is COALESCE the same as IFNULL?

Does COALESCE skip evaluating later arguments?

Can I nest COALESCE?

When should I use CASE instead of COALESCE?

How do I handle empty strings vs NULL together?

What's the right default value for COALESCE?

Start practicing SQL COALESCE problems

1. `COALESCE` Evaluation Order and Basic Fallback Patterns

Syntax: `COALESCE(expr1, expr2, ..., exprN)`

Type precedence and `CAST` for unambiguous types

Solution Using `COALESCE` with a literal final fallback

2. `COALESCE` with `LEFT JOIN` for Default Values in Analytics and BI

`LEFT JOIN` + `COALESCE(right_col, default)` for fact-dim joins

"First non-blank across sources" via chained `COALESCE`

`COALESCE(SUM(x), 0)` for grouped aggregates with empty groups

Solution Using `LEFT JOIN` + `COALESCE(p.promo_code, 'NONE')`

3. `COALESCE` vs `CASE`, `ISNULL`, and `NVL` — Portability and When to Pick Which

`COALESCE` (standard SQL) — the portable default

`ISNULL` (SQL Server), `NVL` (Oracle), `IFNULL` (MySQL) — dialect 2-arg cousins

`CASE WHEN ... IS NOT NULL THEN ... ELSE ... END` — when logic is not "first non-NULL"

Solution Using `COALESCE` inside a `CASE`

4. `COALESCE` Pitfalls — NULL Semantics, Type Coercion, Empty Strings, and `NULLIF`

`NULL` ≠ `0` — semantic awareness for KPIs and metrics

Empty strings are not `NULL` — combine with `NULLIF`

Type-coercion landmines and explicit `CAST`

Solution Using `COALESCE(NULLIF(display_name, ''), 'Anonymous')`

Combine `COALESCE` with `NULLIF` for empty-or-null

Choose `CASE` when the logic is not "first non-NULL"

Default thoughtfully — `NULL` carries information

Is `COALESCE` the same as `IFNULL`?

Does `COALESCE` skip evaluating later arguments?

Can I nest `COALESCE`?

When should I use `CASE` instead of `COALESCE`?

How do I handle empty strings vs `NULL` together?

What's the right default value for `COALESCE`?