SQL for developers is how you read and write the systems you already ship — user accounts, orders, feature flags, observability tables. Backend engineers, full-stack builders, and data engineers share the same primitives: relational tables, stable keys, honest JOIN semantics around NULL, explicit grain for aggregates, and ACID discipline when concurrency hits.
What follows mirrors how teams onboard ICs — schema literacy, guarded CRUD, predicate hygiene, joins without silent row multiplication, GROUP BY / HAVING versus window analytics, then indexes, transactions, and EXPLAIN as your debugging lingua franca — every numbered skill block ends like sql interview questions with answers: runnable Postgres SQL, traced execution, and a terse why. After the hero art, dive straight into reps when you crave keyboard time:
Browse practice hub →, open SQL practice →, deepen joins →, reinforce filters →, or widen with database fundamentals →.
On this page
- Why SQL matters for developers and data engineers
- Tables, keys, and the shape of relational data
- Reading and writing rows — SELECT, INSERT, UPDATE, DELETE safely
- Filtering, NULLs, sorting, and LIMIT
- Joins — INNER, LEFT, and when rows multiply
- GROUP BY, HAVING, and analytics-style windows
- Indexes, transactions, ACID, and EXPLAIN-aware debugging
- Choosing SQL skills for your stack (checklist)
- Frequently asked questions
- Practice on PipeCode
1. Why SQL matters for developers and data engineers
The relational contract hiding behind every HTTP handler
Invariant: SQL is the persisted half of almost every SaaS workload — signup rows, entitlement tables, payout ledgers. SQL-for-developers fluency separates engineers who prototype quickly from teammates who can confidently answer "why did this counter disagree with finance?" without escalating.
Detailed explanation. Application tiers think in verbs (POST, PATCH, enqueue) while relational engines expose predicates, constraints, joins, and transactions. When those layers disagree, outages look like phantom bugs but read as inconsistent reads, missing indexes, duplicate grain, or unscoped transactions underneath.
Pro tip: Name grain aloud — exactly one row means one _______ — before you accept a `JOIN` plan or BI metric.
Why backends, analytics, and SRE converge on SQL
Everybody eventually asks Postgres the same primitive questions: correlations, aggregates, cardinality checks. Showing up fluent collapses Slack threads into scripts you can rerun, diff, commit to sql/ snippets, and instrument in CI smoke tests — even if warehouses later optimise the OLAP clones.
Worked example.
| persona | recurrent question type | payoff from SQL literacy |
|---|---|---|
| backend | `UPDATE` fan-out / orphaned FK rows | deterministic migrations |
| DS / analytics | reproducible cohort filters | parameterized queries |
| SRE | blast-radius queries during incidents | `EXPLAIN`-aware rollbacks |
Step-by-step.
- Incident alert references revenue drift — replicate via a `JOIN` spanning `payments` ⇄ `refunds`.
- Count rows twice — once naive, once with explicit `DISTINCT grain_key` assertions.
- Validate indexes cover `WHERE` + `JOIN` predicates in staging before prod deploy.
- Document literal-free SQL snippets for analytics parity.
OLTP versus analytic workloads
| workload | emblematic SQL | optimise for |
|---|---|---|
| OLTP (`INSERT`, `UPDATE` critical rows) | `UPDATE … WHERE id = $1 RETURNING …` | short latch-friendly txn windows |
| Warehouse / OLAP | `SUM(x) GROUP BY day` (+ optional windows) | columnar parallelism, partition pruning |
| Replica analytics bridging both | parameterized slice queries | reproducible predicates + timeouts |
Detailed explanation. Mixed-mode Postgres instances still differentiate by predicate selectivity, txn duration, and hardware headroom. Front-line CRUD hates long reads holding locks; heavyweight BI tolerates eventual freshness but demands honest scan plans.
Common beginner traps
- Mirroring spreadsheets — unstructured columns creep into DDL without reviewers.
- Trusting dashboards over source tables — BI layers often coerce grain silently.
- ORM-only troubleshooting — you still need emitted SQL logs to spot hidden `N+1` joins.
- Ignoring isolation trade-offs — `READ COMMITTED` anomalies appear only under concurrency realism.
2. Tables, keys, and the shape of relational data
Tables are unordered multisets guarded by declarative constraints
Invariant: a row-oriented table expresses one record type; keys pin identity while foreign keys articulate relationships declaratively rather than scattering pointer logic exclusively in Ruby/Java services.
Primary versus natural identifiers
Natural keys mirror business artefacts (ISO country code). Surrogate IDs (BIGSERIAL) stay stable across refactors yet carry no domain meaning alone. Postgres typically combines both: surrogate PK for joins, UNIQUE(email) to enforce human-facing identity.
Worked example.
| design | advantage | caveat |
|---|---|---|
| `BIGSERIAL PRIMARY KEY` | immutable join edges | meaningless for humans |
| `UNIQUE(email)` | legible dashboards | brittle if mergers rename emails |
Worked-example DDL commentary. BIGSERIAL auto-sequence ensures monotonic surrogates with cheap index inserts. NOT NULL on email encodes onboarding invariants Postgres enforces deterministically versus optional app validation.
```sql
CREATE TABLE users (
  id         BIGSERIAL PRIMARY KEY,
  email      TEXT NOT NULL UNIQUE,
  city       TEXT,
  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

CREATE TABLE orders (
  order_id  BIGSERIAL PRIMARY KEY,
  user_id   BIGINT NOT NULL REFERENCES users(id) ON DELETE RESTRICT,
  total_usd NUMERIC(14,2) NOT NULL CHECK (total_usd >= 0),
  placed_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

CREATE INDEX orders_user_idx ON orders (user_id);
```
Foreign-key delete semantics matter in production churn
The difference between ON DELETE CASCADE (deletes ripple downstream) and ON DELETE RESTRICT (the delete is blocked while child rows still reference the parent) dictates safe admin tooling workflows. Prefer explicit policies over implicit defaults guessed during incidents.
```sql
-- Example: disallow deleting buyers with unpaid orders lingering
ALTER TABLE orders
  DROP CONSTRAINT orders_user_id_fkey,
  ADD CONSTRAINT orders_user_id_fkey
    FOREIGN KEY (user_id)
    REFERENCES users(id)
    ON DELETE RESTRICT;
```
Step-by-step.
- Model parent (`users`) and child (`orders`) cardinality first.
- Select a delete policy matching business law — finance rarely cascades blindly.
- Index child FK columns (`CREATE INDEX` on `orders.user_id`).
- Add `CHECK` constraints early — cheaper than patching corrupt rows later.
Common beginner mistakes
- Deferring FK creation → silent orphan rows creep under concurrent writers.
- Currency as floats — use `NUMERIC(p,s)` for monetary columns.
- Timestamp without zone interpreted as UTC — prefer `TIMESTAMPTZ` plus an explicit TZ policy.
- Overloading JSONB exclusively — structured columns keep optimiser hints honest.
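The float trap above is quick to demonstrate in a scratch psql session (a minimal sketch; the exact float rendering depends on server version and `extra_float_digits`):

```sql
-- float8 arithmetic accumulates binary rounding error; NUMERIC stays exact
SELECT 0.1::float8  + 0.2::float8  AS float_sum,   -- typically 0.30000000000000004
       0.1::numeric + 0.2::numeric AS exact_sum;   -- 0.3
```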
Quick integrity checklist
| check | Postgres hook |
|---|---|
| uniqueness | `UNIQUE`, partial unique indexes |
| domain logic | `CHECK`, domain types |
| referential coupling | declarative FK + chosen `ON DELETE` |
| auditing | triggers or append-only ledger tables |
Rule of thumb: if two services disagree on cardinality, converge on DDL truth before arguing in Slack.
3. Reading and writing rows — SELECT, INSERT, UPDATE, DELETE safely
Every writer path needs identity — explicit filters and returning projections
Invariant: INSERT, UPDATE, DELETE must name which rows mutate (WHERE), prefer parameters ($1) over interpolated strings, and return affected rows (RETURNING) when callers need confirmations without another round-trip.
Transactions wrap multi-leg business truths
Transfers, swaps, and entitlement downgrades seldom satisfy business law by touching one isolated row. Wrap the coordinated statements (BEGIN … COMMIT) so partial states never linger.
```sql
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
COMMIT;
```
Why this works:
- `BEGIN` opens a snapshot / locking context per the chosen isolation level.
- Two ordered `UPDATE`s express money conservation — auditors expect the atomicity tie.
- `COMMIT` publishes both deltas together; `ROLLBACK;` rewinds catastrophes.
- Cost — transactional overhead is negligible versus financial-inconsistency fallout.
Parameterized predicates prevent injection and cache reuse
Never splice user strings manually — placeholders keep plans cacheable and thwart SQL injection.
```sql
DELETE FROM sessions
WHERE user_id = $1
  AND expires_at < NOW();
```
INSERT patterns developers lean on daily
Bulk ingest + upsert choreography appears constantly — understand both single-row ergonomics (RETURNING) and batch paths.
```sql
INSERT INTO users (email, city)
VALUES ('dev@corp.com', 'Chennai')
RETURNING id, created_at;

INSERT INTO audit_log(event, payload)
VALUES
  ('password_reset', '{"user_id": 42}'::jsonb),
  ('mfa_challenge', '{"user_id": 42}'::jsonb);
```
Worked example — idempotent onboarding upserts (ON CONFLICT).
```sql
INSERT INTO user_profiles (user_id, display_name)
VALUES ($1, $2)
ON CONFLICT (user_id)
DO UPDATE SET display_name = EXCLUDED.display_name,
              updated_at = NOW();
```
Detailed explanation.
- Conflicting rows recycle through the `EXCLUDED` pseudo-table exposing the proposed values.
- Pair with partial unique indexes for conditional uniqueness scenarios (invite tokens, soft deletes).
Safe UPDATE hygiene checklist
| step | rationale |
|---|---|
| dry-run SELECT clone | verifies row cardinality before mutation |
| `LIMIT` + key filter | avoids blanket table rewrite |
| `RETURNING` auditing | emits affected-row diffs programmatically |
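The checklist in sequence, sketched against the earlier `sessions` cleanup; the `revoked` column is hypothetical:

```sql
-- Step 1: dry-run SELECT clone to confirm the blast radius
SELECT COUNT(*)
FROM sessions
WHERE user_id = $1 AND expires_at < NOW();

-- Step 2: mutate with the identical predicate, emitting the rows touched
UPDATE sessions
SET revoked = TRUE              -- hypothetical audit column
WHERE user_id = $1 AND expires_at < NOW()
RETURNING id, expires_at;
```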
Rule of thumb: if deleting “temp junk,” still scope by timestamp + TTL — surprise full-table wipes bankrupt trust faster than slow queries.
4. Filtering, NULLs, sorting, and LIMIT
Logical evaluation order is not top-to-bottom textual order
Invariant: Engines conceptually apply clauses in this order — FROM → JOIN → WHERE → GROUP BY → HAVING → windowing → SELECT expressions → DISTINCT → ORDER BY → LIMIT/OFFSET. Textual SQL order differs deliberately; misplacing expectations causes “phantom” aggregates or illegal references.
Detailed explanation.
- Predicates in `WHERE` see raw row grain before collapsing via `GROUP BY`.
- `SELECT` aliases rarely appear inside `WHERE` (Postgres exceptions exist for subqueries, not shortcuts).
- `ORDER BY` runs after projection — you may sort computed expressions.
NULL is tri-valued logic (TRUE, FALSE, UNKNOWN)
Comparisons involving UNKNOWN ripple through compound predicates unpredictably unless you memorize De Morgan interactions with AND/OR.
```sql
SELECT * FROM users WHERE city IS NULL;  -- correct
SELECT * FROM users WHERE city = NULL;   -- always UNKNOWN → zero rows (pitfall)

SELECT *
FROM experiments
WHERE status IS DISTINCT FROM outcome;   -- treats NULL knowingly
```
Worked example.
| predicate | evaluates when city is NULL |
|---|---|
| `city = 'Chennai'` | UNKNOWN |
| `city IS NULL` | TRUE |
Worked-example pattern — COALESCE bridging optional columns.
```sql
SELECT COALESCE(city, 'unknown') AS city_label
FROM users;
```
Combine with NULLIF to coerce sentinel blanks into canonical NULL semantics.
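One common composition, a sketch assuming `city` may arrive as an empty or padded string:

```sql
-- NULLIF turns sentinel blanks into NULL; COALESCE then labels them
SELECT COALESCE(NULLIF(TRIM(city), ''), 'unknown') AS city_label
FROM users;
```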
Composing predicates responsibly
Prefer explicit parentheses when mixing AND/OR:
```sql
SELECT id, email
FROM users
WHERE (city IN ('Hyderabad', 'Chennai') OR vip IS TRUE)
  AND suspended IS FALSE;
```
ORDER BY, LIMIT:
```sql
SELECT id, email
FROM users
WHERE city IN ('Hyderabad', 'Chennai')
ORDER BY email ASC
LIMIT 25 OFFSET 50;
```
Detailed explanation.
- Large OFFSET values scan skipped rows wastefully — keyset pagination (`WHERE id > $cursor ORDER BY id LIMIT …`) is often cheaper at scale even if ergonomically heavier.
- Stable sorts duplicate tie-break columns (`ORDER BY created_at DESC, id DESC`) to defeat nondeterministic pages.
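The keyset alternative, sketched against the `users` table; `$1` carries the last `id` the client saw:

```sql
-- First page
SELECT id, email
FROM users
ORDER BY id
LIMIT 25;

-- Subsequent pages: seek past the cursor instead of counting an OFFSET
SELECT id, email
FROM users
WHERE id > $1           -- last id from the previous page
ORDER BY id
LIMIT 25;
```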
Common beginner traps
- `NOT IN (...)` collapses unexpectedly when the inner list harbors a `NULL` (the entire predicate becomes UNKNOWN).
- `BETWEEN` endpoints are inclusive, surprising folks expecting half-open intervals.
- `ILIKE '%foo%'` cannot exploit plain B-tree indexes without pg_trgm / expression indexes.
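The `BETWEEN` trap in date terms, a sketch against the earlier `orders` table:

```sql
-- Inclusive on both ends: a row stamped exactly '2024-02-01 00:00' lands in
-- this query AND in the next query that starts at '2024-02-01'
SELECT COUNT(*)
FROM orders
WHERE placed_at BETWEEN '2024-01-01' AND '2024-02-01';

-- Half-open interval: every row lands in exactly one month bucket
SELECT COUNT(*)
FROM orders
WHERE placed_at >= '2024-01-01'
  AND placed_at <  '2024-02-01';
```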
Optional pattern — existential filters with EXISTS
Prefer semijoins when probing presence without caring about multiplicity:
```sql
SELECT u.id
FROM users u
WHERE EXISTS (
  SELECT 1 FROM orders o WHERE o.user_id = u.id
);
```
Rule of thumb: when debugging filters, SELECT COUNT(*) before and after layering predicates — divergence isolates offending clause fast.
5. Joins — INNER, LEFT, and when rows multiply
Joins reshape cardinality — tame fan-out before aggregates
Invariant: JOIN combines row sets via predicates (typically equality on keys). When either side is one-to-many, the result repeats left rows once per matching right row unless you stabilize grain first.
Detailed explanation.
- `INNER JOIN` emits only pairs that satisfy the `ON` clause — unmatched rows on either side disappear.
- `LEFT [OUTER] JOIN` keeps every row from the left spine; unmatched right columns become `NULL` (sentinel absence, not "empty string").
- `FULL OUTER` is rarer in app code but useful when reconciling two feeds where either side might be orphaned.
```sql
-- Buyer activity: users who ordered at least once (one row per matching order pair)
SELECT u.name, o.order_id
FROM users u
JOIN orders o ON o.user_id = u.id;

-- Anti-join — users present on the LEFT with NO matching orders on the RIGHT
SELECT u.name
FROM users u
LEFT JOIN orders o ON o.user_id = u.id
WHERE o.order_id IS NULL;
```
INNER vs LEFT in one rehearsal table
| join | survives without match on opposite side | read it as |
|---|---|---|
| `INNER` | no | "pairs only — inner intersection" |
| `LEFT` | yes (left survives) | "keep cohort A, optionally attach B" |
Fan-out rehearsal (why COUNT(*) lied)
| spine | facts | INNER join rows if 3 matching orders |
|---|---|---|
| 1 Alice | 3 orders for Alice | 3 rows named Alice |
`COUNT(*)` and `SUM(amount)` downstream now count result rows, not necessarily distinct users. Fix upstream with distinct keys, sub-aggregates, or semi-joins (`EXISTS` / `IN`) depending on semantics.
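Two fan-out-safe rewrites, sketched against the same `users` ⇄ `orders` pair:

```sql
-- Count distinct users, not join rows
SELECT COUNT(DISTINCT u.id) AS buyers
FROM users u
JOIN orders o ON o.user_id = u.id;

-- Or pre-aggregate the many side first, restoring one-row-per-user grain
SELECT u.id, COALESCE(t.order_count, 0) AS order_count
FROM users u
LEFT JOIN (
  SELECT user_id, COUNT(*) AS order_count
  FROM orders
  GROUP BY user_id
) t ON t.user_id = u.id;
```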
Developer SQL interview question — users who never ordered
Tables: users(id, name) and orders(order_id, user_id) with referential integrity. List dormant accounts (users who never placed an order), ordered deterministically.
Solution — LEFT JOIN anti-join
Code solution.
```sql
SELECT u.id, u.name
FROM users u
LEFT JOIN orders o ON o.user_id = u.id
WHERE o.order_id IS NULL
ORDER BY u.id;
```
Step-by-step trace.
| step | planner story |
|---|---|
| 1 | Build left spine — one output row candidate per user. |
| 2 | For each user, seek/probe matching orders on orders.user_id. |
| 3 | If no row qualifies, o.* becomes NULL, including surrogate order_id. |
| 4 | `WHERE o.order_id IS NULL` keeps only unmatched left rows (an anti-pattern if order_id could be NULL absent the FK — integrity prevents that here). |
Output:
| id | name |
|---|---|
| 404 | dormant |
Companion pattern — NOT EXISTS (duplicate-safe semi-join).
```sql
SELECT u.id, u.name
FROM users u
WHERE NOT EXISTS (
  SELECT 1 FROM orders o WHERE o.user_id = u.id
)
ORDER BY u.id;
```
Why this works — concept by concept:
- LEFT preservation — every user survives until filtered; the dormant cohort remains visible.
- NULL sentinel — with a real surrogate key on orders, a NULL `order_id` reliably means no attachment, not an ambiguous data void.
- NOT EXISTS — logically ignores duplicate orders per user (semijoin); no accidental multiplier from exploding the right side.
- Index leverage — a B-tree on `orders(user_id)` turns probes into seeks instead of scans.
- Cost — hash and merge-family joins tend toward Θ(n + m) with healthy selectivity versus nested-loop blowups on accidental cross products.
6. GROUP BY, HAVING, and analytics-style windows
Collapse grain or stay row-wise — aggregates vs windows split the job
Invariant: GROUP BY collapses rows into buckets, producing one output row per group after aggregation. OVER() keeps input grain — every source row survives while you decorate it with comparative metrics (rank, running sum).
Detailed explanation.
- After `JOIN`/`WHERE`, the relational engine optionally groups by the listed expressions. `SELECT` may reference either grouping keys or aggregate functions evaluated inside each bucket — stray base columns violate the contract unless functionally dependent (Postgres errs early; beware lenient dialects permitting hidden ambiguity).
- `HAVING` filters on post-aggregation predicates (`COUNT(*) > 10`), while `WHERE` trims rows before bucketing (it cannot reference `SELECT` aggregate aliases — repeat the aggregate or nest a subquery).
- Window frames (`ROWS`/`RANGE`/`GROUPS`) default per function; analytic ranking (`ROW_NUMBER`, `RANK`, `DENSE_RANK`) ignores the frame — they only need `PARTITION BY` + `ORDER BY`.
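A running-sum sketch against the earlier `orders` table; the explicit `ROWS` frame differs from the `RANGE` default only when `placed_at` ties exist:

```sql
-- Running revenue per user; row grain preserved, no GROUP BY collapse
SELECT user_id,
       placed_at,
       total_usd,
       SUM(total_usd) OVER (
         PARTITION BY user_id
         ORDER BY placed_at
         ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
       ) AS running_total
FROM orders;
```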
Stock patterns side by side
```sql
-- Collapse city grain — one row per city after aggregation
SELECT city, COUNT(*) AS n
FROM users
GROUP BY city
HAVING COUNT(*) > 5;

-- Decorate employee grain — ranking within department without collapsing rows
SELECT emp_id,
       dept_id,
       salary,
       ROW_NUMBER() OVER (PARTITION BY dept_id ORDER BY salary DESC) AS rn
FROM employees;
```
Dedup companion — deterministic keeper rows
Grouped counts diagnose violators; windows remediate duplicates while picking one canonical row (partition by the key, ORDER BY audit columns, then keep `WHERE rn = 1` for survivors or `rn > 1` for offenders):
```sql
WITH ranked AS (
  SELECT *,
         ROW_NUMBER() OVER (
           PARTITION BY email
           ORDER BY updated_at DESC NULLS LAST, id ASC
         ) AS rn
  FROM users
)
SELECT *
FROM ranked
WHERE rn > 1;  -- offenders to DELETE or archive in a batched migration
```
Developer SQL interview question — duplicate emails awaiting cleanup
List normalized emails appearing more than once with violation counts.
Solution — grouped counts
Code solution.
```sql
SELECT email, COUNT(*) AS dup_count
FROM users
GROUP BY email
HAVING COUNT(*) > 1
ORDER BY dup_count DESC, email;
```
Step-by-step trace.
| step | computation |
|---|---|
| 1 | Normalize upstream if needed (LOWER(TRIM(email)) in ingestion or projection). |
| 2 |
GROUP BY email emits one accumulator per distinct key. |
| 3 |
HAVING COUNT(*) > 1 drops singleton buckets — only violators survive. |
Output:
| email | dup_count |
|---|---|
| dev@corp | 3 |
Why this works — concept by concept:
- GROUP BY bucket — each email becomes its own multiset; `COUNT(*)` measures multiplicity after predicates.
- HAVING vs WHERE — `WHERE` cannot express `COUNT(*) > 1` without a subquery; `HAVING` runs on aggregated state.
- Operational pairing — follow with `ROW_NUMBER()` partitioned on the same dedupe key to choose survivors without arbitrary ties.
- Cost profile — hash aggregation is typically linear in row volume; spilled sorts add I/O proportional to `work_mem` pressure.
7. Indexes, transactions, ACID, and EXPLAIN-aware debugging
Indexes + planners + transactional contracts separate “queries” from “systems”
Detailed explanation.
- Default B-tree indexes accelerate equality / range predicates (`WHERE city = 'Hyderabad'` or `WHERE created_at BETWEEN …`) via seeks instead of sequential scans once selectivity warrants them.
- Indexes are derived state — every insert/update/delete touches matching index entries (write amplification); partial indexes prune maintenance when a predicate covers a cohort (`WHERE churned IS FALSE`).
- `CREATE INDEX CONCURRENTLY` avoids holding the write-blocking lock a plain build takes on hot Postgres tables — the trade-off is a longer build plus retry bookkeeping (dropping the invalid index) if it fails midway.
```sql
CREATE INDEX CONCURRENTLY idx_users_city ON users (city);

EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM users
WHERE city = 'Hyderabad';
```
Interview-grade EXPLAIN reading hints.
- Prefer `EXPLAIN (ANALYZE, BUFFERS)` for truth about buffers hit vs read, loop counts, and actual row counts — textual plans alone omit runtime skew.
- Watch estimated vs actual row mismatches (bad stats ⇒ wrong join order, surprise nested loops on big inner sides).
- `Seq Scan` is acceptable on narrow tables / cold caches; punitive when millions of qualifying rows funnel through a filter after the scan.
ACID anchors
| letter | mnemonic | what to say in-panel |
|---|---|---|
| A | Atomicity | all statements commit together or ROLLBACK rewinds observable effects |
| C | Consistency | constraints + declarative checks hold on commit boundaries |
| I | Isolation | levels trade phantom reads vs concurrency — READ COMMITTED default on Postgres exposes committed deltas each statement; REPEATABLE READ / MVCC snapshots tame drift for longer read transactions |
| D | Durability | WAL persists committed work across crashes |
Practical transaction hygiene
Declare explicit boundaries (BEGIN / COMMIT) when orchestrating multi-table invariants — ORMs emitting autocommit per statement fracture money-movement narratives. Serialization failures (SQLSTATE 40001) under SERIALIZABLE / snapshot conflicts signal retry opportunities, not “random database bugs.” Pair schema changes with CONCURRENT index creations and phased backfills whenever traffic tolerates eventual consistency stages.
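A sketch of an explicitly serializable transfer, reusing the earlier `accounts` example; the retry loop itself lives in application code:

```sql
BEGIN ISOLATION LEVEL SERIALIZABLE;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
COMMIT;
-- If any statement (including COMMIT) raises SQLSTATE 40001
-- (serialization_failure), the application rolls back and re-runs
-- the whole block instead of surfacing an error to the user.
```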
Choosing SQL skills for your stack (checklist)
| horizon | competency | Practice lane |
|---|---|---|
| Week 1 | DDL + FK stories | /explore/practice/topic/database → |
| Week 2 | predicates + paging | /explore/practice/topic/filtering/sql → |
| Week 3 | joins + cardinality | /explore/practice/topic/joins/sql → |
| Week 4 | aggregates/windows | /explore/practice/topic/aggregation/sql → |
Frequently asked questions
Which dialect first?
PostgreSQL is the pragmatic default — rich standard SQL surface, expressive JSONB, strong isolation story, ubiquitous in modern stacks, and interview panels often cite it explicitly. Treat SQLite as excellent for correctness drills and EXPLAIN QUERY PLAN intuition, MySQL where legacy hiring signals demand it — but defer dialect trivia until relational mechanics feel automatic.
LEFT JOIN … IS NULL versus NOT EXISTS?
Both express relational difference (anti-semijoins). LEFT JOIN + NULL filter is readable when order_id is a dependable surrogate and you want symmetry with exploratory outer joins. NOT EXISTS scales mentally when duplicates on the right explode intermediate row counts — multiplicity does not falsify existential absence. Mention both in reviews; pick based on cardinality and planner friendliness verified with EXPLAIN.
When do aggregates beat windows?
Collapse to metric tables or cohort summaries ⇒ GROUP BY + aggregates + HAVING. Preserve row grain while ranking/deduping/running deltas ⇒ OVER() partitions. Mixed patterns stack CTE layers: aggregate subtotals upstream, JOIN keyed aggregates back, then window across enriched rows once uniqueness returns.
Are indexes unequivocally good?
No — each index consumes storage, lengthens mutation paths (insert/update hotspots), and complicates migrations. Prefer narrow partial indexes, covering composites aligning with ORDER BY when read savings dominate, but baseline with metrics (p95 latency, churn on write-heavy queues) rather than speculative indexing folklore.
What proves seniority fastest?
Fluent EXPLAIN storytelling (buffer churn, mis-estimates), articulating isolation anomalies you have actually chased (lost updates, phantom reads), and disciplined schema evolution (locking, backfills, concurrency-safe DDL). Read alone plateaus — annotate past incidents with reproduction SQL and planner diffs during debrief loops.
Do ORMs replace raw SQL literacy?
ORMs scaffold CRUD ergonomics quickly, yet N+1, implicit transaction scopes, brittle migrations, deadlock graphs, and hot-index regressions still surface raw SQL realities. Seniors jump between declarative mappings and handwritten SQL confidently because production incidents rarely respect abstraction boundaries exclusively.
Practice on PipeCode
PipeCode ships 450+ interview-grade problems spanning SQL joins, aggregates, transactional logic, analytics windows, subqueries, and pragmatic schema debugging. Anchor on Explore practice →, escalate through language SQL →, grab sets like topic SQL → or subqueries →, and unlock plans → whenever you want unrestricted runs.