SQL Is Not Self-Documenting
A year from now, you won't remember why you wrote that LEFT JOIN with five conditions. Your colleagues definitely won't.
Good SQL is readable SQL.
Format Like You Mean It
-- Bad: good luck reading this at 9 AM
SELECT u.id,u.email,COUNT(o.id) as order_count,SUM(o.total) as lifetime_value FROM users u LEFT JOIN orders o ON o.user_id=u.id WHERE u.created_at>'2024-01-01' AND u.status='active' GROUP BY u.id,u.email HAVING COUNT(o.id)>0 ORDER BY lifetime_value DESC LIMIT 100;
-- Good: obvious structure
SELECT
u.id,
u.email,
COUNT(o.id) AS order_count,
SUM(o.total) AS lifetime_value
FROM users u
LEFT JOIN orders o ON o.user_id = u.id
WHERE
u.created_at > '2024-01-01'
AND u.status = 'active'
GROUP BY u.id, u.email
HAVING COUNT(o.id) > 0
ORDER BY lifetime_value DESC
LIMIT 100;
CTEs Over Subqueries
-- Hard to follow: nested subqueries
SELECT email, total_revenue
FROM (
SELECT user_id, SUM(amount) AS total_revenue
FROM payments
WHERE status = 'succeeded'
GROUP BY user_id
HAVING SUM(amount) > 1000
) high_value_payers
JOIN users ON users.id = high_value_payers.user_id
ORDER BY total_revenue DESC;
-- Easy to follow: CTEs tell a story
WITH high_value_payers AS (
SELECT
user_id,
SUM(amount) AS total_revenue
FROM payments
WHERE status = 'succeeded'
GROUP BY user_id
HAVING SUM(amount) > 1000
)
SELECT
u.email,
hvp.total_revenue
FROM high_value_payers hvp
JOIN users u ON u.id = hvp.user_id
ORDER BY hvp.total_revenue DESC;
CTEs have names. Names communicate intent.
Window Functions Instead of Self-Joins
-- Painful: self-join to get running total
SELECT
o1.id,
o1.created_at,
o1.total,
SUM(o2.total) AS running_total
FROM orders o1
JOIN orders o2 ON o2.created_at <= o1.created_at AND o2.user_id = o1.user_id
GROUP BY o1.id, o1.created_at, o1.total;
-- Clean: window function
SELECT
id,
created_at,
total,
SUM(total) OVER (
PARTITION BY user_id
ORDER BY created_at
ROWS UNBOUNDED PRECEDING
) AS running_total
FROM orders;
Common window functions:
-- Row number within group
ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY created_at)
-- Rank (with gaps for ties)
RANK() OVER (ORDER BY total_revenue DESC)
-- Lead/lag: access adjacent rows
LAG(amount, 1) OVER (PARTITION BY user_id ORDER BY created_at) AS prev_amount
LEAD(amount, 1) OVER (PARTITION BY user_id ORDER BY created_at) AS next_amount
-- Running percentile
PERCENT_RANK() OVER (ORDER BY score)
Avoid SELECT *
-- Never in production code:
SELECT * FROM users;
-- Be explicit:
SELECT id, email, name, created_at FROM users;
Reasons:
- Schema changes break your application silently
- Fetches columns you don't need (slower, more memory)
- Ambiguous in JOINs (which table's
id?)
NULLs Are Tricky
-- This returns no rows even for NULLs!
SELECT * FROM users WHERE deleted_at != '2024-01-01';
-- NULL comparisons need IS NULL / IS NOT NULL
SELECT * FROM users WHERE deleted_at IS NULL OR deleted_at != '2024-01-01';
-- COALESCE: use default when NULL
SELECT COALESCE(display_name, email) AS shown_name FROM users;
-- NULLIF: convert a value to NULL
SELECT NULLIF(status, '') AS status FROM users; -- empty string → NULL
UPSERT Pattern
-- PostgreSQL: INSERT ... ON CONFLICT
INSERT INTO user_preferences (user_id, key, value)
VALUES ('user-123', 'theme', 'dark')
ON CONFLICT (user_id, key)
DO UPDATE SET
value = EXCLUDED.value,
updated_at = NOW();
-- Only update if actually changed
INSERT INTO metrics (date, event, count)
VALUES (CURRENT_DATE, 'pageview', 1)
ON CONFLICT (date, event)
DO UPDATE SET count = metrics.count + EXCLUDED.count;
Date Operations
-- Last 30 days
WHERE created_at >= NOW() - INTERVAL '30 days'
-- Truncate to day/week/month
SELECT
DATE_TRUNC('week', created_at) AS week,
COUNT(*) AS signups
FROM users
GROUP BY DATE_TRUNC('week', created_at)
ORDER BY week;
-- Age in years
SELECT EXTRACT(YEAR FROM AGE(birth_date)) AS age FROM users;
-- Format for display
SELECT TO_CHAR(created_at, 'YYYY-MM-DD') AS formatted_date FROM orders;
The Index on Every Query Rule
For every query that goes to production:
- Run
EXPLAIN ANALYZE - Verify no Seq Scan on large tables
- Add index if needed
This takes two minutes. The alternative is debugging a slow query in production at 2 AM.
SQL patterns, migration utilities, and query optimization built in: Whoff Agents AI SaaS Starter Kit.
Top comments (0)