Readable SQL isn’t a myth. It’s a WITH.
You know that feeling when you open a SQL query you wrote six months ago, see seven levels of nested subqueries, and think "Which psychopath wrote this?" Then you check git blame and... yeah. It was you.
I've been there. We've all been there.
This is Part 1 of a 3-part series on Common Table Expressions (CTEs) - a powerful SQL feature that's been hiding in plain sight for years. Your database supports it. Your competitors probably use it. But your ORM? Nah, it's too busy pretending SQL peaked in 1999.
TL;DR
- CTEs = named subqueries that make you look smart
- They make complex queries cleaner, faster, and maintainable
- Every modern database supports them (PostgreSQL since 2009, MySQL since 8.0, SQL Server since 2005)
- Most ORMs don't (Doctrine, Hibernate, Eloquent... looking at you)
- Primary benefit: readability and maintenance - performance gains are a bonus
- Use
EXPLAINto verify optimization
This series:
- Part 1 (this): Basic CTEs - what they are and why you need them
- Part 2: Recursive CTEs for hierarchical data
- Part 3: How to use CTEs when your ORM says no
Quick glossary:
- CTE (Common Table Expression): A named temporary result set, like a subquery with a name
- ORM (Object-Relational Mapping): Tool that lets you query databases using objects instead of SQL (Doctrine, Hibernate, Eloquent, etc.)
- Materialization: When the database calculates a CTE once and stores the result temporarily
The Problem with Subqueries
Let's start with a simple example. You want to find all users who placed orders after January 1st, 2025:
SELECT * FROM users
WHERE id IN (
SELECT DISTINCT user_id
FROM orders
WHERE order_date > '2025-01-01'
);
Works fine. Ship it.
But what happens when you need that same list of "recent order users" in three different places in your query? Copy-paste the subquery three times? Now you have three places to update when the business logic changes. Three places to introduce bugs. Three places where a typo can ruin your weekend.
And if the subquery gets more complex - joins, aggregations, filters - you're looking at a maintenance nightmare wrapped in a readability disaster.
As I mentioned in my article on database design, queries deserve thought before execution. I PRODUCE code, I don't vomit it.
Enter CTEs: SQL for the Efficiently Lazy
Just like in LLM workflows, structure matters. In AgentKit I argued that efficient laziness is about designing reliability before execution — CTEs bring that same mindset to SQL.
Common Table Expressions (CTEs) let you name your subqueries. That's it. That's the magic.
Here's the same query with a CTE:
WITH recent_orders AS (
SELECT DISTINCT user_id
FROM orders
WHERE order_date > '2025-01-01'
)
SELECT users.*
FROM users
JOIN recent_orders ON users.id = recent_orders.user_id;
"But that's longer!" you say. Sure, for this tiny example. But look what happens when your query grows:
WITH recent_orders AS (
SELECT DISTINCT user_id FROM orders
WHERE order_date > '2025-01-01'
),
high_value_users AS (
SELECT user_id FROM orders
GROUP BY user_id
HAVING SUM(total_amount) > 1000
)
SELECT
u.*,
COUNT(o.id) as order_count,
SUM(o.total_amount) as total_spent
FROM users u
JOIN recent_orders ro ON u.id = ro.user_id
JOIN high_value_users hvu ON u.id = hvu.user_id
LEFT JOIN orders o ON u.id = o.user_id
GROUP BY u.id;
Each CTE has a clear name and purpose. You can reference them multiple times. Change the definition once, it updates everywhere. Debug one piece at a time.
When Should You Use a CTE?
Quick checklist:
✅ You're copying the same subquery 2+ times
✅ Your query has 2+ levels of nested subqueries
✅ You need to name intermediate results for clarity
✅ You're building complex calculations step by step
Why CTEs are better:
✅ Readable - Each part has a meaningful name
✅ Reusable - Reference the same CTE multiple times in one query
✅ Maintainable - Change the logic in one place
✅ Optimizable - The database engine can inline or materialize as needed
Important nuance: The main benefit is readability and maintainability. Performance improvements depend on your database's optimizer - sometimes the CTE is inlined (treated like a regular subquery), sometimes materialized (calculated once and cached). Both can be efficient; it depends on the query.
This is efficient laziness. Write less code, maintain less code, debug less code.
Real-World Case: 30,000 Daily Sales
Let's get practical. Imagine you're analyzing sales for a company processing 30,000 transactions per day. You need quarterly sales totals, then use those totals for multiple analyses: year-over-year comparison, growth percentage, regional breakdowns.
Without CTEs, you'd write the quarterly aggregation multiple times as subqueries. Even if the optimizer handles it well, your code is messy and hard to maintain.
With a CTE, you express your intent clearly:
WITH quarterly_sales AS (
SELECT
DATE_TRUNC('quarter', sale_date) AS quarter,
SUM(sale_amount) AS total_sales
FROM sales
WHERE country = 'France' AND brand = 'Ikea'
GROUP BY quarter
)
SELECT
quarter,
total_sales,
LAG(total_sales) OVER (ORDER BY quarter) AS prev_quarter_sales,
ROUND(
100.0 * (total_sales - COALESCE(LAG(total_sales) OVER (ORDER BY quarter), 0))
/ COALESCE(LAG(total_sales) OVER (ORDER BY quarter), 1),
2
) AS growth_pct
FROM quarterly_sales
ORDER BY quarter;
Quick explanation of new concepts:
-
DATE_TRUNC('quarter', sale_date): Groups dates by quarter (Q1 2024, Q2 2024, etc.) -
LAG(): Window function that accesses the previous row's value -
COALESCE(): Returns first non-null value (handles missing data gracefully)
What just happened?
- CTE calculates quarterly totals once - aggregating potentially millions of rows into a few dozen quarterly summaries
- Main query uses that small result set - fast window functions on already-aggregated data
- Database optimizer decides - inline or materialize based on cost analysis
Real-world gains:
- Code clarity: Immediately obvious what each part does
- Maintainability: Change quarterly logic once, everything updates
- Performance: Often 2-10x faster when the optimizer materializes (but measure with
EXPLAIN!) - User happiness: Reports load faster, dashboards don't timeout
For a financial dashboard or reporting system, this isn't theoretical optimization - it's the difference between "I'll grab coffee while this loads" and "it just works."
Understanding Performance: Trust the Optimizer
"But I heard CTEs are slow!"
Let me stop you right there. This myth needs to die.
Modern databases are smart. When you write a CTE, the query optimizer analyzes the whole query and chooses the best strategy:
- Inline it - Treat it like a regular subquery, optimize the whole thing together
- Materialize it - Calculate once, store temporarily, reuse the result
Which approach? The optimizer decides based on:
- Size of the CTE result
- How many times you reference it
- Available indexes
- Overall query complexity
Your job: Write clear, maintainable queries. The optimizer's job: make them fast.
Verify with EXPLAIN
Don't guess at performance - measure it:
EXPLAIN ANALYZE
WITH sales_summary AS (
SELECT region, SUM(amount) as total
FROM sales
WHERE date >= '2024-01-01'
GROUP BY region
)
SELECT * FROM sales_summary WHERE total > 100000;
PostgreSQL will show you:
- Whether the CTE was inlined or materialized
- Scan types (sequential vs index)
- Actual vs estimated rows
- Execution time
MySQL, SQL Server, and Oracle have similar tools (EXPLAIN, SHOW PLAN, etc.).
💡 Pro tip: In Part 3, we'll cover advanced techniques like forcing materialization when you have evidence the optimizer is wrong. But 99% of the time? Trust the optimizer.
Database Support: The Timeline
CTEs are part of the SQL:1999 standard, but adoption took time. Here's when major databases got on board:
| Database | CTE Support Since | Notes |
|---|---|---|
| SQL Server | 2005 | The pioneer - full recursive support from day one |
| PostgreSQL | 2009 (v8.4) | Excellent implementation, developer favorite |
| Oracle | ~2010 (11g R2) | Added WITH alongside existing CONNECT BY |
| MySQL | 2018 (v8.0) | Better late than never - before this, nothing |
| MariaDB | 2018 (v10.2.2) | Followed MySQL's lead the same year |
| IBM Db2 | 2004-ish (varies) | Early adopter, solid support |
Translation: If you're using any remotely modern database, you have CTEs. Even MySQL finally caught up in 2018.
So CTEs are standard. Databases support them. Developers love them. Everything's great, right?
The Plot Twist: Your ORM Doesn't Care
Here's where it gets frustrating. Check your ORM's documentation for CTE support:
❌ Doctrine (PHP) - No .with() method in QueryBuilder
❌ Hibernate (Java) - Nothing in HQL or Criteria API
❌ Eloquent (Laravel) - Need third-party package (staudenmeir/laravel-cte)
❌ Active Record (Rails) - Limited support via gems, nothing native
Why? Most ORMs were architected when major databases didn't universally support CTEs. They prioritized portability (supporting older database versions) and simplicity (abstracting SQL away). Fair trade-off for 2005. Questionable for 2025.
The exceptions: SQLAlchemy (Python), jOOQ (Java), and Django 4.2+ support CTEs natively. We'll cover them in Part 3.
The result? If you want CTEs with most ORMs, you're writing raw SQL. You lose some ORM benefits (automatic joins, relationship mapping), but gain the power of modern SQL.
This isn't a bug. ORMs prefer to hide complexity rather than embrace it. But when you need advanced features, that abstraction becomes limiting.
What This Means for You
You've just learned that:
- CTEs make complex queries cleaner and easier to maintain
- Performance gains are common (but verify with
EXPLAIN) - Every modern database supports them
- Your ORM probably doesn't
So what do you do?
Short term: Start recognizing places where CTEs would help. Look for repeated subqueries, complex aggregations, or queries that make you squint.
Medium term: Learn to write CTEs in raw SQL. We'll cover practical strategies for using them with ORMs in Part 3.
Long term: Push your ORM maintainers to add CTE support. Or switch to one that already has it.
For now, just remember: CTEs exist. They're powerful. Your database already supports them.
Coming Next
In Part 2, we're diving into recursive CTEs - the feature that makes hierarchical data actually manageable.
Ever needed to traverse an org chart? Find all subcategories under a category? Calculate a bill of materials? Build a comment thread with infinite nesting?
You could write multiple database queries. Or complex application logic with loops. Or you could write one elegant recursive CTE and let the database do what databases do best.
"Recursive CTEs: Because Writing Nested Loops for Trees is Psychopathic (and a Little Masochistic Too)" - Coming soon.
Bottom Line
Common Table Expressions aren't rocket science. They're named subqueries that make your SQL readable and maintainable.
They've been part of standard SQL since 1999. Every major database supports them. They're not experimental. They're not dangerous. They're just... better for complex queries.
The primary benefit is clarity. Performance improvements are a bonus that depends on your specific query and database optimizer. But even if performance stayed the same, the readability gain alone makes CTEs worth using.
Your ORM might not support them yet. That's annoying but solvable. Don't let tool limitations stop you from writing better queries.
Remember: Your users don't care if you used an ORM or raw SQL. They care that your application is fast and reliable. Use whatever tool gets you there most effectively.
And sometimes, that tool is a WITH clause.
Next in this series:
→ Part 2: Recursive CTEs for hierarchical data
→ Part 3: Using CTEs when your ORM says no
Found this helpful? Leave a comment with your gnarliest nested subquery horror story. We've all got one.
📬 Want essays like this in your inbox?
I just launched a newsletter about thinking clearly in tech — no spam, no fluff.
Subscribe here: https://buttondown.com/efficientlaziness
Efficient Laziness — Think once, well, and move forward.
Top comments (0)