You pushed a feature last week. Everything looked fine in staging. Then production blew up.
The culprit? A single SQL query that took 3 milliseconds on your laptop with 500 rows — and 14 seconds on production with 2 million rows. Sound familiar?
SQL is one of those skills where the gap between "works" and "works well" is enormous. Most developers learn just enough SQL to get data in and out of a database, then wonder why their app crawls under real-world load.
This guide covers 7 practical optimizations that make a measurable difference. No theory fluff — just patterns you can apply today.
1. Stop Doing SELECT * in Production Code
This is the most common, and most easily forgiven, sin in SQL. It looks harmless, but it has real costs:
- Transfers more data over the network than needed
- Prevents the database from using covering indexes
- Makes your code fragile when table schema changes
Before:
SELECT * FROM orders WHERE user_id = 42;
After:
SELECT id, status, total_amount, created_at
FROM orders
WHERE user_id = 42;
If your query only needs 4 columns and your table has 30, you just dropped almost 87% of the columns; depending on column widths, that can be a comparable share of the bytes on the wire. That adds up fast at scale.
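The arithmetic is easy to sanity-check. Here is a rough sketch using Python's stdlib sqlite3 as a stand-in for a real database, with a made-up 30-column table where each text column holds 10 characters:

```python
# Toy illustration: SELECT * vs selecting only the columns you need.
import sqlite3

conn = sqlite3.connect(":memory:")
col_defs = ", ".join(f"c{i} TEXT" for i in range(30))
conn.execute(f"CREATE TABLE orders (id INTEGER PRIMARY KEY, {col_defs})")

col_names = ", ".join(f"c{i}" for i in range(30))
values = ", ".join(["'xxxxxxxxxx'"] * 30)
conn.execute(f"INSERT INTO orders (id, {col_names}) VALUES (1, {values})")

star = conn.execute("SELECT * FROM orders").fetchone()
narrow = conn.execute("SELECT id, c0, c1, c2 FROM orders").fetchone()

# Crude proxy for wire size: total characters per fetched row
star_chars = sum(len(str(v)) for v in star)
narrow_chars = sum(len(str(v)) for v in narrow)
print(star_chars, "vs", narrow_chars)  # → 301 vs 31
```

The exact savings depend on column widths; wide text or JSON columns make the gap much larger.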
2. Index the Columns You Filter and Join On
An index is a lookup table the database maintains alongside your data. Without it, every query that filters by a column forces a full table scan — reading every single row to find matches.
With 1,000 rows, that is fine. With 10 million rows, it is a disaster.
-- This query is slow without an index on email
SELECT id, name FROM users WHERE email = 'hello@example.com';
-- Add the index
CREATE INDEX idx_users_email ON users(email);
A few things to keep in mind:
- Index columns used in WHERE, JOIN ... ON, and ORDER BY
- Do not index every column — indexes slow down INSERT and UPDATE operations
- Composite indexes (multiple columns) are powerful, but order matters: an index on (user_id, created_at) helps queries filtering by user_id or by user_id + created_at, but not by created_at alone
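The composite-index ordering rule can be observed directly. This is a hedged sketch using Python's stdlib sqlite3 (the article's examples target PostgreSQL, but the planner behavior is analogous); the table, index, and data are illustrative:

```python
# SQLite's EXPLAIN QUERY PLAN shows whether a query uses an index
# (SEARCH) or reads every row (SCAN).
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, created_at TEXT)"
)
conn.executemany(
    "INSERT INTO orders (user_id, created_at) VALUES (?, ?)",
    [(i % 50, f"2025-01-{i % 28 + 1:02d}") for i in range(1000)],
)
conn.execute("CREATE INDEX idx_orders_user_date ON orders(user_id, created_at)")

def plan(sql: str) -> str:
    # The last column of EXPLAIN QUERY PLAN output describes the access path
    return conn.execute("EXPLAIN QUERY PLAN " + sql).fetchone()[-1]

# Leading column: the composite index is used (a SEARCH)
p1 = plan("SELECT id FROM orders WHERE user_id = 42")

# Both columns: still a SEARCH on the same index
p2 = plan("SELECT id FROM orders WHERE user_id = 42 AND created_at > '2025-01-10'")

# Trailing column alone: the index cannot narrow the search, so a SCAN
p3 = plan("SELECT id FROM orders WHERE created_at > '2025-01-10'")
print(p1, p2, p3, sep="\n")
```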
3. Use EXPLAIN (Actually Use It)
Most developers write a query, run it, and if they get results, they move on. The problem is you cannot see if the database is doing something terrible under the hood.
EXPLAIN shows you the query execution plan — how the database intends to fetch your data.
EXPLAIN ANALYZE
SELECT u.name, COUNT(o.id) as order_count
FROM users u
JOIN orders o ON u.id = o.user_id
WHERE o.created_at > '2025-01-01'
GROUP BY u.id, u.name;
Look for these red flags in the output:
- Seq Scan on large tables — means no index is being used
- Hash Join on huge datasets — can be expensive
- rows= estimates wildly off from actual — the planner is confused, usually due to stale statistics
In PostgreSQL, run ANALYZE your_table; periodically to keep statistics fresh so the query planner makes good decisions.
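If you do not have a PostgreSQL instance handy, the same before-and-after can be sketched with Python's stdlib sqlite3, whose EXPLAIN QUERY PLAN plays a similar role (SQLite prints SCAN/SEARCH where Postgres prints Seq Scan/Index Scan); the table and data are made up:

```python
# Watching a plan change from a full scan to an index search.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(1000)],
)

query = "SELECT id, name FROM users WHERE email = 'hello@example.com'"

# Red flag: without an index, this is a full scan of the table
before_plan = conn.execute("EXPLAIN QUERY PLAN " + query).fetchone()[-1]

conn.execute("CREATE INDEX idx_users_email ON users(email)")

# After indexing, the same query becomes an index search
after_plan = conn.execute("EXPLAIN QUERY PLAN " + query).fetchone()[-1]
print(before_plan)
print(after_plan)
```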
4. Avoid N+1 Queries — Batch Your Reads
The N+1 problem is when your application runs one query to get a list, then one more query per item in that list. It kills performance quietly.
The bad pattern (in pseudocode / ORM):
const users = await db.query('SELECT id FROM users LIMIT 100');

for (const user of users) {
  // This runs 100 separate queries!
  const orders = await db.query('SELECT * FROM orders WHERE user_id = ?', [user.id]);
}
The fix — use a JOIN or IN clause:
SELECT u.id, u.name, o.id as order_id, o.total_amount
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
WHERE u.id IN (1, 2, 3, ...100 ids)
ORDER BY u.id;
Or batch with a subquery:
SELECT o.*
FROM orders o
WHERE o.user_id IN (
    -- ORDER BY makes "which 100 users" deterministic
    SELECT id FROM users ORDER BY id LIMIT 100
);
If you use an ORM like Prisma, Sequelize, or SQLAlchemy, learn how to use eager loading (include, joinedload) to avoid N+1 at the framework level.
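Outside an ORM, the batched read is one parameterized IN query. Here is a minimal sketch with Python's stdlib sqlite3 and made-up data, showing that the two shapes produce identical results while the batched one makes a single round trip:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, total_amount REAL);
    INSERT INTO users (id, name) VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO orders (user_id, total_amount) VALUES (1, 10.0), (1, 5.0), (2, 7.5);
""")

user_ids = [row[0] for row in conn.execute("SELECT id FROM users ORDER BY id")]

# The N+1 shape: one round trip per user
n_plus_one = {
    uid: conn.execute(
        "SELECT id, total_amount FROM orders WHERE user_id = ? ORDER BY id",
        (uid,),
    ).fetchall()
    for uid in user_ids
}

# The batched shape: one round trip total, via a parameterized IN clause
placeholders = ", ".join("?" for _ in user_ids)
batched = {uid: [] for uid in user_ids}
for user_id, order_id, total in conn.execute(
    f"SELECT user_id, id, total_amount FROM orders "
    f"WHERE user_id IN ({placeholders}) ORDER BY user_id, id",
    user_ids,
):
    batched[user_id].append((order_id, total))
```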
5. Use Pagination Correctly (OFFSET is a Trap)
Everyone learns LIMIT and OFFSET for pagination. It works great — until page 500.
Here is what OFFSET 10000 LIMIT 20 actually does: the database reads 10,020 rows, throws away the first 10,000, and returns 20. That is incredibly wasteful.
The better approach: keyset pagination (cursor-based)
Instead of:
SELECT id, title, created_at
FROM posts
ORDER BY created_at DESC
LIMIT 20 OFFSET 200;
Do this:
-- First page (id is a tiebreaker so the sort is stable when timestamps collide)
SELECT id, title, created_at
FROM posts
ORDER BY created_at DESC, id DESC
LIMIT 20;
-- Next page: use the last seen created_at and id
SELECT id, title, created_at
FROM posts
WHERE (created_at, id) < ('2025-12-01 10:00:00', 9482)
ORDER BY created_at DESC, id DESC
LIMIT 20;
With an index on (created_at, id), the seek to the cursor is an index lookup: O(log n) per page instead of O(n). It stays fast no matter how deep into the dataset you go. It is how most social feeds and infinite scrolls work under the hood.
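The two-query flow can be sketched end to end with Python's stdlib sqlite3, which also supports row-value comparison. The posts data is made up; note the id tiebreaker in the ORDER BY, which keeps the sort stable when created_at values collide:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT, created_at TEXT)")
conn.executemany(
    "INSERT INTO posts (title, created_at) VALUES (?, ?)",
    [(f"Post {i}", f"2025-01-{i % 28 + 1:02d}") for i in range(100)],
)

PAGE = 20

# First page: no cursor yet
page1 = conn.execute(
    "SELECT id, title, created_at FROM posts "
    "ORDER BY created_at DESC, id DESC LIMIT ?",
    (PAGE,),
).fetchall()

# Next page: seek past the last row we saw, using its (created_at, id)
last_id, _, last_created = page1[-1]
page2 = conn.execute(
    "SELECT id, title, created_at FROM posts "
    "WHERE (created_at, id) < (?, ?) "
    "ORDER BY created_at DESC, id DESC LIMIT ?",
    (last_created, last_id, PAGE),
).fetchall()
```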
6. Aggregate in the Database, Not in Your Code
A classic mistake: fetching thousands of rows and computing totals, averages, or counts in application code.
Slow approach:
const orders = await db.query('SELECT total_amount FROM orders WHERE user_id = 42');
const total = orders.reduce((sum, o) => sum + o.total_amount, 0);
Fast approach — let the database do the work:
SELECT SUM(total_amount) as total, COUNT(*) as order_count
FROM orders
WHERE user_id = 42;
Databases are extremely good at aggregation. They have specialized execution paths, can use indexes for this, and avoid transferring huge datasets across the wire. Always prefer pushing computation into the query when you can.
Same applies to filtering: do not fetch everything and filter in code. Use WHERE.
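A small sketch with Python's stdlib sqlite3 and made-up orders confirms the two approaches agree, while the aggregate query sends back a single row instead of one row per order:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, total_amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (user_id, total_amount) VALUES (?, ?)",
    [(42, float(i)) for i in range(1, 101)] + [(7, 99.0)],
)

# Slow shape: every matching row crosses the wire, then the app sums
rows = conn.execute(
    "SELECT total_amount FROM orders WHERE user_id = 42"
).fetchall()
app_total = sum(r[0] for r in rows)

# Fast shape: one row comes back; the database does the arithmetic
db_total, order_count = conn.execute(
    "SELECT SUM(total_amount), COUNT(*) FROM orders WHERE user_id = 42"
).fetchone()
print(app_total, db_total, order_count)  # → 5050.0 5050.0 100
```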
7. Use Connection Pooling (Your App is Probably Not)
Opening a database connection is expensive — it involves network handshakes, authentication, and memory allocation. If your app opens a new connection for every request, you are paying that cost constantly.
Connection pooling keeps a set of connections open and reuses them.
For Node.js with PostgreSQL:
import { Pool } from 'pg';

const pool = new Pool({
  connectionString: process.env.DATABASE_URL,
  max: 20, // max connections in the pool
  idleTimeoutMillis: 30000,
  connectionTimeoutMillis: 2000,
});

// Reuses a connection from the pool
const result = await pool.query('SELECT id, name FROM users WHERE id = $1', [userId]);
For Python with SQLAlchemy:
from sqlalchemy import create_engine

engine = create_engine(
    DATABASE_URL,
    pool_size=10,
    max_overflow=20,
    pool_pre_ping=True,  # test connections before use
)
If you deploy on serverless (Lambda, Vercel, Cloudflare Workers), you have an additional challenge: each invocation is stateless. Use tools like PgBouncer (external pooler) or Supabase's built-in pooler to handle this properly.
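To make the mechanism concrete, here is a toy pool built on Python's stdlib queue — the TinyPool class is purely illustrative, not a real library, and it uses sqlite3 in place of a real network database. In production, use the pooling built into pg or SQLAlchemy, or an external pooler:

```python
# TinyPool (illustrative only): pay the connection cost once per slot,
# then check connections in and out instead of reconnecting per request.
import sqlite3
from contextlib import contextmanager
from queue import Queue

class TinyPool:
    def __init__(self, dsn: str, size: int = 5):
        self._conns: Queue = Queue(maxsize=size)
        for _ in range(size):
            # The expensive step (connecting) happens once per slot, up front
            self._conns.put(sqlite3.connect(dsn, check_same_thread=False))

    @contextmanager
    def connection(self):
        conn = self._conns.get()   # blocks if every connection is busy
        try:
            yield conn
        finally:
            self._conns.put(conn)  # hand the connection back for reuse

pool = TinyPool(":memory:", size=2)
with pool.connection() as conn:
    print(conn.execute("SELECT 1").fetchone()[0])  # → 1
```

Real poolers add what this sketch omits: health checks (like pool_pre_ping), timeouts, and reconnection on failure.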
Putting It Together
None of these tricks require exotic tools or infrastructure changes. They are habits.
Here is a quick checklist for your next PR:
- Replace SELECT * with specific column names
- Verify indexes exist on filtered and joined columns
- Run EXPLAIN ANALYZE on any new non-trivial query
- Check for N+1 patterns in any loop that touches the database
- Use cursor-based pagination for large datasets
- Push aggregation and filtering into SQL, not application code
- Confirm your app uses a connection pool
SQL optimization is not about being clever. It is about understanding what the database actually does when it runs your query, and making sure you are not asking it to do unnecessary work.
The best time to think about this is before you hit production. The second-best time is right now.
What is the worst SQL performance bug you have encountered? Drop it in the comments — I read every one.