You pushed a feature last week. Everything looked fine in staging. Then production blew up.
The culprit? A single SQL query that took 3 milliseconds on your laptop with 500 rows — and 14 seconds on production with 2 million rows. Sound familiar?
SQL is one of those skills where the gap between "works" and "works well" is enormous. Most developers learn just enough SQL to get data in and out of a database, then wonder why their app crawls under real-world load.
This guide covers 7 practical optimizations that make a measurable difference. No theory fluff — just patterns you can apply today.
1. Stop Doing SELECT * in Production Code
This is the most common, and most easily forgiven, sin in SQL. It looks harmless, but it has real costs:
- Transfers more data over the network than needed
- Prevents the database from using covering indexes
- Makes your code fragile when table schema changes
Before:
SELECT * FROM orders WHERE user_id = 42;
After:
SELECT id, status, total_amount, created_at
FROM orders
WHERE user_id = 42;
If your query only needs 4 columns and your table has 30, you just dropped almost 87% of the columns; depending on column widths, that can be a comparable share of the bytes on the wire. That adds up fast at scale.
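The arithmetic is easy to sanity-check. Here is a rough sketch using Python's stdlib sqlite3 as a stand-in for a real database, with a made-up 30-column table where each text column holds 10 characters:

```python
# Toy illustration: SELECT * vs selecting only the columns you need.
import sqlite3

conn = sqlite3.connect(":memory:")
col_defs = ", ".join(f"c{i} TEXT" for i in range(30))
conn.execute(f"CREATE TABLE orders (id INTEGER PRIMARY KEY, {col_defs})")

col_names = ", ".join(f"c{i}" for i in range(30))
values = ", ".join(["'xxxxxxxxxx'"] * 30)
conn.execute(f"INSERT INTO orders (id, {col_names}) VALUES (1, {values})")

star = conn.execute("SELECT * FROM orders").fetchone()
narrow = conn.execute("SELECT id, c0, c1, c2 FROM orders").fetchone()

# Crude proxy for wire size: total characters per fetched row
star_chars = sum(len(str(v)) for v in star)
narrow_chars = sum(len(str(v)) for v in narrow)
print(star_chars, "vs", narrow_chars)  # → 301 vs 31
```

The exact savings depend on column widths; wide text or JSON columns make the gap much larger.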
2. Index the Columns You Filter and Join On
An index is a lookup table the database maintains alongside your data. Without it, every query that filters by a column forces a full table scan — reading every single row to find matches.
With 1,000 rows, that is fine. With 10 million rows, it is a disaster.
-- This query is slow without an index on email
SELECT id, name FROM users WHERE email = 'hello@example.com';
-- Add the index
CREATE INDEX idx_users_email ON users(email);
A few things to keep in mind:
- Index columns used in WHERE, JOIN ... ON, and ORDER BY
- Do not index every column — indexes slow down INSERT and UPDATE operations
- Composite indexes (multiple columns) are powerful, but order matters: an index on (user_id, created_at) helps queries filtering by user_id or by user_id + created_at, but not by created_at alone
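The composite-index ordering rule can be observed directly. This is a hedged sketch using Python's stdlib sqlite3 (the article's examples target PostgreSQL, but the planner behavior is analogous); the table, index, and data are illustrative:

```python
# SQLite's EXPLAIN QUERY PLAN shows whether a query uses an index
# (SEARCH) or reads every row (SCAN).
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, created_at TEXT)"
)
conn.executemany(
    "INSERT INTO orders (user_id, created_at) VALUES (?, ?)",
    [(i % 50, f"2025-01-{i % 28 + 1:02d}") for i in range(1000)],
)
conn.execute("CREATE INDEX idx_orders_user_date ON orders(user_id, created_at)")

def plan(sql: str) -> str:
    # The last column of EXPLAIN QUERY PLAN output describes the access path
    return conn.execute("EXPLAIN QUERY PLAN " + sql).fetchone()[-1]

# Leading column: the composite index is used (a SEARCH)
p1 = plan("SELECT id FROM orders WHERE user_id = 42")

# Both columns: still a SEARCH on the same index
p2 = plan("SELECT id FROM orders WHERE user_id = 42 AND created_at > '2025-01-10'")

# Trailing column alone: the index cannot narrow the search, so a SCAN
p3 = plan("SELECT id FROM orders WHERE created_at > '2025-01-10'")
print(p1, p2, p3, sep="\n")
```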
3. Use EXPLAIN (Actually Use It)
Most developers write a query, run it, and if they get results, they move on. The problem is you cannot see if the database is doing something terrible under the hood.
EXPLAIN shows you the query execution plan — how the database intends to fetch your data.
EXPLAIN ANALYZE
SELECT u.name, COUNT(o.id) as order_count
FROM users u
JOIN orders o ON u.id = o.user_id
WHERE o.created_at > '2025-01-01'
GROUP BY u.id, u.name;
Look for these red flags in the output:
- Seq Scan on large tables — means no index is being used
- Hash Join on huge datasets — can be expensive
- rows= estimates wildly off from actual — the planner is confused, usually due to stale statistics
In PostgreSQL, run ANALYZE your_table; periodically to keep statistics fresh so the query planner makes good decisions.
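If you do not have a PostgreSQL instance handy, the same before-and-after can be sketched with Python's stdlib sqlite3, whose EXPLAIN QUERY PLAN plays a similar role (SQLite prints SCAN/SEARCH where Postgres prints Seq Scan/Index Scan); the table and data are made up:

```python
# Watching a plan change from a full scan to an index search.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(1000)],
)

query = "SELECT id, name FROM users WHERE email = 'hello@example.com'"

# Red flag: without an index, this is a full scan of the table
before_plan = conn.execute("EXPLAIN QUERY PLAN " + query).fetchone()[-1]

conn.execute("CREATE INDEX idx_users_email ON users(email)")

# After indexing, the same query becomes an index search
after_plan = conn.execute("EXPLAIN QUERY PLAN " + query).fetchone()[-1]
print(before_plan)
print(after_plan)
```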
4. Avoid N+1 Queries — Batch Your Reads
The N+1 problem is when your application runs one query to get a list, then one more query per item in that list. It kills performance quietly.
The bad pattern (in pseudocode / ORM):
const users = await db.query('SELECT id FROM users LIMIT 100');

for (const user of users) {
  // This runs 100 separate queries!
  const orders = await db.query('SELECT * FROM orders WHERE user_id = ?', [user.id]);
}
The fix — use a JOIN or IN clause:
SELECT u.id, u.name, o.id as order_id, o.total_amount
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
WHERE u.id IN (1, 2, 3, ...100 ids)
ORDER BY u.id;
Or batch with a subquery:
SELECT o.*
FROM orders o
WHERE o.user_id IN (
    -- ORDER BY makes "which 100 users" deterministic
    SELECT id FROM users ORDER BY id LIMIT 100
);
If you use an ORM like Prisma, Sequelize, or SQLAlchemy, learn how to use eager loading (include, joinedload) to avoid N+1 at the framework level.
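Outside an ORM, the batched read is one parameterized IN query. Here is a minimal sketch with Python's stdlib sqlite3 and made-up data, showing that the two shapes produce identical results while the batched one makes a single round trip:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, total_amount REAL);
    INSERT INTO users (id, name) VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO orders (user_id, total_amount) VALUES (1, 10.0), (1, 5.0), (2, 7.5);
""")

user_ids = [row[0] for row in conn.execute("SELECT id FROM users ORDER BY id")]

# The N+1 shape: one round trip per user
n_plus_one = {
    uid: conn.execute(
        "SELECT id, total_amount FROM orders WHERE user_id = ? ORDER BY id",
        (uid,),
    ).fetchall()
    for uid in user_ids
}

# The batched shape: one round trip total, via a parameterized IN clause
placeholders = ", ".join("?" for _ in user_ids)
batched = {uid: [] for uid in user_ids}
for user_id, order_id, total in conn.execute(
    f"SELECT user_id, id, total_amount FROM orders "
    f"WHERE user_id IN ({placeholders}) ORDER BY user_id, id",
    user_ids,
):
    batched[user_id].append((order_id, total))
```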
5. Use Pagination Correctly (OFFSET is a Trap)
Everyone learns LIMIT and OFFSET for pagination. It works great — until page 500.
Here is what OFFSET 10000 LIMIT 20 actually does: the database reads 10,020 rows, throws away the first 10,000, and returns 20. That is incredibly wasteful.
The better approach: keyset pagination (cursor-based)
Instead of:
SELECT id, title, created_at
FROM posts
ORDER BY created_at DESC
LIMIT 20 OFFSET 200;
Do this:
-- First page (id is a tiebreaker so the sort is stable when timestamps collide)
SELECT id, title, created_at
FROM posts
ORDER BY created_at DESC, id DESC
LIMIT 20;
-- Next page: use the last seen created_at and id
SELECT id, title, created_at
FROM posts
WHERE (created_at, id) < ('2025-12-01 10:00:00', 9482)
ORDER BY created_at DESC, id DESC
LIMIT 20;
With an index on (created_at, id), the seek to the cursor is an index lookup: O(log n) per page instead of O(n). It stays fast no matter how deep into the dataset you go. It is how most social feeds and infinite scrolls work under the hood.
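The two-query flow can be sketched end to end with Python's stdlib sqlite3, which also supports row-value comparison. The posts data is made up; note the id tiebreaker in the ORDER BY, which keeps the sort stable when created_at values collide:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT, created_at TEXT)")
conn.executemany(
    "INSERT INTO posts (title, created_at) VALUES (?, ?)",
    [(f"Post {i}", f"2025-01-{i % 28 + 1:02d}") for i in range(100)],
)

PAGE = 20

# First page: no cursor yet
page1 = conn.execute(
    "SELECT id, title, created_at FROM posts "
    "ORDER BY created_at DESC, id DESC LIMIT ?",
    (PAGE,),
).fetchall()

# Next page: seek past the last row we saw, using its (created_at, id)
last_id, _, last_created = page1[-1]
page2 = conn.execute(
    "SELECT id, title, created_at FROM posts "
    "WHERE (created_at, id) < (?, ?) "
    "ORDER BY created_at DESC, id DESC LIMIT ?",
    (last_created, last_id, PAGE),
).fetchall()
```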
6. Aggregate in the Database, Not in Your Code
A classic mistake: fetching thousands of rows and computing totals, averages, or counts in application code.
Slow approach:
const orders = await db.query('SELECT total_amount FROM orders WHERE user_id = 42');
const total = orders.reduce((sum, o) => sum + o.total_amount, 0);
Fast approach — let the database do the work:
SELECT SUM(total_amount) as total, COUNT(*) as order_count
FROM orders
WHERE user_id = 42;
Databases are extremely good at aggregation. They have specialized execution paths, can use indexes for this, and avoid transferring huge datasets across the wire. Always prefer pushing computation into the query when you can.
Same applies to filtering: do not fetch everything and filter in code. Use WHERE.
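A small sketch with Python's stdlib sqlite3 and made-up orders confirms the two approaches agree, while the aggregate query sends back a single row instead of one row per order:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, total_amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (user_id, total_amount) VALUES (?, ?)",
    [(42, float(i)) for i in range(1, 101)] + [(7, 99.0)],
)

# Slow shape: every matching row crosses the wire, then the app sums
rows = conn.execute(
    "SELECT total_amount FROM orders WHERE user_id = 42"
).fetchall()
app_total = sum(r[0] for r in rows)

# Fast shape: one row comes back; the database does the arithmetic
db_total, order_count = conn.execute(
    "SELECT SUM(total_amount), COUNT(*) FROM orders WHERE user_id = 42"
).fetchone()
print(app_total, db_total, order_count)  # → 5050.0 5050.0 100
```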
7. Use Connection Pooling (Your App is Probably Not)
Opening a database connection is expensive — it involves network handshakes, authentication, and memory allocation. If your app opens a new connection for every request, you are paying that cost constantly.
Connection pooling keeps a set of connections open and reuses them.
For Node.js with PostgreSQL:
import { Pool } from 'pg';

const pool = new Pool({
  connectionString: process.env.DATABASE_URL,
  max: 20, // max connections in the pool
  idleTimeoutMillis: 30000,
  connectionTimeoutMillis: 2000,
});

// Reuses a connection from the pool
const result = await pool.query('SELECT id, name FROM users WHERE id = $1', [userId]);
For Python with SQLAlchemy:
from sqlalchemy import create_engine

engine = create_engine(
    DATABASE_URL,
    pool_size=10,
    max_overflow=20,
    pool_pre_ping=True,  # test connections before use
)
If you deploy on serverless (Lambda, Vercel, Cloudflare Workers), you have an additional challenge: each invocation is stateless. Use tools like PgBouncer (external pooler) or Supabase's built-in pooler to handle this properly.
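To make the mechanism concrete, here is a toy pool built on Python's stdlib queue — the TinyPool class is purely illustrative, not a real library, and it uses sqlite3 in place of a real network database. In production, use the pooling built into pg or SQLAlchemy, or an external pooler:

```python
# TinyPool (illustrative only): pay the connection cost once per slot,
# then check connections in and out instead of reconnecting per request.
import sqlite3
from contextlib import contextmanager
from queue import Queue

class TinyPool:
    def __init__(self, dsn: str, size: int = 5):
        self._conns: Queue = Queue(maxsize=size)
        for _ in range(size):
            # The expensive step (connecting) happens once per slot, up front
            self._conns.put(sqlite3.connect(dsn, check_same_thread=False))

    @contextmanager
    def connection(self):
        conn = self._conns.get()   # blocks if every connection is busy
        try:
            yield conn
        finally:
            self._conns.put(conn)  # hand the connection back for reuse

pool = TinyPool(":memory:", size=2)
with pool.connection() as conn:
    print(conn.execute("SELECT 1").fetchone()[0])  # → 1
```

Real poolers add what this sketch omits: health checks (like pool_pre_ping), timeouts, and reconnection on failure.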
Putting It Together
None of these tricks require exotic tools or infrastructure changes. They are habits.
Here is a quick checklist for your next PR:
- Replace SELECT * with specific column names
- Verify indexes exist on filtered and joined columns
- Run EXPLAIN ANALYZE on any new non-trivial query
- Check for N+1 patterns in any loop that touches the database
- Use cursor-based pagination for large datasets
- Push aggregation and filtering into SQL, not application code
- Confirm your app uses a connection pool
SQL optimization is not about being clever. It is about understanding what the database actually does when it runs your query, and making sure you are not asking it to do unnecessary work.
The best time to think about this is before you hit production. The second-best time is right now.
What is the worst SQL performance bug you have encountered? Drop it in the comments — I read every one.