PostgreSQL is incredibly powerful, but even experienced developers write queries that silently destroy performance. Here are the 10 most common mistakes and how to fix each one.
1. Using SELECT * Instead of Specific Columns
Bad SQL:
SELECT * FROM orders WHERE customer_id = 42;
Good SQL:
SELECT order_id, total_amount, created_at
FROM orders
WHERE customer_id = 42;
Why: SELECT * reads every column, inflating I/O and network transfer, and it defeats index-only scans. When an index already contains every column a query needs, PostgreSQL can answer from the index alone without touching the heap - but only if the column list is that narrow.
Typical speedup: 2-10x
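You can watch this effect without a PostgreSQL instance: SQLite's planner reports the same idea as a "covering index" in EXPLAIN QUERY PLAN. The table and index names below are invented for the sketch.

```python
import sqlite3

# Sketch: a narrow column list lets the planner answer from the index
# alone (SQLite: "COVERING INDEX"; PostgreSQL: index-only scan).
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE orders (
    order_id INTEGER, customer_id INTEGER, total_amount REAL, notes TEXT)""")
# The index contains every column the narrow query touches
conn.execute("CREATE INDEX idx_cust ON orders (customer_id, order_id, total_amount)")

def plan(sql):
    # Column 3 of EXPLAIN QUERY PLAN rows is the human-readable detail
    return " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

star_plan = plan("SELECT * FROM orders WHERE customer_id = 42")
narrow_plan = plan("SELECT order_id, total_amount FROM orders WHERE customer_id = 42")
```

`narrow_plan` reports a covering-index search; `star_plan` cannot, because `notes` forces a trip to the table.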
2. Missing Indexes on WHERE and JOIN Columns
Bad SQL:
-- No index on orders.customer_id
SELECT order_id, total_amount
FROM orders
WHERE customer_id = 42;
Good SQL:
CREATE INDEX idx_orders_customer_id ON orders (customer_id);
SELECT order_id, total_amount
FROM orders
WHERE customer_id = 42;
Why: Without an index, PostgreSQL scans every row in the table. On a 10M-row table, that means scanning all rows to find maybe 50 matches. An index turns this into a B-tree lookup touching a handful of pages.
Typical speedup: 100-10,000x on large tables. The single biggest performance win.
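The same before/after is easy to demonstrate locally with SQLite, whose EXPLAIN QUERY PLAN uses the same scan-vs-seek distinction as PostgreSQL's planner:

```python
import sqlite3

# Toy illustration: without an index the planner scans every row
# ("SCAN"); with one it seeks directly ("SEARCH ... USING INDEX").
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, total_amount REAL)")

def plan(sql):
    return " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT order_id, total_amount FROM orders WHERE customer_id = 42"
before = plan(query)  # full table scan
conn.execute("CREATE INDEX idx_orders_customer_id ON orders (customer_id)")
after = plan(query)   # index seek
```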
3. OFFSET Pagination vs Keyset Pagination
Bad SQL:
SELECT id, title, created_at
FROM articles
ORDER BY created_at DESC
LIMIT 20 OFFSET 9980;
Good SQL:
SELECT id, title, created_at
FROM articles
WHERE created_at < '2024-01-15 10:30:00'  -- created_at of the last row on the previous page
ORDER BY created_at DESC
LIMIT 20;
Why: OFFSET forces PostgreSQL to fetch and discard 9,980 rows before returning your 20 - O(n) in the offset. Keyset pagination uses a WHERE clause on the sort key to seek straight to the right position through the index - an O(log n) lookup no matter how deep you page. If the sort key can tie, paginate on a compound key such as (created_at, id).
Typical speedup: 10-1000x on deep pages
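A quick sanity check that the two techniques return identical pages (sketched in SQLite with integer timestamps standing in for created_at; assumes the sort key is unique):

```python
import sqlite3

# 100 articles with distinct, ascending "timestamps"
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (id INTEGER, created_at INTEGER)")
conn.executemany("INSERT INTO articles VALUES (?, ?)",
                 [(i, 1000 + i) for i in range(100)])

# Page 4 (rows 30..39 of the descending order) via OFFSET
offset_page = conn.execute(
    "SELECT id FROM articles ORDER BY created_at DESC LIMIT 10 OFFSET 30").fetchall()

# Keyset: remember the last created_at from page 3, then seek past it
last_seen = conn.execute(
    "SELECT created_at FROM articles ORDER BY created_at DESC LIMIT 1 OFFSET 29"
).fetchone()[0]
keyset_page = conn.execute(
    "SELECT id FROM articles WHERE created_at < ? ORDER BY created_at DESC LIMIT 10",
    (last_seen,)).fetchall()
```

Same 10 rows either way - but keyset never touches the 30 rows before them.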
4. N+1 Queries vs JOINs
Bad SQL (application code pattern):
SELECT id, name FROM customers LIMIT 100;
-- Then 100 separate queries:
SELECT SUM(total_amount) FROM orders WHERE customer_id = 1;
SELECT SUM(total_amount) FROM orders WHERE customer_id = 2;
-- ... 98 more
Good SQL:
SELECT c.id, c.name, COALESCE(SUM(o.total_amount), 0) AS total_spent
FROM customers c
LEFT JOIN orders o ON o.customer_id = c.id
GROUP BY c.id, c.name
LIMIT 100;
Why: The N+1 pattern sends 101 separate queries with individual round-trips. A single JOIN lets PostgreSQL optimize the entire operation with hash joins, merge joins, or nested loops.
Typical speedup: 5-50x
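Here is the same contrast as runnable application code (sketched with SQLite, so the per-query round-trip cost vanishes in-process, but the query-count difference is real):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE orders (customer_id INTEGER, total_amount REAL)")
conn.executemany("INSERT INTO customers VALUES (?, ?)",
                 [(i, f"c{i}") for i in range(5)])
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [(1, 10.0), (1, 5.0), (3, 7.5)])

# N+1: one query per customer (at production scale, 101 round-trips)
n_plus_1 = {}
for (cid,) in conn.execute("SELECT id FROM customers"):
    n_plus_1[cid] = conn.execute(
        "SELECT COALESCE(SUM(total_amount), 0) FROM orders WHERE customer_id = ?",
        (cid,)).fetchone()[0]

# One JOIN: a single statement the planner can optimize as a whole
joined = dict(conn.execute("""
    SELECT c.id, COALESCE(SUM(o.total_amount), 0)
    FROM customers c
    LEFT JOIN orders o ON o.customer_id = c.id
    GROUP BY c.id"""))
```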
5. Not Using EXPLAIN ANALYZE
Bad approach:
-- Guessing which indexes to add
CREATE INDEX idx_whatever ON orders (status);
CREATE INDEX idx_whatever2 ON orders (status, customer_id);
Good approach:
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT o.order_id, c.name
FROM orders o
JOIN customers c ON c.id = o.customer_id
WHERE o.status = 'pending'
AND o.created_at > NOW() - INTERVAL '7 days';
Why: EXPLAIN ANALYZE executes the query and shows exactly what PostgreSQL did: which indexes it used, row estimate accuracy, and where time was spent. Look for Seq Scans on large tables, estimate vs actual row divergence, and expensive sorts. Stop guessing, start measuring.
Typical speedup: Tells you exactly what to fix.
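Reading plans gets easier with a little tooling. Here is a toy triage helper that flags the two red flags named above; the plan text is an illustrative sample, not output from a real run:

```python
import re

SAMPLE_PLAN = """\
Nested Loop  (cost=0.43..812.34 rows=120 width=40) (actual time=0.05..4.2 rows=3500 loops=1)
  ->  Seq Scan on orders o  (cost=0.00..712.00 rows=120 width=16) (actual time=0.03..3.1 rows=3500 loops=1)
  ->  Index Scan using customers_pkey on customers c  (cost=0.43..0.83 rows=1 width=32) (actual time=0.01..0.01 rows=1 loops=3500)"""

def red_flags(plan_text, ratio=10):
    """Flag sequential scans, and nodes whose actual row count exceeds
    the planner's estimate by more than `ratio` (bad stats -> bad plans)."""
    flags = []
    for line in plan_text.splitlines():
        if "Seq Scan" in line:
            flags.append(("seq_scan", line.strip()))
        m = re.search(r"rows=(\d+) width=\d+\) \(actual .*?rows=(\d+)", line)
        if m and int(m.group(2)) > ratio * max(int(m.group(1)), 1):
            flags.append(("estimate_off", line.strip()))
    return flags

flags = red_flags(SAMPLE_PLAN)
```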
6. LIKE with Leading Wildcards vs pg_trgm
Bad SQL:
SELECT id, name FROM products
WHERE name LIKE '%wireless%';
Good SQL:
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE INDEX idx_products_name_trgm
ON products USING gin (name gin_trgm_ops);
SELECT id, name FROM products
WHERE name LIKE '%wireless%';
Why: B-tree indexes only help with prefix matching (LIKE 'wireless%'). A leading wildcard forces a sequential scan. The pg_trgm extension builds a GIN index over trigrams, enabling index usage for substring, ILIKE, and regex matching.
Typical speedup: 50-500x on large text columns
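The trigram trick is simple enough to sketch in a few lines (simplified: real pg_trgm also splits on word boundaries and pads with spaces):

```python
# A trigram is every 3-character window of a string.
def trigrams(s):
    s = s.lower()
    return {s[i:i + 3] for i in range(len(s) - 2)}

# The GIN index maps each trigram to the rows containing it. A query
# like LIKE '%wireless%' intersects those posting lists: only rows
# containing ALL of the pattern's trigrams are candidates (candidates
# are then rechecked against the actual LIKE).
def could_match(pattern, text):
    return trigrams(pattern) <= trigrams(text)

hit = could_match("wireless", "Wireless Mouse M185")
miss = could_match("router", "Wireless Mouse M185")
```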
7. COUNT(*) for Existence Checks vs EXISTS
Bad SQL:
SELECT CASE WHEN COUNT(*) > 0 THEN true ELSE false END
FROM orders
WHERE customer_id = 42 AND status = 'pending';
Good SQL:
SELECT EXISTS (
SELECT 1 FROM orders
WHERE customer_id = 42 AND status = 'pending'
);
Why: COUNT(*) scans and counts every matching row. If there are 10,000 matches, it counts all 10,000 just to check if any exist. EXISTS returns true on the first match and stops - a short-circuit operation.
Typical speedup: 10-10,000x depending on match count
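Both forms agree on the answer, which is why the COUNT version survives code review; only the work done differs. A quick check (sketched in SQLite):

```python
import sqlite3

# 10,000 matching rows: COUNT must touch all of them,
# EXISTS can stop at the first.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer_id INTEGER, status TEXT)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [(42, "pending")] * 10_000)

via_count = conn.execute(
    "SELECT COUNT(*) > 0 FROM orders WHERE customer_id = 42 AND status = 'pending'"
).fetchone()[0]
via_exists = conn.execute(
    "SELECT EXISTS (SELECT 1 FROM orders WHERE customer_id = 42 AND status = 'pending')"
).fetchone()[0]
```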
8. Single-Row INSERTs vs Batching / COPY
Bad SQL:
INSERT INTO events (type, data, created_at) VALUES ('click', '{}', NOW());
INSERT INTO events (type, data, created_at) VALUES ('view', '{}', NOW());
-- 9,998 more individual statements
Good SQL:
-- Multi-value INSERT
INSERT INTO events (type, data, created_at) VALUES
('click', '{}', NOW()),
('view', '{}', NOW()),
('scroll', '{}', NOW());
-- Or COPY for bulk loads
COPY events (type, data, created_at)
FROM STDIN WITH (FORMAT csv);
Why: Each INSERT has overhead: network round-trip, SQL parsing, query planning, WAL logging, and transaction commit. Batching and COPY amortize this dramatically.
Typical speedup: 10-100x for batching, 100-1000x for COPY
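From application code, `executemany` plus a single transaction is the client-side analogue of the multi-value INSERT (sketched with SQLite; with PostgreSQL drivers, look for batch or COPY helpers such as psycopg's):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (type TEXT, data TEXT, created_at TEXT)")

events = [("click", "{}"), ("view", "{}"), ("scroll", "{}")]
with conn:  # one transaction, one commit for the whole batch
    conn.executemany(
        "INSERT INTO events (type, data, created_at) VALUES (?, ?, datetime('now'))",
        events)

inserted = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
```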
9. No Connection Pooling (Use PgBouncer)
Bad approach: Opening a new database connection per request. Each connection forks a new PostgreSQL backend process and allocates ~10MB of memory.
Good approach: Use PgBouncer with transaction-mode pooling:
[databases]
mydb = host=localhost port=5432 dbname=mydb
[pgbouncer]
listen_port = 6432
pool_mode = transaction
max_client_conn = 1000
default_pool_size = 20
Why: Connection creation is expensive. Under load, hundreds of connections create memory pressure and context-switching overhead. PgBouncer maintains a small pool of actual connections serving thousands of clients, and protects against connection storms during traffic spikes.
Typical speedup: 2-20x under concurrent load
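Conceptually, a pooler is just a fixed set of expensive connections handed out and returned. This toy sketch shows the shape of what PgBouncer provides; in production use PgBouncer or your driver's built-in pool, not this:

```python
import queue
import sqlite3
from contextlib import contextmanager

class Pool:
    def __init__(self, size, connect):
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(connect())  # pay the connection cost once, up front

    @contextmanager
    def connection(self):
        conn = self._idle.get()        # blocks if every connection is busy
        try:
            yield conn
        finally:
            self._idle.put(conn)       # return to the pool instead of closing

pool = Pool(size=2, connect=lambda: sqlite3.connect(":memory:", check_same_thread=False))
with pool.connection() as conn:
    one = conn.execute("SELECT 1").fetchone()[0]
```

Thousands of callers can share those two connections, each holding one only as long as a transaction lasts - the same idea as `pool_mode = transaction`.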
10. Not Using Partial Indexes
Bad SQL:
CREATE INDEX idx_orders_status ON orders (status);
SELECT * FROM orders WHERE status = 'failed';
Good SQL:
CREATE INDEX idx_orders_failed ON orders (created_at)
WHERE status = 'failed';
SELECT * FROM orders
WHERE status = 'failed'
ORDER BY created_at DESC LIMIT 10;
Why: If 95% of orders are completed and only 1% are failed, a full index on status is massive but mostly useless. A partial index with WHERE status = 'failed' indexes only the 1% you query. It is dramatically smaller, fits in cache, and is faster to scan and maintain.
Use partial indexes for: status flags, soft deletes, queue tables, and any skewed distribution column.
Typical speedup: 3-10x with significantly less disk space and write overhead
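SQLite supports partial indexes too, so the pattern can be tried locally; the planner can use the index for any query whose WHERE clause implies status = 'failed':

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, status TEXT, created_at INTEGER)")
# 95 completed orders, 5 failed ones - the skew that makes this worthwhile
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(i, "completed", i) for i in range(95)]
                 + [(i, "failed", i) for i in range(95, 100)])
conn.execute("CREATE INDEX idx_failed ON orders (created_at) WHERE status = 'failed'")

failed = conn.execute(
    "SELECT id FROM orders WHERE status = 'failed' ORDER BY created_at DESC LIMIT 10"
).fetchall()
```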
Summary
| # | Mistake | Fix | Speedup |
|---|---|---|---|
| 1 | SELECT * | Specific columns | 2-10x |
| 2 | Missing indexes | Targeted indexes | 100-10,000x |
| 3 | OFFSET pagination | Keyset pagination | 10-1,000x |
| 4 | N+1 queries | JOINs | 5-50x |
| 5 | Blind optimization | EXPLAIN ANALYZE | Enables all fixes |
| 6 | LIKE leading wildcard | pg_trgm GIN index | 50-500x |
| 7 | COUNT(*) existence | EXISTS | 10-10,000x |
| 8 | Single-row INSERTs | Batch / COPY | 10-1,000x |
| 9 | No pooling | PgBouncer | 2-20x |
| 10 | Full indexes | Partial indexes | 3-10x |
Every optimization follows one principle: give PostgreSQL less work to do. Start with EXPLAIN ANALYZE on your slowest queries today.
If this article helped you, consider buying me a coffee on Ko-fi! Follow me for more production backend patterns.