Philip McClarence

PostgreSQL Query Anti-Patterns and Common Mistakes

Most of the articles in this series are about making reasonable queries faster. This one is about queries that are wrong by construction — patterns behind a large share of production SQL performance incidents, not because they're subtle, but because they're near-universal and the fixes are well known to anyone who has seen them before. If you recognise these in your codebase, fixing them is almost always a straightforward win.

Each anti-pattern below maps to a specific plan signature or log symptom; cross-references to deeper treatment are inline.

1. N+1 queries — the ORM default

The single most common performance problem in applications talking to PostgreSQL. The pattern is one query to fetch a list of parent rows, plus one query per parent to fetch related data:

# One query — fetches 500 orders.
orders = db.query("SELECT * FROM orders WHERE status = 'pending'")

# Five hundred more queries — one per order.
for order in orders:
    items = db.query(
        "SELECT * FROM order_items WHERE order_id = %s",
        order.id,
    )

501 network round trips. 501 query plans. 501 parse-and-execute cycles. Even if each individual query is 2 ms, you've burned a full second of wall-clock time on a dashboard query that could have been one SQL statement.

Fix: a single JOIN.

SELECT o.order_id, o.status, o.created_at,
       oi.item_id, oi.quantity, oi.unit_price_cents
FROM sim_bp_orders o
JOIN sim_bp_order_items oi ON oi.order_id = o.order_id
WHERE o.status = 'pending';

One query. The application then groups the result by order_id to reconstruct the parent-children shape.
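
If you would rather not regroup rows in application code, the nesting can also be done in SQL. A sketch using json_agg against the same tables (the one-row-per-order output shape with an items array is an assumption, not something the original query produced):

SELECT o.order_id, o.status, o.created_at,
       json_agg(json_build_object(
           'item_id', oi.item_id,
           'quantity', oi.quantity,
           'unit_price_cents', oi.unit_price_cents)) AS items
FROM sim_bp_orders o
JOIN sim_bp_order_items oi ON oi.order_id = o.order_id
WHERE o.status = 'pending'
GROUP BY o.order_id, o.status, o.created_at;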

Detecting N+1 from the PostgreSQL side: pg_stat_statements normalises parameter values, so the per-order query shows up as a single entry with an enormous calls count relative to the parent query. Most modern ORMs can eager-load the related rows when you ask — joinedload() in SQLAlchemy; .includes(:items) in ActiveRecord; select_related() or prefetch_related() in Django. Usually a one-line change.
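
The detection query itself, assuming pg_stat_statements is installed (PostgreSQL 13+ column names):

-- Statements that run very often but do little per call: the classic N+1 shape.
SELECT calls, mean_exec_time, rows / calls AS rows_per_call, query
FROM pg_stat_statements
ORDER BY calls DESC
LIMIT 20;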

2. SELECT * on wide tables in hot paths

Every column the query returns has a cost: bytes read from heap, bytes serialised over the wire, bytes deserialised on the client. A 400-column table with SELECT * returns hundreds of bytes per row that the application typically discards.

The bigger cost is plan quality. An Index Only Scan can only be chosen when every referenced column is available from the index — SELECT * forces heap fetches unconditionally. A covering index with INCLUDE becomes useless because the planner still has to visit the heap for the columns not in the index.

Fix: name the columns you actually need. The resulting plan can use covering indexes, narrower sort keys, and tighter network payload.
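
A sketch of what that buys, using an assumed covering index on the orders table (the index name and column list are illustrative, not part of the original schema); with the narrow column list the planner can answer the query from the index alone:

CREATE INDEX idx_orders_status_cover
    ON sim_bp_orders (status) INCLUDE (order_id, created_at);

SELECT order_id, created_at
FROM sim_bp_orders
WHERE status = 'pending';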

Where it's OK to keep SELECT *: database-admin queries, dump scripts, and CTEs that genuinely need every column. In production application code, it's almost always wrong.

3. Implicit type casts that disable indexes

Covered in detail in the WHERE clause article. The symptom: an index exists on the filtered column, but the plan shows Seq Scan with the filter appearing in a Filter: line instead of Index Cond:.

-- text column compared against an integer value; the cast lands on the column.
SELECT * FROM t WHERE text_col::int = 123;

A bare text_col = 123 is rejected outright (operator does not exist: text = integer), so in practice one side gets cast to make the comparison work. When the cast lands on the column, as above, the index on text_col can no longer be used and the predicate drops into a Filter: line. The three fix patterns:

  • Use the correct literal type: WHERE text_col = '123'.
  • Cast the literal explicitly: WHERE text_col = 123::text.
  • If the column semantically should be an int, change the schema.

This anti-pattern also appears after schema migrations — a column type change from varchar(N) to citext or a domain type can silently shift which side of the comparison gets cast. Always re-run EXPLAIN on the key queries after any column-type change.
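
A minimal re-check, reusing the table and column from the example above; the predicate should show up under Index Cond:, not in a Filter: line:

EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM t WHERE text_col = '123';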

4. Functions on indexed columns

The other half of the non-sargable family. Wrapping an indexed column in a function disables the index:

SELECT * FROM sim_bp_users WHERE lower(email) = '...';
SELECT * FROM sim_bp_orders WHERE date_trunc('day', created_at) = '2024-01-15';

On our dataset, WHERE lower(email) LIKE 'user12%' produces a parallel sequential scan at 122 ms. The same query against the plain column with a pattern-ops index takes 25 ms.

Fixes:

  • Normalise on write — store emails lowercased, dates as dates not timestamps, and so on.
  • Expression index that computes the same function — CREATE INDEX ON sim_bp_users (lower(email)).
  • Rewrite the predicate to avoid the function — WHERE created_at >= '2024-01-15' AND created_at < '2024-01-16' instead of date_trunc('day', created_at) = '2024-01-15'.
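
The index-based fixes written out, plus the range rewrite (both index names are invented for illustration):

-- Expression index matching lower(email) predicates.
CREATE INDEX idx_users_email_lower ON sim_bp_users (lower(email));

-- Pattern-ops index for left-anchored LIKE on the raw column.
CREATE INDEX idx_users_email_pattern ON sim_bp_users (email text_pattern_ops);

-- Range form of the date predicate: sargable against a plain index on created_at.
SELECT * FROM sim_bp_orders
WHERE created_at >= '2024-01-15' AND created_at < '2024-01-16';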

5. Missing LIMIT on exploratory joins

A query written in development against a 100-row test table can become a 10-million-row query in production. Joins without a LIMIT are the most common source of surprise-large result sets — a new feature ships, the WHERE clause on the production dataset matches a million rows instead of a hundred, and the application silently returns a million rows per request.

-- No LIMIT — returns however many rows happen to match.
SELECT o.*, u.email
FROM sim_bp_orders o JOIN sim_bp_users u ON u.user_id = o.user_id
WHERE o.status = 'pending';

Fix: add a LIMIT. For user-facing queries, paginate (ideally with keyset — see query rewriting techniques). For background jobs, chunk the processing.
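
A keyset-paginated version of the join above; the cursor values stand in for the last row of the previous page, and the query assumes an index on (created_at, order_id):

SELECT o.order_id, o.status, o.created_at, u.email
FROM sim_bp_orders o
JOIN sim_bp_users u ON u.user_id = o.user_id
WHERE o.status = 'pending'
  AND (o.created_at, o.order_id) < ('2024-01-15 14:22:00'::timestamptz, 987654)
ORDER BY o.created_at DESC, o.order_id DESC
LIMIT 50;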

This is also a security posture question. A query without a LIMIT exposed through an API endpoint gives an attacker a trivial amplification attack — one HTTP request → one full-table read.

6. One-row-at-a-time INSERTs

Bulk-loading data with a loop of INSERT INTO t VALUES ($1, $2, $3) calls is common in migrations, ETL jobs, and CSV importers. Each statement is a full round trip plus a plan-and-execute cycle; for a million rows, that's a million of each.

Fixes, roughly in order of preference:

  • COPY FROM STDIN — PostgreSQL's bulk-load protocol. Orders of magnitude faster than INSERT because it avoids per-row parsing and planning. Most drivers expose it.
  • Multi-row VALUES — INSERT INTO t (...) VALUES ($1, ...), ($2, ...), ($3, ...) packs multiple rows into one statement. Ten to a hundred rows per statement is a sweet spot.
  • INSERT INTO t SELECT ... FROM source — if the source data is already in the database, do the load in SQL and skip the client round-trip entirely.

Wrap the bulk operation in a single transaction so WAL flushes happen at commit, not per row. Setting synchronous_commit = off for the loading session speeds things up further and is acceptable when the load can simply be re-run after a crash.
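
A minimal version of the multi-row VALUES approach with the transaction and commit settings described above (table, columns, and values are illustrative):

BEGIN;
-- Acceptable only if the load can simply be re-run after a crash.
SET LOCAL synchronous_commit = off;

INSERT INTO import_target (id, label, loaded_at) VALUES
    (1, 'alpha', now()),
    (2, 'beta',  now()),
    (3, 'gamma', now());
-- ...repeat in batches of tens to hundreds of rows per statement...

COMMIT;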

7. Keeping transactions open

Open transactions hold locks and pin the xmin horizon that vacuum uses to decide which dead rows it can remove. A transaction left open for an hour — usually by a stuck batch job or a long-running analytical query — prevents vacuum from cleaning up any row version created since it began, across the whole database, for the duration. Symptoms: growing table bloat, autovacuum runs that finish without reclaiming anything, XID-wraparound warnings on busy databases.

The specific antipattern is usually application-level:

# Holds one transaction open across slow external calls; if any call hangs, the transaction (with its snapshot and locks) hangs with it.
with db.transaction():
    rows = db.query("SELECT ...")  # reads a snapshot.
    for row in rows:
        call_external_api(row)  # could block for minutes.
        db.execute("UPDATE ...")

Fix: keep transactions short. Never call out to external services inside a transaction. Batch the reads outside the transaction; inside the transaction, do only the minimum set of SQL changes that need to be atomic.

pg_stat_activity shows xact_start for every running transaction; any row where now() - xact_start > interval '5 minutes' is worth investigating. The idle in transaction state is the most common failure mode.
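
The check itself, plus an optional server-side guard, both against stock PostgreSQL (the five-minute threshold is just the one suggested above):

-- Transactions open longer than five minutes, oldest first.
SELECT pid, usename, state, now() - xact_start AS xact_age, left(query, 80) AS query
FROM pg_stat_activity
WHERE xact_start IS NOT NULL
  AND now() - xact_start > interval '5 minutes'
ORDER BY xact_age DESC;

-- Optional: end sessions that sit idle inside a transaction for too long.
-- ALTER SYSTEM SET idle_in_transaction_session_timeout = '10min'; SELECT pg_reload_conf();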

8. Storing dates, numbers, or booleans as strings

Every time a type-mismatched predicate runs, one side has to be cast: either the literal is cast to text (the index stays usable, but you are now comparing strings, with string semantics) or the column is cast to the proper type (not sargable, disables the index). The schema decides which cost you pay: if the column is created_at text, every query that filters it by date or number loses either way.

-- created_at stored as text like '2024-01-15 14:22:00'.
SELECT * FROM logs
WHERE created_at::timestamp > now() - interval '1 hour';  -- non-sargable.

Fix: migrate to the right type. ALTER TABLE ... ALTER COLUMN ... TYPE timestamptz USING created_at::timestamptz; rewrites the table under an ACCESS EXCLUSIVE lock held for the whole rewrite, so pick a quiet moment; for very large tables do an online migration (new column, backfill, switch, drop old).
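
A sketch of the online variant, reusing the column names from the example above (in practice the backfill UPDATE runs in batches):

ALTER TABLE logs ADD COLUMN created_at_ts timestamptz;

-- Backfill; batch this on large tables.
UPDATE logs SET created_at_ts = created_at::timestamptz WHERE created_at_ts IS NULL;

-- After switching reads and writes to the new column:
ALTER TABLE logs DROP COLUMN created_at;
ALTER TABLE logs RENAME COLUMN created_at_ts TO created_at;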

Indexes on the right type are smaller, faster, and actually usable by predicates. Indexes on text representations of dates give you the worst of both worlds: locale-sensitive comparison, extra parse cost, and usually the wrong sort order for natural ranges.

9. Treating count(*) as a cheap operation

Applications often display "total records: 847,291" somewhere, computed with SELECT count(*) FROM big_table. On PostgreSQL, that's a full scan — the storage layer doesn't maintain row counts because of MVCC visibility. A 100-million-row count(*) takes seconds to minutes.

Fixes:

  • Accept an approximate count. pg_class.reltuples is the planner's cached estimate — good within a few percent on recently-analyzed tables, and free. SELECT reltuples::bigint FROM pg_class WHERE relname = 'big_table'.
  • Trigger-maintained counter table. If you truly need an exact count and it's shown on every page load, maintain it with an AFTER INSERT/DELETE trigger that updates a tiny summary table.
  • Avoid the need for a count. Pagination UIs that show "Page 127 of 3,482,195" rarely benefit from the total; "Page 127" with Next/Previous buttons is enough.
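
The trigger-maintained counter from the second bullet, as a sketch (table, function, and trigger names are invented; the single counter row serialises concurrent writers, which is the price of an exact count):

CREATE TABLE row_counts (table_name text PRIMARY KEY, n bigint NOT NULL);
INSERT INTO row_counts VALUES ('big_table', (SELECT count(*) FROM big_table));

CREATE FUNCTION bump_big_table_count() RETURNS trigger AS $$
BEGIN
    UPDATE row_counts
    SET n = n + CASE TG_OP WHEN 'INSERT' THEN 1 ELSE -1 END
    WHERE table_name = 'big_table';
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;

-- EXECUTE FUNCTION requires PostgreSQL 11+; older versions use EXECUTE PROCEDURE.
CREATE TRIGGER big_table_count
AFTER INSERT OR DELETE ON big_table
FOR EACH ROW EXECUTE FUNCTION bump_big_table_count();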

10. Ignoring pg_stat_statements

The extension that tracks query statistics — most-expensive queries, average time per execution, total time across all runs. On any non-toy database it's the single most useful diagnostic tool.

-- Requires pg_stat_statements in shared_preload_libraries (server restart) before it can collect anything.
CREATE EXTENSION pg_stat_statements;

-- Top 10 queries by total time:
SELECT query,
       calls,
       total_exec_time,
       mean_exec_time,
       rows
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;

Total time (not per-call time) is the right sort: a 2 ms query called a million times is a bigger target than a 5 s query called once a day.

Set pg_stat_statements.track = all (not the default top) to capture statements inside functions too. On cloud-managed instances, this is often already on by default.
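
One way to change it without a restart, since the setting reloads at runtime (unlike shared_preload_libraries, which does need a restart):

-- Works once the module is loaded via shared_preload_libraries.
ALTER SYSTEM SET pg_stat_statements.track = 'all';
SELECT pg_reload_conf();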

Catching anti-patterns automatically

Many of these patterns show up as specific plan shapes:

  • seq_scan_large → patterns 3, 4, 5 (non-sargable predicates, missing indexes).
  • excessive_filter_rows → pattern 5 (missing LIMIT, wide filter).
  • nested_loop_large → pattern 1 (N+1 at plan level, or missing join index).
  • no_index_usage → patterns 3, 4, 8 (index never usable because of casts or types).
  • hash_batches_spill, sort_on_disk, temp_blocks_written → work_mem / aggregate patterns from earlier articles.

The workflow is the same every time: capture a plan, identify the category, apply the fix, verify. Most of the value is in recognising the category quickly — the fixes are standard patterns once the category is clear. The pillar guide ties the full series together.


#postgres #performance #database #sql

Originally published at https://mydba.dev/blog/postgres-query-anti-patterns
