Philip McClarence

Finding the Missing Indexes Your Queries Are Begging For

The most common source of missing indexes in production is not carelessness -- it is the development/production gap. A developer adds a new query, tests it against a development database with 1,000 rows where the sequential scan completes in 2ms, and ships it to production where the same table has 50 million rows. The sequential scan now takes 3 seconds. No errors, no crashes, just a slow endpoint that users tolerate until they stop tolerating it.

Finding that one missing index is easy once someone complains. Finding ALL the missing indexes across your entire database -- before users notice -- is the real challenge.

Identifying Tables That Need Indexes

Start by finding tables with disproportionate sequential scan activity:

SELECT
    schemaname, relname AS table_name,
    seq_scan, idx_scan,
    CASE WHEN (seq_scan + idx_scan) > 0
        THEN round(100.0 * seq_scan / (seq_scan + idx_scan), 1)
        ELSE 0
    END AS seq_scan_pct,
    seq_tup_read,
    pg_size_pretty(pg_relation_size(relid)) AS table_size
FROM pg_stat_user_tables
WHERE seq_scan > 100
  AND pg_relation_size(relid) > 10 * 1024 * 1024  -- tables > 10MB
ORDER BY seq_tup_read DESC
LIMIT 15;

Key interpretation points:

  • Table size matters. An 8KB table with 95% sequential scans is fine -- PostgreSQL correctly chooses a seq scan when the whole table fits in a few pages. A 500MB table with the same ratio almost certainly needs an index.
  • Use seq_tup_read, not just seq_scan. A table might have few scan events, but each one reads millions of rows. Sort by tuples read to find the highest I/O impact.
  • Rule of thumb: any table over 100MB with more than 10% sequential scan access warrants investigation.
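That rule of thumb can be turned into a quick filter query. This is a sketch using the same illustrative thresholds (100MB, 10%); tune both for your workload:

```sql
SELECT
    schemaname, relname AS table_name,
    seq_scan, idx_scan,
    pg_size_pretty(pg_relation_size(relid)) AS table_size
FROM pg_stat_user_tables
WHERE pg_relation_size(relid) > 100 * 1024 * 1024          -- tables > 100MB
  AND (seq_scan + idx_scan) > 0
  AND 100.0 * seq_scan / (seq_scan + idx_scan) > 10        -- > 10% seq scans
ORDER BY seq_tup_read DESC;
```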

Finding WHICH Columns to Index

Knowing a table is being sequentially scanned is only half the problem. You need to identify the filter columns from the actual queries. Check pg_stat_statements:

SELECT query, calls, mean_exec_time, total_exec_time, rows
FROM pg_stat_statements
WHERE query ILIKE '%orders%'
ORDER BY total_exec_time DESC
LIMIT 10;

Then confirm with EXPLAIN:

EXPLAIN
SELECT * FROM orders
WHERE customer_id = 12345 AND status = 'pending'
ORDER BY created_at DESC LIMIT 10;

The output reveals which columns appear in Filter: conditions and which drive the Sort Key: -- these are your index candidates.
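For the query above on an unindexed table, the plan might look roughly like this (node shapes are real PostgreSQL output; the costs and row counts are purely illustrative):

```
Limit  (cost=152345.10..152345.13 rows=10 width=88)
  ->  Sort  (cost=152345.10..152370.35 rows=10100 width=88)
        Sort Key: created_at DESC
        ->  Seq Scan on orders  (cost=0.00..151672.00 rows=10100 width=88)
              Filter: ((customer_id = 12345) AND (status = 'pending'::text))
```

The Filter: line names the equality columns and the Sort Key: line names the ordering column -- together they suggest the composite index built in the next section.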

From Recommendation to Verification

Step 1: Create the index based on query pattern analysis

-- Column order matters: highest selectivity first,
-- equality filters before range, sort columns last
CREATE INDEX CONCURRENTLY idx_orders_customer_status_created
ON orders (customer_id, status, created_at DESC);

Why this column order? customer_id leads because it has the highest selectivity (each customer has few orders relative to the total). status is second for the equality filter. created_at DESC is last to support the ORDER BY without an additional sort step.
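You can sanity-check the selectivity assumption against the planner's own statistics. This sketch reads pg_stats for the three candidate columns; a larger n_distinct generally means an equality filter on that column is more selective (negative values are a fraction of the row count, e.g. -1 means all-distinct):

```sql
SELECT attname, n_distinct, null_frac
FROM pg_stats
WHERE tablename = 'orders'
  AND attname IN ('customer_id', 'status', 'created_at');
```

If status turned out to be more selective than customer_id in your data, you would swap the first two columns.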

Step 2: Verify the planner uses it

EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders
WHERE customer_id = 12345 AND status = 'pending'
ORDER BY created_at DESC LIMIT 10;

Look for Index Scan using idx_orders_customer_status_created. If you still see a sequential scan, update statistics:

ANALYZE orders;

Step 3: Monitor over 24 hours

SELECT indexrelname, idx_scan, idx_tup_read
FROM pg_stat_user_indexes
WHERE relname = 'orders'
  AND indexrelname = 'idx_orders_customer_status_created';

If idx_scan is climbing and the table's seq_scan count has leveled off, the index is working. If idx_scan stays at zero, check for type mismatches between query parameters and indexed columns.
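A common mismatch is comparing a bigint column against a numeric parameter (some drivers bind all numbers as numeric). The planner then casts the column to numeric, which disqualifies the bigint index. A sketch, assuming customer_id is bigint:

```sql
-- bigint column compared to a numeric value: the column is cast, index skipped
EXPLAIN SELECT * FROM orders WHERE customer_id = 12345::numeric;

-- matching types: the index qualifies
EXPLAIN SELECT * FROM orders WHERE customer_id = 12345;
```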

Preventing Missing Indexes

The best time to catch missing indexes is immediately after deployment:

  • Post-release check: After every release, compare pg_stat_user_tables for tables where seq_scan jumped relative to pre-deploy baseline.
  • Continuous monitoring: Track sequential scan ratios over time, not just during performance incidents.
  • Development guidelines: Require EXPLAIN output review for any new query on tables over 10 million rows. Catch the problem before it reaches production.
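The post-release check can be a two-step snapshot comparison. A minimal sketch, assuming a scratch table named seq_scan_baseline and an arbitrary threshold of 1,000 new scans:

```sql
-- Before the deploy: snapshot the cumulative seq_scan counters
DROP TABLE IF EXISTS seq_scan_baseline;
CREATE TABLE seq_scan_baseline AS
SELECT schemaname, relname, seq_scan
FROM pg_stat_user_tables;

-- After the deploy: find tables whose seq_scan count jumped
SELECT c.schemaname, c.relname,
       c.seq_scan - b.seq_scan AS new_seq_scans
FROM pg_stat_user_tables c
JOIN seq_scan_baseline b USING (schemaname, relname)
WHERE c.seq_scan - b.seq_scan > 1000
ORDER BY new_seq_scans DESC;
```

Counters in pg_stat_user_tables are cumulative since the last statistics reset, so differencing against a snapshot isolates just the post-deploy activity.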

The development/production gap will always exist -- dev databases are small by necessity. The mitigation is catching missing indexes quickly after deployment, not hoping developers predict production query plans.

Full article with monitoring screenshots: Finding the Missing Indexes Your Queries Are Begging For
