The most common source of missing indexes in production is not carelessness -- it is the development/production gap. A developer adds a new query, tests it against a development database with 1,000 rows where the sequential scan completes in 2ms, and ships it to production where the same table has 50 million rows. The sequential scan now takes 3 seconds. No errors, no crashes, just a slow endpoint that users tolerate until they stop tolerating it.
Finding that one missing index is easy once someone complains. Finding ALL the missing indexes across your entire database -- before users notice -- is the real challenge.
Identifying Tables That Need Indexes
Start by finding tables with disproportionate sequential scan activity:
SELECT
schemaname, relname AS table_name,
seq_scan, idx_scan,
CASE WHEN (seq_scan + idx_scan) > 0
THEN round(100.0 * seq_scan / (seq_scan + idx_scan), 1)
ELSE 0
END AS seq_scan_pct,
seq_tup_read,
pg_size_pretty(pg_relation_size(relid)) AS table_size
FROM pg_stat_user_tables
WHERE seq_scan > 100
AND pg_relation_size(relid) > 10 * 1024 * 1024 -- tables > 10MB
ORDER BY seq_tup_read DESC
LIMIT 15;
Key interpretation points:
- Table size matters. A table with 95% sequential scans and 8KB is fine -- PostgreSQL correctly chooses seq scan when the whole table fits in a few pages. A 500MB table with the same ratio almost certainly needs an index.
-
Use
seq_tup_read, not justseq_scan. A table might have few scan events, but each one reads millions of rows. Sort by tuples read to find the highest I/O impact. - Rule of thumb: any table over 100MB with more than 10% sequential scan access warrants investigation.
Finding WHICH Columns to Index
Knowing a table is being sequentially scanned is only half the problem. You need to identify the filter columns from the actual queries. Check pg_stat_statements:
SELECT query, calls, mean_exec_time, rows
FROM pg_stat_statements
WHERE query ILIKE '%orders%'
ORDER BY mean_exec_time * calls DESC
LIMIT 10;
Then confirm with EXPLAIN:
EXPLAIN (COSTS, BUFFERS)
SELECT * FROM orders
WHERE customer_id = 12345 AND status = 'pending'
ORDER BY created_at DESC LIMIT 10;
The output reveals which columns appear in Filter: conditions and Sort: operations -- these are your index candidates.
From Recommendation to Verification
Step 1: Create the index based on query pattern analysis
-- Column order matters: highest selectivity first,
-- equality filters before range, sort columns last
CREATE INDEX CONCURRENTLY idx_orders_customer_status_created
ON orders (customer_id, status, created_at DESC);
Why this column order? customer_id leads because it has the highest selectivity (each customer has few orders relative to the total). status is second for the equality filter. created_at DESC is last to support the ORDER BY without an additional sort step.
Step 2: Verify the planner uses it
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders
WHERE customer_id = 12345 AND status = 'pending'
ORDER BY created_at DESC LIMIT 10;
Look for Index Scan using idx_orders_customer_status_created. If you still see a sequential scan, update statistics:
ANALYZE orders;
Step 3: Monitor over 24 hours
SELECT indexrelname, idx_scan, idx_tup_read
FROM pg_stat_user_indexes
WHERE relname = 'orders'
AND indexrelname = 'idx_orders_customer_status_created';
If idx_scan is climbing and the table's seq_scan count has slowed, the index is working. If idx_scan stays at zero, check for type mismatches between query parameters and indexed columns.
Preventing Missing Indexes
The best time to catch missing indexes is immediately after deployment:
-
Post-release check: After every release, compare
pg_stat_user_tablesfor tables whereseq_scanjumped relative to pre-deploy baseline. - Continuous monitoring: Track sequential scan ratios over time, not just during performance incidents.
-
Development guidelines: Require
EXPLAINoutput review for any new query on tables over 10 million rows. Catch the problem before it reaches production.
The development/production gap will always exist -- dev databases are small by necessity. The mitigation is catching missing indexes quickly after deployment, not hoping developers predict production query plans.
Full article with monitoring screenshots: Finding the Missing Indexes Your Queries Are Begging For
Top comments (0)