I have seen the same incident pattern play out at three different companies: someone drops an index from the PostgreSQL primary because pg_stat_user_indexes shows zero scans. Within minutes, the read replica workload collapses. Dashboards time out. Analytics queries that depended on that index are now doing full table scans on million-row tables.
The root cause is always the same -- pg_stat_user_indexes only reports usage statistics for the node you are connected to. In any primary/replica setup, that is only half the picture.
Why Single-Node Index Audits Are Dangerous
In most production setups, primaries and replicas serve different workloads:
- Primary: transactional writes, inserts, updates, point lookups by primary key
- Replicas: analytical reads, aggregations, filtered scans, joins across large tables
The indexes each workload needs are fundamentally different. An index that is never used for writes on the primary may be critical for reads on the replica. Checking only the primary creates a false sense of safety.
It gets worse with multiple replicas. A reporting replica might use idx_orders_created_at heavily while a search replica never touches it but relies on idx_orders_fulltext. You need stats from every node to make safe drop decisions.
Finding Unused Indexes (Single Node)
On a single node, the query is straightforward:
SELECT
    s.schemaname,
    s.relname AS table_name,
    s.indexrelname AS index_name,
    s.idx_scan AS scans_since_reset,
    pg_size_pretty(pg_relation_size(s.indexrelid)) AS index_size,
    (SELECT stats_reset FROM pg_stat_database
     WHERE datname = current_database()) AS stats_reset_time
FROM pg_stat_user_indexes s
JOIN pg_index i ON i.indexrelid = s.indexrelid
WHERE s.idx_scan = 0
  AND NOT i.indisunique  -- keep indexes that back UNIQUE/PRIMARY KEY constraints
ORDER BY pg_relation_size(s.indexrelid) DESC;
The stats_reset_time is critical context. If statistics were reset yesterday, zero scans might just mean the index has not been needed in the last 24 hours. You need weeks of data for a confident decision.
The limitation is that this only reflects the current node's workload. To verify across replicas:
-- Run on EACH replica
SELECT indexrelname, idx_scan
FROM pg_stat_user_indexes
WHERE indexrelname = 'idx_orders_customer_region';
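If you would rather gather replica stats from a single session than log in to each node, the dblink extension can run the same check remotely. This is a sketch: the connection string, host name, and role are placeholders for your own replicas, and it assumes CREATE EXTENSION dblink has been run.

```sql
-- Hypothetical example: pull index usage from one replica via dblink.
SELECT *
FROM dblink(
    'host=replica1.internal dbname=app user=monitor',
    $$SELECT indexrelname, idx_scan
      FROM pg_stat_user_indexes
      WHERE indexrelname = 'idx_orders_customer_region'$$
) AS remote(index_name name, scans bigint);
```

Repeat the call per replica (or UNION ALL them), and only treat an index as droppable when scans is zero on every node.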
Detecting Duplicate Indexes
ORM migrations frequently create indexes without checking for existing ones. Find duplicates by comparing column sets:
SELECT
    indrelid::regclass AS table_name,
    array_agg(indexrelid::regclass ORDER BY indexrelid) AS duplicate_indexes
FROM pg_index
GROUP BY
    indrelid, indkey, indclass,
    coalesce(indexprs::text, ''), coalesce(indpred::text, '')
HAVING count(*) > 1;
Two indexes on the same columns with different names serve no purpose. The duplicate wastes disk space and slows every write operation on the table.
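Before dropping either copy, check that the index does not back a constraint: an index created by a UNIQUE or PRIMARY KEY constraint has to be removed with ALTER TABLE ... DROP CONSTRAINT, not DROP INDEX. A quick check against pg_constraint (the index name is illustrative):

```sql
-- Returns a row if the index is owned by a constraint
SELECT conname, contype
FROM pg_constraint
WHERE conindid = 'idx_orders_customer_id'::regclass;
```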
The Safe Cleanup Priority Order
1. Duplicates first -- safe once you have confirmed the definitions are identical and neither backs a constraint:
-- Two indexes on orders(customer_id): drop the one not backing a unique constraint
-- (DROP INDEX CONCURRENTLY cannot run inside a transaction block)
DROP INDEX CONCURRENTLY idx_orders_customer_id_v2;
-- Space reclaimed: 1.2 GB
2. Unused indexes -- confirmed unused across ALL nodes, with 30+ days of statistics:
-- Unused on primary AND all replicas, stats window > 30 days
DROP INDEX CONCURRENTLY idx_orders_legacy_status;
-- Space reclaimed: 340 MB
3. Soft-disable when uncertain -- make an index invisible to the planner without dropping it. Caveat: this is a direct system-catalog update, requires superuser privileges, and is not an officially supported interface, so test it outside production first. While indisvalid is false the index is still maintained on writes, which is what makes re-enabling cheap:
-- "Soft disable" -- hide the index from the planner
UPDATE pg_index SET indisvalid = false
WHERE indexrelid = 'idx_orders_legacy_status'::regclass;
-- Re-enable if needed
UPDATE pg_index SET indisvalid = true
WHERE indexrelid = 'idx_orders_legacy_status'::regclass;
-- Optional: rebuild to be certain the index is fully consistent
REINDEX INDEX CONCURRENTLY idx_orders_legacy_status;
4. Overlapping indexes -- require more analysis. If you have both idx_orders_customer_id and idx_orders_customer_id_created_at, the single-column index is redundant for equality lookups (the composite index's leading column serves the same purpose). But the narrower index is smaller, so queries that scan many rows by customer_id alone touch fewer pages with it. Check actual query patterns before dropping either one.
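One low-risk way to check those query patterns: on a staging copy or throwaway clone, drop the index inside a transaction, inspect the plan, and roll back. Note that a plain (non-concurrent) DROP INDEX takes an ACCESS EXCLUSIVE lock on the table, so do not run this on a busy production primary. The table and query below are illustrative:

```sql
BEGIN;
DROP INDEX idx_orders_customer_id;  -- non-concurrent drop is allowed in a transaction
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders WHERE customer_id = 42;
ROLLBACK;                           -- the index is back, untouched
```

If the plan falls back to the composite index with comparable cost, the narrower index is a safe drop candidate; if it degrades to a sequential scan, keep it.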
Monitoring Best Practices
- Run index audits monthly at minimum, and always across the entire cluster
- Establish a team rule: no index is dropped without checking replica usage
- Track total index size as a percentage of table size -- as a rough heuristic, 30-50% is typical for OLTP tables
- If indexes consume more space than the table data itself, you almost certainly have redundancies worth cleaning up
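The index-to-table ratio from the last two bullets can be computed with pg_table_size and pg_indexes_size. A sketch limited to the public schema, largest index footprints first:

```sql
SELECT
    c.relname AS table_name,
    pg_size_pretty(pg_table_size(c.oid)) AS table_size,
    pg_size_pretty(pg_indexes_size(c.oid)) AS index_size,
    round(100.0 * pg_indexes_size(c.oid)
          / NULLIF(pg_table_size(c.oid), 0), 1) AS index_pct
FROM pg_class c
WHERE c.relkind = 'r'
  AND c.relnamespace = 'public'::regnamespace
ORDER BY pg_indexes_size(c.oid) DESC
LIMIT 20;
```

Rows where index_pct exceeds 100 are the first places to look for duplicates and unused indexes.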
The 5 minutes spent checking replica stats can save hours of incident response when an analytics dashboard goes down after an index drop.
Full article with cluster-aware monitoring examples: The Index Advisor That Checks Your Replicas Before Recommending a DROP