Slow queries are the #1 performance killer. Understanding indexes is the fastest way to fix them. Here's a practical guide.
What Is an Index?
An index is like a book's table of contents. Without it, the database scans every row (full table scan). With it, the database jumps directly to matching rows.
-- Without index: scans all 10 million rows
SELECT * FROM orders WHERE customer_id = 12345;
-- Time: 3200ms
-- With index: jumps to matching rows
CREATE INDEX idx_orders_customer ON orders(customer_id);
-- Same query now: 2ms
Types of Indexes
B-Tree (Default)
Best for equality and range queries:
CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_orders_date ON orders(created_at);
-- These queries use the index:
SELECT * FROM users WHERE email = 'alice@example.com';
SELECT * FROM orders WHERE created_at > '2024-01-01';
SELECT * FROM orders WHERE created_at BETWEEN '2024-01-01' AND '2024-06-30';
Composite Indexes
Multiple columns — order matters:
CREATE INDEX idx_orders_status_date ON orders(status, created_at);
-- Uses the index (matches left-to-right):
SELECT * FROM orders WHERE status = 'shipped';
SELECT * FROM orders WHERE status = 'shipped' AND created_at > '2024-01-01';
-- Does NOT use the index efficiently:
SELECT * FROM orders WHERE created_at > '2024-01-01'; -- skips first column
Partial Indexes
Index only rows that matter:
-- Only index active users (much smaller index)
CREATE INDEX idx_active_users ON users(email) WHERE is_active = true;
-- Only index recent orders
CREATE INDEX idx_recent_orders ON orders(created_at) WHERE created_at > '2024-01-01';
GIN Index (for arrays and JSONB)
CREATE INDEX idx_tags ON posts USING GIN(tags);
SELECT * FROM posts WHERE tags @> ARRAY['python', 'tutorial'];
EXPLAIN ANALYZE: Your Best Friend
EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = 12345;
-- Output:
-- Index Scan using idx_orders_customer on orders
-- Index Cond: (customer_id = 12345)
-- Planning Time: 0.1ms
-- Execution Time: 0.8ms
Watch for these red flags:
-
Seq Scanon large tables = missing index -
Sortwith high cost = consider index on sort column -
Nested Loopwith many rows = might need a join index
Common Anti-Patterns
-- BAD: Function on indexed column prevents index use
SELECT * FROM users WHERE LOWER(email) = 'alice@example.com';
-- GOOD: Use expression index
CREATE INDEX idx_users_email_lower ON users(LOWER(email));
-- BAD: Leading wildcard can't use index
SELECT * FROM users WHERE name LIKE '%alice%';
-- GOOD: Prefix search uses index
SELECT * FROM users WHERE name LIKE 'alice%';
-- BAD: OR can prevent index use
SELECT * FROM orders WHERE status = 'shipped' OR total > 1000;
-- GOOD: Use UNION
SELECT * FROM orders WHERE status = 'shipped'
UNION ALL
SELECT * FROM orders WHERE total > 1000 AND status != 'shipped';
Index Maintenance
-- Check index usage
SELECT indexrelname, idx_scan, idx_tup_read
FROM pg_stat_user_indexes
ORDER BY idx_scan ASC;
-- Find unused indexes (candidates for removal)
SELECT indexrelname, idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0 AND indexrelname NOT LIKE '%pkey%';
-- Check index size
SELECT pg_size_pretty(pg_indexes_size('orders'));
Python: Checking Query Performance
import time
async def profile_query(db, query, params=None):
# Run EXPLAIN ANALYZE
explain = await db.fetch_one(f"EXPLAIN ANALYZE {query}", params)
print(explain)
# Time the actual query
start = time.perf_counter()
result = await db.fetch_all(query, params)
elapsed = time.perf_counter() - start
print(f"Query returned {len(result)} rows in {elapsed:.3f}s")
return result
Key Takeaways
- Always index columns used in WHERE, JOIN, and ORDER BY
- Composite index column order matters — most selective first
- Use EXPLAIN ANALYZE to verify index usage
- Partial indexes save space and improve performance
- Remove unused indexes — they slow down writes
6. Functions on indexed columns prevent index use — use expression indexes
🚀 Level up your AI workflow! Check out my AI Developer Mega Prompt Pack — 80 battle-tested prompts for developers. $9.99
Top comments (0)