Most database performance problems aren't database problems — they're query problems, index problems, or architecture problems that manifest as database slowdowns. Here's how to build a systematic approach to database performance.
The Performance Investigation Stack
Before optimizing anything, understand where time is actually being spent:
Application Layer
ORM-generated queries (N+1 problem)
Missing connection pooling
Unnecessary round trips
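The N+1 problem above can be sketched with Python's built-in sqlite3 module; the users/orders tables and column names here are hypothetical, but the shape is the same one ORMs produce:

```python
import sqlite3

# In-memory demo of the N+1 pattern with hypothetical users/orders tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, total REAL);
    INSERT INTO users VALUES (1, 'a'), (2, 'b');
    INSERT INTO orders VALUES (1, 1, 10.0), (2, 1, 5.0), (3, 2, 7.5);
""")

# N+1: one query for the users, then one extra query per user.
users = conn.execute("SELECT id, name FROM users").fetchall()
n_plus_1 = {name: conn.execute(
    "SELECT COUNT(*) FROM orders WHERE user_id = ?", (uid,)).fetchone()[0]
    for uid, name in users}  # issues 1 + len(users) round trips

# Fix: a single JOIN with aggregation, one round trip total.
single = dict(conn.execute("""
    SELECT u.name, COUNT(o.id)
    FROM users u LEFT JOIN orders o ON o.user_id = u.id
    GROUP BY u.id
""").fetchall())

assert n_plus_1 == single  # same answer, N fewer queries
```

With an ORM, the fix usually means asking it to eager-load the association instead of hand-writing the JOIN.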
Query Layer
Full table scans
Missing indexes
Inefficient JOINs
Unoptimized LIKE patterns
Infrastructure Layer
Disk I/O contention
Memory pressure
Network latency
CPU saturation
Query Optimization Fundamentals
Reading Query Plans
PostgreSQL: EXPLAIN ANALYZE
MySQL: EXPLAIN
MongoDB: explain()
Look for:
Seq Scan (usually bad: a full table scan, acceptable only on small tables)
Nested Loop on large datasets (can be expensive)
Large gaps between estimated and actual row counts (stale statistics; run ANALYZE)
Nodes that dominate the actual time in EXPLAIN ANALYZE output
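The scan-vs-index distinction can be seen with SQLite's EXPLAIN QUERY PLAN, a lightweight analogue of PostgreSQL's EXPLAIN ANALYZE; the events table and index name are hypothetical, and the exact plan text varies slightly by SQLite version:

```python
import sqlite3

# Hypothetical events table; compare the plan before and after indexing.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (id INTEGER PRIMARY KEY, user_id INTEGER, kind TEXT)")

plan_before = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM events WHERE user_id = 42").fetchall()
# Without an index the detail column reports a full scan of events.

conn.execute("CREATE INDEX idx_events_user ON events(user_id)")
plan_after = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM events WHERE user_id = 42").fetchall()
# With the index the plan becomes a SEARCH ... USING INDEX on idx_events_user.
```

The habit transfers directly: run the plan command on any query you suspect, and look for scan nodes where you expected index use.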
Index Strategy
Not all indexes are created equal.
B-tree indexes — Default. Best for equality and range queries on sortable data.
Partial indexes — Only index rows matching a condition. Example: WHERE is_active = true only indexes active rows.
Composite indexes — Column order matters. A query can only use a leftmost prefix of the index, so lead with the columns your queries filter on with equality, then range columns; the wrong order makes the index useless for some queries.
Covering indexes — Include all columns needed by the query so the database never touches the table. Example: CREATE INDEX idx ON orders(user_id, created_at) INCLUDE (total_amount) allows index-only scans.
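The partial-index idea can be demonstrated with SQLite, which shares PostgreSQL's CREATE INDEX ... WHERE syntax; the accounts table and index name here are hypothetical:

```python
import sqlite3

# Partial index sketch: only rows matching the WHERE clause are indexed,
# keeping the index small and hot.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE accounts (
    id INTEGER PRIMARY KEY, email TEXT, is_active INTEGER)""")
conn.execute("""CREATE INDEX idx_active_email
                ON accounts(email) WHERE is_active = 1""")

# A query whose predicate implies is_active = 1 can use the partial index...
plan_hit = conn.execute("""EXPLAIN QUERY PLAN
    SELECT id FROM accounts
    WHERE email = 'a@b.c' AND is_active = 1""").fetchall()

# ...while a query over inactive rows cannot, and falls back to a scan.
plan_miss = conn.execute("""EXPLAIN QUERY PLAN
    SELECT id FROM accounts
    WHERE email = 'a@b.c' AND is_active = 0""").fetchall()
```

In PostgreSQL the same trick shines on soft-delete and status columns where queries almost always target one value.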
Common Anti-Patterns
SELECT * — Pull only the columns you need
Implicit type coercion — WHERE phone = 5551234 when phone is VARCHAR kills index usage
Functions on indexed columns — WHERE YEAR(created_at) = 2026 can't use the index
Pagination without cursor — OFFSET 10000 reads 10,000 rows then discards them
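The cursor (keyset) alternative to OFFSET can be sketched in a few lines of sqlite3; the items table is hypothetical:

```python
import sqlite3

# Keyset pagination sketch: remember the last key seen and filter past it,
# so the database walks the index straight to the page instead of reading
# and discarding every skipped row as OFFSET does.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO items VALUES (?, ?)",
                 [(i, f"item-{i}") for i in range(1, 101)])

def page_after(cursor_id, limit=10):
    # WHERE id > ? uses the primary-key index; cost stays flat per page.
    return conn.execute(
        "SELECT id, name FROM items WHERE id > ? ORDER BY id LIMIT ?",
        (cursor_id, limit)).fetchall()

first = page_after(0)
second = page_after(first[-1][0])  # pass the last id back as the cursor
```

The trade-off: keyset pagination needs a stable, indexed sort key and cannot jump to an arbitrary page number.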
Performance Monitoring Stack
Open Source Tools
pg_stat_statements (PostgreSQL) — Tracks query statistics. Find the slowest and most frequent queries.
SELECT query, calls, total_exec_time, mean_exec_time, rows
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
MySQL Performance Schema — Similar functionality for MySQL.
pt-query-digest (Percona Toolkit) — Analyzes slow query logs across multiple servers.
Key Metrics to Track
| Metric | Healthy | Warning | Critical |
| --- | --- | --- | --- |
| Query latency (p99) | < 100ms | 100-500ms | > 500ms |
| Connection usage | < 50% | 50-80% | > 80% |
| Buffer cache hit ratio | > 95% | 90-95% | < 90% |
| Replication lag | < 1s | 1-10s | > 10s |
Caching Strategy
Application-Level Caching
Cache expensive aggregation queries (user counts, dashboard metrics)
Use cache-aside pattern: read from cache first, populate on miss
Set appropriate TTLs — don't cache forever
Database-Level Caching
Redis for session data, hot data, rate limiting
Materialized views for pre-computed aggregations
Read replicas to offload read traffic
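The materialized-view idea can be emulated in SQLite (which lacks them) with a summary table refreshed on demand; in PostgreSQL this is CREATE MATERIALIZED VIEW plus REFRESH MATERIALIZED VIEW. The orders/order_totals tables are hypothetical:

```python
import sqlite3

# Materialized-view sketch: precompute an aggregate into a small table
# so dashboard reads never run the expensive GROUP BY.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, total REAL);
    INSERT INTO orders VALUES (1, 1, 10.0), (2, 1, 5.0), (3, 2, 7.5);
    CREATE TABLE order_totals (user_id INTEGER PRIMARY KEY, total REAL);
""")

def refresh_order_totals():
    # Recompute the aggregate once, on a schedule or after bulk writes.
    conn.executescript("""
        DELETE FROM order_totals;
        INSERT INTO order_totals
        SELECT user_id, SUM(total) FROM orders GROUP BY user_id;
    """)

refresh_order_totals()
totals = dict(conn.execute("SELECT user_id, total FROM order_totals"))
```

The design choice is staleness for speed: reads are cheap, but the view is only as fresh as its last refresh.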
Schema Design for Performance
Normalization vs Denormalization
Start normalized. Denormalize only when you have measured evidence.
Signs you might need denormalization:
Same data joined in > 50% of queries
Complex aggregation queries causing CPU spikes
Read/write ratio > 100:1
Partitioning
PostgreSQL supports declarative range, list, and hash partitioning. MongoDB distributes data via shard keys.
Partition when:
Tables exceed 100GB
Index size exceeds available RAM
Bulk deletes are frequent (partition drop is instant)
Content Strategy for Team Education
If you're responsible for keeping your team sharp on database performance:
Week 1-2: Fundamentals
Query plan reading workshop
Index types and when to use each
Common anti-patterns walkthrough
Week 3-4: Deep Dives
Slow query analysis sessions on real queries
Schema review for new features
Performance checks in the deployment pipeline
Ongoing: Culture
Database performance as a first-class engineering concern
Query review in code review process
Monthly performance audit of top 10 slowest queries
The goal is making every engineer understand why indexes matter, how query plans work, and when to ask for help.