DEV Community

郑沛沛
郑沛沛

Posted on

Database Indexing Explained: How to Make Your Queries 1000x Faster

Slow queries are the #1 performance killer. Understanding indexes is the fastest way to fix them. Here's a practical guide.

What Is an Index?

An index is like a book's table of contents. Without it, the database scans every row (full table scan). With it, the database jumps directly to matching rows.

-- Without index: scans all 10 million rows
SELECT * FROM orders WHERE customer_id = 12345;
-- Time: 3200ms

-- With index: jumps to matching rows
CREATE INDEX idx_orders_customer ON orders(customer_id);
-- Same query now: 2ms
Enter fullscreen mode Exit fullscreen mode

Types of Indexes

B-Tree (Default)

Best for equality and range queries:

CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_orders_date ON orders(created_at);

-- These queries use the index:
SELECT * FROM users WHERE email = 'alice@example.com';
SELECT * FROM orders WHERE created_at > '2024-01-01';
SELECT * FROM orders WHERE created_at BETWEEN '2024-01-01' AND '2024-06-30';
Enter fullscreen mode Exit fullscreen mode

Composite Indexes

Multiple columns — order matters:

CREATE INDEX idx_orders_status_date ON orders(status, created_at);

-- Uses the index (matches left-to-right):
SELECT * FROM orders WHERE status = 'shipped';
SELECT * FROM orders WHERE status = 'shipped' AND created_at > '2024-01-01';

-- Does NOT use the index efficiently:
SELECT * FROM orders WHERE created_at > '2024-01-01';  -- skips first column
Enter fullscreen mode Exit fullscreen mode

Partial Indexes

Index only rows that matter:

-- Only index active users (much smaller index)
CREATE INDEX idx_active_users ON users(email) WHERE is_active = true;

-- Only index recent orders
CREATE INDEX idx_recent_orders ON orders(created_at) WHERE created_at > '2024-01-01';
Enter fullscreen mode Exit fullscreen mode

GIN Index (for arrays and JSONB)

CREATE INDEX idx_tags ON posts USING GIN(tags);
SELECT * FROM posts WHERE tags @> ARRAY['python', 'tutorial'];
Enter fullscreen mode Exit fullscreen mode

EXPLAIN ANALYZE: Your Best Friend

EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = 12345;

-- Output:
-- Index Scan using idx_orders_customer on orders
--   Index Cond: (customer_id = 12345)
--   Planning Time: 0.1ms
--   Execution Time: 0.8ms
Enter fullscreen mode Exit fullscreen mode

Watch for these red flags:

  • Seq Scan on large tables = missing index
  • Sort with high cost = consider index on sort column
  • Nested Loop with many rows = might need a join index

Common Anti-Patterns

-- BAD: Function on indexed column prevents index use
SELECT * FROM users WHERE LOWER(email) = 'alice@example.com';

-- GOOD: Use expression index
CREATE INDEX idx_users_email_lower ON users(LOWER(email));

-- BAD: Leading wildcard can't use index
SELECT * FROM users WHERE name LIKE '%alice%';

-- GOOD: Prefix search uses index
SELECT * FROM users WHERE name LIKE 'alice%';

-- BAD: OR can prevent index use
SELECT * FROM orders WHERE status = 'shipped' OR total > 1000;

-- GOOD: Use UNION
SELECT * FROM orders WHERE status = 'shipped'
UNION ALL
SELECT * FROM orders WHERE total > 1000 AND status != 'shipped';
Enter fullscreen mode Exit fullscreen mode

Index Maintenance

-- Check index usage
SELECT indexrelname, idx_scan, idx_tup_read
FROM pg_stat_user_indexes
ORDER BY idx_scan ASC;

-- Find unused indexes (candidates for removal)
SELECT indexrelname, idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0 AND indexrelname NOT LIKE '%pkey%';

-- Check index size
SELECT pg_size_pretty(pg_indexes_size('orders'));
Enter fullscreen mode Exit fullscreen mode

Python: Checking Query Performance

import time

async def profile_query(db, query, params=None):
    # Run EXPLAIN ANALYZE
    explain = await db.fetch_one(f"EXPLAIN ANALYZE {query}", params)
    print(explain)

    # Time the actual query
    start = time.perf_counter()
    result = await db.fetch_all(query, params)
    elapsed = time.perf_counter() - start
    print(f"Query returned {len(result)} rows in {elapsed:.3f}s")
    return result
Enter fullscreen mode Exit fullscreen mode

Key Takeaways

  1. Always index columns used in WHERE, JOIN, and ORDER BY
  2. Composite index column order matters — most selective first
  3. Use EXPLAIN ANALYZE to verify index usage
  4. Partial indexes save space and improve performance
  5. Remove unused indexes — they slow down writes

6. Functions on indexed columns prevent index use — use expression indexes

🚀 Level up your AI workflow! Check out my AI Developer Mega Prompt Pack — 80 battle-tested prompts for developers. $9.99

Top comments (0)