Python-T Point

Posted on Jun 7 • Originally published at pythontpoint.in

⚙️ Optimize MySQL queries with indexes for production environments made easy

#mysql #database #sql

🔎 Counterintuitive Truth — Adding an index can make a query slower when the optimizer prefers it over a better plan.

To optimize MySQL queries with indexes for production, you must understand why an index sometimes hurts performance and how the optimizer evaluates index statistics.

📑 Table of Contents

🔍 Understanding Indexes — Why They Matter
📈 Analyzing Query Plans — How to Read EXPLAIN
🔧 Interpreting the ‘type’ column
🔎 Spotting Missing Indexes
🛠 Index Design Patterns — When to Apply Composite Indexes
🔗 Covering Indexes
📂 Prefix Indexes
⚙️ Maintaining Indexes in Production — How to Monitor and Refresh
🟩 Final Thoughts
❓ Frequently Asked Questions
When should I use a composite index instead of two separate single‑column indexes?
How often should I run ANALYZE on large tables?
What is the impact of indexes on INSERT performance?
📚 References & Further Reading

🔍 Understanding Indexes — Why They Matter

Indexes are B‑tree structures that map column values to row locations. The B‑tree enables logarithmic search (O(log N)) instead of linear (O(N)) scans.
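To make the O(log N) versus O(N) gap concrete, here is a minimal Python sketch that counts comparisons for a linear scan and for a binary search over sorted keys. It only models the idea of an ordered index; it is not how InnoDB lays out B-tree pages, and the function names are made up for illustration.

```python
def linear_scan_steps(keys, target):
    """Count comparisons for a scan that checks keys one by one."""
    steps = 0
    for key in keys:
        steps += 1
        if key == target:
            break
    return steps


def binary_search_steps(sorted_keys, target):
    """Count comparisons for a binary search over sorted keys."""
    lo, hi = 0, len(sorted_keys) - 1
    steps = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        steps += 1
        if sorted_keys[mid] == target:
            break
        if sorted_keys[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return steps


keys = list(range(1_000_000))  # stand-in for one million indexed values
target = 999_999               # worst case for the linear scan

print(linear_scan_steps(keys, target))    # one comparison per key
print(binary_search_steps(keys, target))  # at most 20 for one million keys
```

A real B-tree has a much higher fan-out than a binary search, so the number of page reads is typically even smaller than the comparison count shown here.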

When a query filters on an indexed column, MySQL traverses the B‑tree to the matching leaf entries. In InnoDB, a secondary index leaf stores the primary key value, which is then used to fetch the full row from the clustered index. The cost reduction depends on the index’s selectivity: high cardinality (many distinct values) means fewer rows to examine per lookup.

CREATE INDEX idx_user_email ON users (email);
SHOW INDEX FROM users\G



Table: users
Non_unique: 1
Key_name: idx_user_email
Seq_in_index: 1
Column_name: email
Collation: A
Cardinality: 105432
Sub_part: NULL
Packed: NULL
Null:
Index_type: BTREE
Comment:
Index_comment:

In the output, Cardinality is an estimate of the number of distinct values in the index. The closer it is to the table’s row count, the more selective the index, and the fewer rows the optimizer expects each lookup to read.
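As a rough worked example, selectivity can be computed as cardinality divided by row count. The sketch below uses the cardinality from the output above; the table row count and the low-cardinality status column are hypothetical values chosen for illustration, and the per-value estimate assumes values are evenly distributed.

```python
def selectivity(cardinality, table_rows):
    """Fraction of distinct values: 1.0 means every row has its own value."""
    if table_rows <= 0:
        raise ValueError("table_rows must be positive")
    return cardinality / table_rows


def rows_per_value(cardinality, table_rows):
    """Average rows an equality lookup touches, assuming an even spread."""
    if cardinality <= 0:
        raise ValueError("cardinality must be positive")
    return table_rows / cardinality


TABLE_ROWS = 110_000  # hypothetical row count, not part of SHOW INDEX

email = selectivity(105_432, TABLE_ROWS)  # cardinality from the output above
status = selectivity(5, TABLE_ROWS)       # hypothetical low-cardinality column

print(f"email:  {email:.3f}")
print(f"status: {status:.5f}")
print(rows_per_value(5, TABLE_ROWS))      # average rows per status value
```

An index whose equality lookups return a large share of the table is a likely candidate for the optimizer to skip in favor of a scan.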

Key point: A well‑designed index turns a full scan into a logarithmic lookup, but only if the index’s selectivity matches the query’s filter.

📈 Analyzing Query Plans — How to Read EXPLAIN

EXPLAIN shows the access path the optimizer chose for each table in the query, along with its row estimates. EXPLAIN FORMAT=JSON adds the cost figures behind that choice.

$ mysql -u app -p -e "EXPLAIN SELECT * FROM orders WHERE user_id = 12345 AND status = 'shipped';"



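+----+-------------+--------+------+------------------------+-------------+---------+-------+------+-------------+
| id | select_type | table  | type | possible_keys          | key         | key_len | ref   | rows | Extra       |
+----+-------------+--------+------+------------------------+-------------+---------+-------+------+-------------+
|  1 | SIMPLE      | orders | ref  | idx_user_id,idx_status | idx_user_id | 4       | const |    1 | Using where |
+----+-------------+--------+------+------------------------+-------------+---------+-------+------+-------------+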

The type column is the access type (the manual calls it the join type). “ref” means an equality lookup on a non-unique index, whereas an index range scan appears as “range”. The key column reports which index was used, and rows estimates how many rows MySQL expects to read.

🔧 Interpreting the ‘type’ column

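The common values rank from most to least efficient: system, const, eq_ref, ref, range, index, ALL. “ALL” means a full table scan. The optimizer usually avoids it when a selective index exists, but it can still choose a scan when it estimates that reading most of the table through an index would cost more.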
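When reviewing many plans, it can help to rank the type values programmatically. This is a small sketch, assuming you have already copied (table, type) pairs out of EXPLAIN; the sample rows and helper names are hypothetical.

```python
# Common EXPLAIN access types, best first, in the order listed above.
ACCESS_TYPES = ["system", "const", "eq_ref", "ref", "range", "index", "ALL"]


def access_rank(access_type):
    """Return the position of an access type; lower is better."""
    return ACCESS_TYPES.index(access_type)


def needs_review(access_type):
    """Flag plans that read a whole index or a whole table."""
    return access_type in ("index", "ALL")


plan_rows = [  # hypothetical (table, type) pairs copied from EXPLAIN output
    ("orders", "ref"),
    ("users", "ALL"),
    ("items", "range"),
]

# Worst access type first, so the rows that need attention come out on top.
for table, access_type in sorted(plan_rows, key=lambda r: access_rank(r[1]), reverse=True):
    flag = "review" if needs_review(access_type) else "ok"
    print(f"{table:<8}{access_type:<8}{flag}")
```

A flagged row is a prompt to look closer, not proof of a problem: a full scan of a small table is often the cheapest plan.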

🔎 Spotting Missing Indexes

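If possible_keys lists an index but key is NULL, the optimizer considered the index and rejected it, usually because its selectivity is too low for the lookup to beat a scan. If possible_keys is also NULL, no index is usable for the predicate at all: either none exists on the filtered column, or the column is wrapped in a function (e.g., LOWER(col)) that prevents index use.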
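Why a function such as LOWER(col) defeats an index can be modeled in a few lines of Python: an index is ordered by the stored value, so a search key that has been transformed no longer lines up with that order. The data is made up, and Python compares strings by code point, which differs from MySQL collations (many of which are case-insensitive), so treat this as an analogy only.

```python
from bisect import bisect_left


def index_lookup(sorted_keys, needle):
    """Binary search, standing in for a B-tree equality lookup."""
    i = bisect_left(sorted_keys, needle)
    return i < len(sorted_keys) and sorted_keys[i] == needle


def scan_lookup(keys, needle, fn):
    """Apply fn to every key, standing in for a full scan."""
    return any(fn(key) == needle for key in keys)


raw = ["Bob@x.io", "alice@x.io", "Carol@x.io", "dave@x.io"]  # made-up data

plain_index = sorted(raw)                       # ordered by stored value
lowered_index = sorted(e.lower() for e in raw)  # ordered by lowercased value

print(index_lookup(plain_index, "bob@x.io"))            # False: order does not match
print(scan_lookup(plain_index, "bob@x.io", str.lower))  # True, but reads every key
print(index_lookup(lowered_index, "bob@x.io"))          # True
```

In MySQL 8.0.13 and later, a functional key part such as CREATE INDEX idx_email_lower ON users ((LOWER(email))) plays the role of lowered_index here; the index name is illustrative.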

Key point: EXPLAIN is the primary diagnostic tool for optimizing MySQL queries with indexes for production; it tells you whether an index is actually being used.

🛠 Index Design Patterns — When to Apply Composite Indexes

Composite indexes store multiple columns in a single B‑tree, following the left‑most prefix rule. This allows MySQL to filter on the leading column(s) and then use the remaining columns for additional predicates or ordering without extra lookups.

CREATE INDEX idx_orders_user_status ON orders (user_id, status);
EXPLAIN SELECT * FROM orders WHERE user_id = 12345 AND status = 'shipped' ORDER BY created_at DESC;



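+----+-------------+--------+------+------------------------+------------------------+---------+-------------+------+-----------------------------+
| id | select_type | table  | type | possible_keys          | key                    | key_len | ref         | rows | Extra                       |
+----+-------------+--------+------+------------------------+------------------------+---------+-------------+------+-----------------------------+
|  1 | SIMPLE      | orders | ref  | idx_orders_user_status | idx_orders_user_status | 5       | const,const |    1 | Using where; Using filesort |
+----+-------------+--------+------+------------------------+------------------------+---------+-------------+------+-----------------------------+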

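The key_len field (5 bytes) shows that both user_id and status contribute to the lookup, enabling a single index scan. That length is consistent with a 4-byte INT user_id plus a 1-byte status column such as an ENUM, both NOT NULL. “Using filesort” remains because created_at is not part of this index, so MySQL still sorts the matching rows.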
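The key_len arithmetic can be checked by hand. The sketch below assumes user_id is a NOT NULL INT and status is a NOT NULL ENUM with at most 255 members; the article does not show the table definition, so both column types are assumptions. A nullable key part adds one byte.

```python
# Storage sizes in bytes for a few fixed-width MySQL types.
# ENUM is 1 byte only when it has at most 255 members (assumed here).
TYPE_BYTES = {"TINYINT": 1, "SMALLINT": 2, "INT": 4, "BIGINT": 8, "ENUM": 1}


def key_len(columns):
    """Sum bytes for the used key parts; a nullable part adds one byte."""
    total = 0
    for type_name, nullable in columns:
        total += TYPE_BYTES[type_name] + (1 if nullable else 0)
    return total


# (type, nullable) for each key part the lookup uses
print(key_len([("INT", False), ("ENUM", False)]))  # both parts used
print(key_len([("INT", False)]))                   # only user_id used
print(key_len([("INT", True)]))                    # nullable INT
```

If EXPLAIN reports a key_len that matches only the first column, the lookup is using just the leading part of the composite index.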

🔗 Covering Indexes

A covering index includes every column the query references (in the SELECT list, the WHERE clause, and ORDER BY), allowing MySQL to satisfy the query from the index leaf nodes alone. Example:

CREATE INDEX idx_orders_user_status_created ON orders (user_id, status, created_at);
SELECT user_id, status, created_at FROM orders WHERE user_id = 12345 AND status = 'shipped' ORDER BY created_at DESC;

Because every referenced column is in the index, the engine avoids reading the base table, and EXPLAIN reports “Using index” in Extra. With equality filters on user_id and status, the matching entries are already ordered by created_at, so the “Using filesort” step from the previous plan also disappears.
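Whether an index covers a query comes down to a subset check, sketched below. The helper name and the extra columns total and id are hypothetical. InnoDB secondary indexes also carry the primary key columns, so the sketch counts them as available.

```python
def is_covering(index_columns, query_columns, primary_key=()):
    """True when every column the query touches is stored in the index.

    InnoDB secondary indexes also store the primary key columns,
    so those count as available too.
    """
    available = set(index_columns) | set(primary_key)
    return set(query_columns) <= available


index = ("user_id", "status", "created_at")  # idx_orders_user_status_created

print(is_covering(index, {"user_id", "status", "created_at"}))        # covered
print(is_covering(index, {"user_id", "status", "total"}))             # not covered
print(is_covering(index, {"user_id", "id"}, primary_key=("id",)))     # covered via PK
```

This is also why SELECT * rarely benefits from a covering index: it references every column in the table.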

📂 Prefix Indexes

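For long VARCHAR columns, a prefix index (e.g., email(20)) stores only the first N characters of each value (N bytes for binary string types). This shrinks the index while preserving enough selectivity for typical queries, provided the prefix is long enough to tell most values apart.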

CREATE INDEX idx_email_prefix ON users (email(20));

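According to the MySQL 8.0 Reference Manual, prefix lengths are given in characters for nonbinary string types and in bytes for binary string types. The smaller index fits more entries per page, which improves cache efficiency for long text columns. The trade-off is that a prefix index cannot cover a query on that column, because the full value is stored only in the row.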
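One way to pick a prefix length is to compare how many distinct prefixes survive at each length against the number of distinct full values. Here is a minimal sketch on a made-up sample; on a real table the same ratio can be computed with COUNT(DISTINCT LEFT(email, N)) / COUNT(DISTINCT email).

```python
def prefix_selectivity(values, length):
    """Distinct prefixes divided by distinct full values (1.0 is ideal)."""
    distinct_full = len(set(values))
    distinct_prefix = len({v[:length] for v in values})
    return distinct_prefix / distinct_full


emails = [  # made-up sample
    "alice@example.com",
    "alice@example.org",
    "alina@example.com",
    "bob@example.com",
    "bobby@example.com",
]

for length in (3, 5, 10, 20):
    print(length, round(prefix_selectivity(emails, length), 2))
```

The shortest length whose ratio is close to 1.0 is a reasonable starting point; anything longer mostly adds index size.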

Index Type	Structure	Typical Use‑Case	Pros	Cons
BTREE	Balanced tree	Range queries, equality	Ordered; supports range and leftmost‑prefix lookups	Extra storage and write cost per index
HASH	Hash table (MEMORY engine)	Exact equality	O(1) average lookup	No range or ORDER BY support
FULLTEXT	Inverted index	Text search	Natural language queries	InnoDB and MyISAM only; CHAR, VARCHAR, and TEXT columns

Key point: Composite and covering indexes are the most effective tools for optimizing MySQL queries in production when queries filter on multiple columns or need to avoid extra table lookups.

⚙️ Maintaining Indexes in Production — How to Monitor and Refresh

Regularly updating statistics and pruning unused indexes prevents performance regression as data distribution changes.

$ mysql -u admin -p prod -e "ANALYZE TABLE orders;"

+-------------+---------+----------+----------+
| Table       | Op      | Msg_type | Msg_text |
+-------------+---------+----------+----------+
| prod.orders | analyze | status   | OK       |
+-------------+---------+----------+----------+

For InnoDB tables, ANALYZE TABLE re-estimates index cardinality by sampling index pages, and the optimizer uses those estimates for its row and cost calculations.

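$ mysql -u admin -p -e "SELECT COUNT_STAR FROM performance_schema.table_io_waits_summary_by_index_usage WHERE OBJECT_SCHEMA='prod' AND OBJECT_NAME='orders' AND INDEX_NAME='idx_unused';"

+------------+
| COUNT_STAR |
+------------+
|          0 |
+------------+

COUNT_STAR counts the I/O operations recorded against the index since the server last started. When an index shows zero usage over a monitoring interval that covers your full workload, including periodic batch jobs, drop it to reduce write overhead. On MySQL 8.0 you can first mark it invisible with ALTER TABLE orders ALTER INDEX idx_unused INVISIBLE and confirm that no query plan regresses.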

DROP INDEX idx_unused ON orders;

Removing unused indexes frees disk space and speeds up INSERT/UPDATE operations because fewer index trees need to be maintained.

Key point: Continuous monitoring, periodic ANALYZE, and selective removal keep the index set optimal for production workloads.

🟩 Final Thoughts

Effective index management is a cycle of measurement, analysis, and refinement. The optimizer’s decisions rely on up‑to‑date statistics; stale statistics can render even a perfectly chosen index ineffective.

For production environments, automate the collection of EXPLAIN output, schedule regular ANALYZE, and integrate index‑usage metrics into your observability stack. This systematic approach ensures each index continues to deliver its intended performance benefit.

❓ Frequently Asked Questions

When should I use a composite index instead of two separate single‑column indexes?

A composite index is beneficial when queries filter on the leading column(s) and also need filtering, ordering, or covering on the subsequent columns. The optimizer can satisfy both predicates in a single index scan, whereas with separate indexes it must either pick one index and filter the remaining rows, or merge two index scans, and both options usually read more rows.
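The leftmost-prefix rule behind this answer can be sketched as a short loop. This is a simplified model: it considers only equality predicates and ignores range conditions and optimizations such as skip scan. The helper name is hypothetical.

```python
def usable_prefix(index_columns, equality_columns):
    """Return the leading index columns that equality filters can use.

    Matching stops at the first index column with no equality predicate.
    """
    used = []
    for column in index_columns:
        if column not in equality_columns:
            break
        used.append(column)
    return used


index = ["user_id", "status"]  # column order of idx_orders_user_status

print(usable_prefix(index, {"user_id", "status"}))  # both columns
print(usable_prefix(index, {"user_id"}))            # leading column only
print(usable_prefix(index, {"status"}))             # nothing usable
```

The last case is the one that surprises people: a filter on status alone cannot seek into an index that starts with user_id, so column order in a composite index should follow your most common filters.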

How often should I run ANALYZE on large tables?

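Run ANALYZE after any bulk data load, after a major schema change, or on a schedule that matches data volatility: typically daily for high‑write tables and weekly for relatively static tables. InnoDB also recalculates persistent statistics on its own once roughly 10% of a table’s rows have changed (innodb_stats_auto_recalc, enabled by default), so scheduled runs mainly help after large, sudden shifts in data distribution.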

What is the impact of indexes on INSERT performance?

Each INSERT adds an entry to every index on the table, and each UPDATE modifies every index that contains a changed column. The cost scales with the number of indexes and their size; unnecessary indexes increase write latency and I/O.
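As a back-of-the-envelope model of that write cost, the sketch below counts index entries written per batch of inserts. It assumes an InnoDB table, where the clustered index holds the rows themselves, and it ignores page splits, the change buffer, and redo logging, so the numbers show relative growth only.

```python
def index_entries_written(secondary_indexes, rows_inserted):
    """Entries written: one per row in the clustered index plus one per
    row in each secondary index."""
    if secondary_indexes < 0 or rows_inserted < 0:
        raise ValueError("arguments must be non-negative")
    return rows_inserted * (1 + secondary_indexes)


ROWS = 10_000  # hypothetical batch size

for count in (0, 2, 6):
    print(count, index_entries_written(count, ROWS))
```

Going from two to six secondary indexes more than doubles the entries written for the same batch, which is why pruning unused indexes pays off on write-heavy tables.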

📚 References & Further Reading

MySQL 8.0 Reference Manual — index design and optimizer behavior: dev.mysql.com
MySQL 8.0 Reference Manual — EXPLAIN statement details: dev.mysql.com
MySQL 8.0 Reference Manual — ANALYZE TABLE syntax and effects: dev.mysql.com
