Gowtham Potureddi

Posted on May 23

MySQL Interview Questions & Answers: Top Patterns for Data Engineers

#python #sql #interview #dataengineering

mysql interview questions circle around six themes every loop tests: storage engines (InnoDB vs MyISAM), indexing strategy (B-tree, composite, covering, prefix), query optimization with EXPLAIN, MySQL-specific syntax (LIMIT … OFFSET, IFNULL, GROUP_CONCAT, ON DUPLICATE KEY UPDATE), transactional semantics (InnoDB locks, isolation levels, gap locks), and operational realism (replication, partitioning, JSON columns since 5.7). Whether you're prepping for mysql interview questions and answers at a startup or grinding mysql interview questions for data engineer at FAANG, the same patterns show up — and the mysql interview questions and answers for experienced roster adds replication topology, performance tuning, and the InnoDB lock manager on top.

This guide walks through every theme in the mysql interview questions and answers pdf ecosystem that reviewers love to test in data engineering interview questions: the InnoDB vs MyISAM storage-engine comparison, the mysql indexing strategy menu (B-tree, composite, prefix, covering), EXPLAIN plan reading for query optimization, MySQL-specific operators (LIMIT, IFNULL, GROUP_CONCAT, ON DUPLICATE KEY UPDATE), the mysql transactions and lock-manager model, JSON-column patterns since MySQL 5.7+, and replication / partitioning patterns that come up in senior loops. Every section ends as sql interview questions with answers: a runnable MySQL query, a traced execution, an output table, and a concept-by-concept why this works breakdown — the exact shape sql for data engineers rounds reward when MySQL comes up.

When you want hands-on reps immediately after reading, browse SQL practice library →, drill the database problems →, sharpen indexing SQL drills →, rehearse aggregation SQL drills →, or widen coverage on the full SQL practice library →.

On this page

Why MySQL interview questions test the same six themes
InnoDB vs MyISAM — storage engines that reviewers compare
MySQL indexing strategy — B-tree, composite, covering, prefix
Query optimization with EXPLAIN — reading the MySQL plan
MySQL-specific syntax — LIMIT, IFNULL, GROUP_CONCAT, ON DUPLICATE KEY
Transactions and the InnoDB lock manager
JSON columns, partitioning, and replication — senior-level themes
Choosing the right MySQL pattern (cheat sheet)
Frequently asked questions
Practice on PipeCode

1. Why MySQL interview questions test the same six themes

MySQL interviews stay focused — and the six-theme map covers 90% of the prompts

The one-sentence invariant: mysql interview questions recycle the same six themes (storage engines, indexing, query optimization, MySQL-specific syntax, transactions, ops realism) regardless of seniority — the mysql interview questions for freshers deck cuts the depth in each but covers the same surface, while mysql interview questions and answers for experienced adds replication, partitioning, and performance tuning on top. Once you know the six themes, every prompt becomes "which theme is the reviewer probing?"

The six themes at a glance.

Storage engines — InnoDB (transactional, default) vs MyISAM (legacy, non-transactional) vs MEMORY (temp tables); the innodb vs myisam comparison is the #1 most-asked.
Indexing strategy — B-tree (default), composite (multi-column), covering (all SELECT columns in index), prefix (first N chars of VARCHAR); MySQL's index byte-budget makes prefix indexes a frequent topic.
Query optimization with EXPLAIN — reading the plan, understanding access methods (const, eq_ref, ref, range, index, ALL), spotting full-table scans.
MySQL-specific syntax — LIMIT N OFFSET M, IFNULL, GROUP_CONCAT, ON DUPLICATE KEY UPDATE, STR_TO_DATE, DATE_FORMAT (see Blog74).
Transactions and locks — isolation levels, gap locks, deadlocks, FOR UPDATE / LOCK IN SHARE MODE.
Operational topics — replication topology, read replicas, partitioning, JSON columns since 5.7, the 8.0 window-function rollout.

What interviewers listen for.

Do you name InnoDB as the default since MySQL 5.5 and explain why MyISAM is rarely the right choice now? — basic-but-tested.
Do you reach for EXPLAIN before guessing when asked to diagnose a slow query? — senior signal.
Do you mention gap locks when the conversation turns to deadlocks in REPEATABLE READ (MySQL's default isolation level)? — bonus points.
Do you know that MySQL added window functions in 8.0 and CTEs in 8.0 too? — version awareness.

The mysql interview questions and answers archetype.

Junior / fresher prompt — "What's the difference between WHERE and HAVING?" (see Blog73), "What are the SQL data types in MySQL?", "What is a primary key?".
Mid-level prompt — "Explain ON DUPLICATE KEY UPDATE", "When would you use MyISAM over InnoDB?", "Show me a GROUP_CONCAT query".
Senior prompt — "Diagnose this slow query using EXPLAIN", "Explain MySQL's gap-lock behaviour at REPEATABLE READ", "How would you set up read replicas and handle replication lag?".

Worked example — a single MySQL query that hits four of the six themes

Detailed explanation. Real interview prompts often combine themes — indexing strategy, MySQL-specific syntax, query optimization, and transactions all in one shape. The query below ingests new orders idempotently using ON DUPLICATE KEY UPDATE, formats the output with IFNULL and GROUP_CONCAT, and would benefit from a composite index on (customer_id, order_date).

Question. From orders(order_id PRIMARY KEY, customer_id, order_date, amount, status) and customers(customer_id PRIMARY KEY, name), write a MySQL query that returns per customer the comma-separated list of their order IDs from the last 30 days, total amount, and the literal 'no orders' when they have none.

Input. orders and customers slices.

order_id	customer_id	order_date	amount	status
1	42	2026-05-20	100	paid
2	42	2026-05-22	150	paid
3	17	2026-05-21	80	paid

customer_id	name
42	Alice
17	Bob
88	Carol

Code (MySQL).

SELECT c.customer_id,
       c.name,
       IFNULL(GROUP_CONCAT(o.order_id ORDER BY o.order_date), 'no orders') AS recent_orders,
       IFNULL(SUM(o.amount), 0) AS total_amount
FROM customers c
LEFT JOIN orders o
       ON o.customer_id = c.customer_id
      AND o.order_date >= CURDATE() - INTERVAL 30 DAY
GROUP BY c.customer_id, c.name
ORDER BY total_amount DESC;

Step-by-step explanation.

LEFT JOIN orders keeps every customer; matching is filtered by date in the ON clause (so customers with no recent orders still appear).
GROUP_CONCAT(o.order_id ORDER BY o.order_date) builds a comma-separated list of order IDs per customer.
IFNULL(GROUP_CONCAT(…), 'no orders') returns 'no orders' when the LEFT JOIN produced no matches.
IFNULL(SUM(o.amount), 0) returns 0 instead of NULL for customers with no orders.
CURDATE() - INTERVAL 30 DAY is the MySQL idiom for "30 days ago" (see Blog74 §7).
GROUP BY c.customer_id, c.name collapses to one row per customer.

Output.

customer_id	name	recent_orders	total_amount
42	Alice	1,2	250
17	Bob	3	80
88	Carol	no orders	0

Rule of thumb: MySQL interview questions reward fluency with IFNULL, GROUP_CONCAT, INTERVAL arithmetic, and ON DUPLICATE KEY UPDATE — the four MySQL-specific operators reviewers test most.

SQL
Topic — sql
SQL practice library

Practice →

SQL
Topic — database
Database SQL problems

Practice →

2. InnoDB vs MyISAM — storage engines that reviewers compare

`innodb vs myisam` — the #1 most-asked MySQL interview question

The innodb vs myisam comparison is the single most-asked mysql interview questions and answers prompt. The short answer is InnoDB for every new table since MySQL 5.5; the long answer surfaces every guarantee that distinguishes a modern transactional engine from a legacy non-transactional one.

InnoDB — the default since MySQL 5.5.

ACID-compliant — full transaction support (BEGIN / COMMIT / ROLLBACK); see Blog79 §6.
Row-level locking — concurrent writers don't block each other unless they touch the same row.
Foreign keys — FOREIGN KEY constraints fully enforced (the only MySQL engine that supports them).
MVCC (Multi-Version Concurrency Control) — readers don't block writers; writers don't block readers.
Crash recovery — write-ahead log (redo log) + double-write buffer; recovers cleanly after crashes.
Clustered primary key — table data physically organised by the primary key; secondary indexes point to the PK, not a rowid.

MyISAM — legacy, mostly removed in modern stacks.

No transactions — BEGIN / COMMIT are no-ops; every write is auto-committed.
Table-level locking — concurrent writers block each other; concurrent reads + a single writer can coexist but writes serialize.
No foreign keys — FOREIGN KEY constraints silently ignored.
Faster full-table COUNT(*) — historically because MyISAM stored the row count; modern InnoDB also tracks this approximately.
Faster bulk loads — historically; the gap has narrowed substantially with modern InnoDB.
Use cases that survive — read-only / read-mostly tables with no transactional needs; legacy systems where migration is risky.

The innodb vs myisam decision table.

Capability	InnoDB	MyISAM
Transactions	Yes (full ACID)	No
Row-level locks	Yes	No (table-level)
Foreign keys	Yes	No
Crash recovery	Yes (redo log + double-write)	Minimal
Concurrent writes	Yes (MVCC)	Block on table lock
Clustered PK	Yes	No (heap)
Full-text search	Yes (5.6+)	Yes (historically faster)
Storage overhead	Higher	Lower
Best for	OLTP, transactional, every new table	Read-only legacy tables only

MEMORY and Archive storage engines — the niche players.

MEMORY — table lives in RAM; lost on restart; useful for session tables and temporary lookups.
Archive — compressed, append-only; for log / audit data that's rarely queried.
NDB / MySQL Cluster — distributed; rare outside specific architectures.

When MyISAM is the right call (rarely).

Read-only or read-mostly lookups where transactional guarantees are not needed.
Disk-space constrained environments where InnoDB's overhead matters.
Legacy schemas where migration to InnoDB requires application-layer changes.
Modern interview answer — "I'd default to InnoDB for every new table; I'd only keep MyISAM for legacy schemas during migration windows."

SQL
Topic — database
Storage engine SQL drills

Practice →

SQL
Topic — sql
MySQL SQL library

Practice →

3. MySQL indexing strategy — B-tree, composite, covering, prefix

`mysql indexing` — four index shapes that cover every common query pattern

mysql indexing is the second-most-asked mysql interview questions theme. The interview-canonical pattern: name the four shapes (B-tree, composite, covering, prefix), explain when to use each, and reach for EXPLAIN to confirm the planner picked the right one.

Shape 1 — single-column B-tree.

CREATE INDEX idx_orders_created ON orders (created_at) — the bread and butter.
Use case — single-column range or equality predicates (WHERE created_at > '…').
Cost — Θ(log n) lookup; index updates on every INSERT / UPDATE / DELETE of the column.

Shape 2 — composite (multi-column).

CREATE INDEX idx_orders_cust_created ON orders (customer_id, created_at) — two columns.
Leftmost-prefix rule — the index accelerates WHERE customer_id = ?, WHERE customer_id = ? AND created_at > ?, but NOT WHERE created_at > ? alone.
Order matters — put the equality column first, range column second.
mysql interview questions for data engineer loops love this rule.

Shape 3 — covering index (the index-only scan).

An index is "covering" when it contains every column a query needs — both filter and project.
CREATE INDEX idx_orders_cust_amt ON orders (customer_id, amount) — for SELECT amount FROM orders WHERE customer_id = ?.
The planner does an "index-only scan" — skips the heap fetch entirely; can be 10-100× faster.
In MySQL — appears in EXPLAIN as Using index (without Using where; Using index).

Shape 4 — prefix index (MySQL-specific).

CREATE INDEX idx_email_prefix ON users (email(20)) — index the first 20 characters of email.
Why — MySQL has a 767-byte index limit (or 3072 with innodb_large_prefix); long VARCHAR columns force prefix indexing.
Trade-off — uniqueness within the prefix isn't guaranteed; the planner does an extra disk read for full-value comparison.
Common with VARCHAR(255) columns — the canonical MySQL pattern for long-string columns.

mysql index types beyond B-tree.

FULLTEXT — text-search index; MATCH(col) AGAINST('query' IN BOOLEAN MODE).
SPATIAL — for GEOMETRY columns; less common.
HASH — only supported by the MEMORY engine; not InnoDB.

The mysql interview questions and answers for experienced indexing checklist.

Every FK column should be indexed — InnoDB doesn't auto-index FKs (Postgres does); add it manually.
The PK is auto-indexed as a clustered index on InnoDB; secondary indexes store the PK at the leaf.
Sub-query / join columns — index both sides of a join.
LIKE 'prefix%' uses an index; LIKE '%suffix' does not.
Function-wrapped columns defeat the index — WHERE YEAR(d) = 2026 won't use INDEX (d) unless you create a functional index (MySQL 8.0+) or rewrite as WHERE d >= '2026-01-01' AND d < '2027-01-01'.

Common indexing pitfalls.

Over-indexing — every extra index slows down writes by ~5-10%; profile first.
Wrong column order in composite index — equality column second instead of first.
Implicit type coercion — WHERE int_col = '42' may NOT use the int index. See Blog78.
Missing index on the JOIN column — visible in EXPLAIN as ALL (full table scan) instead of ref.

SQL
Topic — indexing
MySQL indexing drills

Practice →

SQL
Topic — database
Index + query patterns

Practice →

4. Query optimization with EXPLAIN — reading the MySQL plan

`mysql explain` — read the access type and you've found the bottleneck

EXPLAIN (or EXPLAIN ANALYZE in 8.0+) is the single most-asked debugging tool in MySQL interviews. The senior signal: knowing the meaning of each type column value and recognising which ones indicate a full-table scan.

The EXPLAIN columns you must know.

id — the query block identifier; one per SELECT / subquery.
select_type — SIMPLE, PRIMARY, SUBQUERY, DERIVED, UNION — the shape of the SELECT.
table — which table this row of the plan refers to.
type — the access method; the most important column (see below).
possible_keys — indexes the planner could use.
key — the index the planner chose (or NULL).
key_len — bytes of the index actually used (low = partial use).
ref — what the index is matched against (constant or column).
rows — estimated rows examined; the smaller the better.
Extra — additional notes; Using index (covering scan), Using filesort (extra sort step), Using temporary (temp table created).

The type column — best to worst.

const — the planner knows it's a single row by PK or unique key lookup; the cheapest.
eq_ref — equality reference via PK / unique key; one row per row of the joined-from table.
ref — equality reference via non-unique index; multiple rows possible.
range — index range scan (e.g. WHERE d BETWEEN x AND y).
index — full index scan (cheaper than ALL because indexes are smaller); avoid if possible.
ALL — full table scan; the worst; usually a missing-index bug.

Reading a slow EXPLAIN — the senior checklist.

type = ALL with rows = 100000+ — missing index on the predicate column. Add one.
Using filesort — the query needs an extra sort step; add an index ordered to match ORDER BY.
Using temporary — the planner is materializing an intermediate result; common with DISTINCT / GROUP BY; consider an index that supports the grouping.
key_len very small — the composite index isn't being fully used; check column order and the WHERE clause.

EXPLAIN ANALYZE (MySQL 8.0+) — runs the query and shows actual timing.

Output — node-by-node breakdown with actual time, actual rows, loops.
Use case — diagnosing the gap between estimated rows and actual rows (a sign of stale statistics).
Run ANALYZE TABLE t — refreshes statistics.

Common EXPLAIN gotchas.

possible_keys is NULL — no index covers the predicate; add one.
key is NULL but possible_keys is not — the planner thinks a table scan is cheaper (small table) or stats are stale.
rows is much larger than actual data size — stale statistics; run ANALYZE TABLE.
Extra includes Using where — the engine is filtering after the index; sometimes unavoidable.

MySQL interview question — diagnose and fix a slow query with EXPLAIN

Assume orders(order_id PRIMARY KEY, customer_id, order_date, amount) with 10M rows, no index on (customer_id, order_date). The query SELECT * FROM orders WHERE customer_id = 42 AND order_date >= '2026-01-01' is slow.

Solution Using composite index + EXPLAIN verification

Code (MySQL).

-- Diagnose
EXPLAIN
SELECT * FROM orders
WHERE customer_id = 42
  AND order_date >= '2026-01-01';

-- Likely shows type = ALL or type = ref with high rows estimate

-- Fix
CREATE INDEX idx_orders_cust_date
    ON orders (customer_id, order_date);

-- Verify
EXPLAIN
SELECT * FROM orders
WHERE customer_id = 42
  AND order_date >= '2026-01-01';

-- Now shows type = range, key = idx_orders_cust_date, rows ≈ small

Step-by-step trace.

step	EXPLAIN cell	before → after
1	`type`	`ALL` → `range`
2	`key`	`NULL` → `idx_orders_cust_date`
3	`rows`	`10000000` → `~1000`
4	`Extra`	`Using where` → `Using index condition`

Why this works — concept by concept:

Composite index (customer_id, order_date) — leftmost prefix matches the equality on customer_id; the second column lets the range on order_date be index-driven.
Column order matters — equality first (customer_id), range second (order_date); reverse would not allow the range to use the index efficiently.
EXPLAIN access-type goes from ALL → range — the rowset shrinks from 10M to ~1k, the query becomes index-scan + heap-fetch.
Using index condition pushdown — MySQL 5.6+ pushes the predicate into the storage-engine layer; further reduces work.
Cost — Θ(log n) to find the leftmost entry + Θ(matched_rows) for the range scan.

SQL
Topic — indexing
EXPLAIN / indexing drills

Practice →

SQL
Topic — database
Query optimization patterns

Practice →

5. MySQL-specific syntax — LIMIT, IFNULL, GROUP_CONCAT, ON DUPLICATE KEY

`mysql syntax` — the operators that mark out a MySQL-fluent candidate

A handful of MySQL-specific operators show up in every mysql interview questions and answers roster. Knowing them by name and by use case is the difference between a candidate who claims MySQL experience and one who has actually shipped MySQL code.

LIMIT N [OFFSET M] — MySQL's pagination operator.

Signature — SELECT … LIMIT 10 (first 10), SELECT … LIMIT 10 OFFSET 20 (rows 21-30), shorthand LIMIT 20, 10.
Use case — pagination, top-N queries.
The deep-pagination trap — LIMIT 100000, 10 scans 100,010 rows; rewrite as WHERE id > <last_seen> LIMIT 10.
mysql limit offset vs SQL Server OFFSET FETCH — the same idea in different dialects.

IFNULL(expr, fallback) — MySQL's null-coalescing.

Signature — IFNULL(<expr>, <fallback>) returns <fallback> when <expr> is NULL.
ANSI equivalent — COALESCE(expr, fallback) (Postgres / standard); both work in MySQL.
Use case — IFNULL(SUM(amount), 0) to return 0 instead of NULL for empty groups (Blog72 lesson).

GROUP_CONCAT(expr [ORDER BY …] [SEPARATOR sep]) — string aggregate.

Signature — GROUP_CONCAT(col ORDER BY col SEPARATOR ', ').
MySQL's equivalent of STRING_AGG (Postgres / SQL Server) and LISTAGG (Oracle / Snowflake).
Length limit — group_concat_max_len (default 1024); raise it for long aggregations.
Use case — "comma-separated list of orders per customer" reports.

ON DUPLICATE KEY UPDATE — MySQL's UPSERT.

Signature — INSERT INTO t (cols) VALUES (…) ON DUPLICATE KEY UPDATE col = VALUES(col) (legacy) or … AS new ON DUPLICATE KEY UPDATE col = new.col (8.0.20+).
Trigger — any UNIQUE / PRIMARY KEY violation.
The MySQL idempotent-ingest workhorse — see Blog79 §5.
Equivalent on other dialects — INSERT … ON CONFLICT DO UPDATE (Postgres), MERGE INTO (SQL Server / Oracle / Snowflake).

Date and string operators — the MySQL flavour.

CURDATE() / CURTIME() / NOW() — current date / time / timestamp.
DATE_FORMAT(d, '%Y-%m-%d') — date-to-string (see Blog74 §4).
STR_TO_DATE(s, '%d/%m/%Y') — string-to-date.
DATE_ADD(d, INTERVAL 7 DAY) / DATE_SUB(d, INTERVAL 30 DAY) — date arithmetic (Blog74 §7).
CONCAT(a, b, c) / CONCAT_WS(sep, a, b) — string concatenation.
SUBSTRING_INDEX(s, delim, n) — split-and-pick a delimited substring.

IF(cond, true_val, false_val) — MySQL's ternary.

Equivalent to CASE WHEN cond THEN true_val ELSE false_val END; MySQL-specific shorthand.
IFNULL is the NULL-specific subset.
CASE WHEN is more portable — use it in cross-engine codebases.

Common MySQL pitfalls.

Pre-8.0 lacks window functions — ROW_NUMBER(), RANK(), LAG() only in 8.0+; older codebases simulate with user variables (@row := @row + 1).
Pre-8.0 lacks CTEs — WITH cte AS (…) since 8.0; older code uses derived tables.
GROUP BY non-strict mode — pre-MySQL 5.7.5 silently allowed selecting non-grouped columns; modern ONLY_FULL_GROUP_BY enforces ANSI (see Blog73 §2).
ENUM and SET types — MySQL-specific; portability nightmare; prefer VARCHAR with a CHECK constraint.

SQL
Topic — sql
MySQL syntax drills

Practice →

SQL
Topic — aggregation
GROUP_CONCAT + aggregation

Practice →

6. Transactions and the InnoDB lock manager

`mysql transactions` — REPEATABLE READ is the default, and gap locks are real

mysql transactions follow the same BEGIN / COMMIT / ROLLBACK shape as every other engine (Blog79 §6), but the default isolation level differs from most other databases — MySQL InnoDB defaults to REPEATABLE READ, while PostgreSQL, SQL Server, and Oracle default to READ COMMITTED. This single-line difference is the source of more "why does this deadlock?" interview questions than any other MySQL quirk.

The four isolation levels in MySQL.

READ UNCOMMITTED — sees uncommitted writes from others ("dirty reads"); rarely used.
READ COMMITTED — sees only committed writes; matches Postgres / SQL Server / Oracle defaults.
REPEATABLE READ — MySQL InnoDB default; same query inside the transaction always returns the same rows.
SERIALIZABLE — strongest; every read becomes a SELECT … LOCK IN SHARE MODE.

REPEATABLE READ + gap locks — the senior-signal interview topic.

Gap locks — InnoDB at REPEATABLE READ locks the gap between index entries that match the predicate, not just the matching rows.
Purpose — prevent phantom reads (new rows appearing in a re-query within the same transaction).
Cost — concurrent inserts that would land in the locked gap block until the transaction commits.
The classic deadlock — two transactions each lock different gaps, then try to insert into each other's gap.
Fix — lower the isolation level to READ COMMITTED (no gap locks) when phantom reads aren't a concern.

SELECT … FOR UPDATE and LOCK IN SHARE MODE.

SELECT … FOR UPDATE — acquires an exclusive row lock; blocks other readers and writers until commit.
SELECT … LOCK IN SHARE MODE — acquires a shared row lock; allows other readers, blocks writers.
Use case — read-modify-write patterns; row reservation for processing.

InnoDB lock granularity.

Row-level (the default) — only the affected row is locked.
Gap lock — at REPEATABLE READ, also the gap before the row.
Next-key lock — row lock + gap lock combined; the default for range predicates.
Table-level — escalated for DDL (ALTER TABLE) and certain admin operations.

Deadlock detection.

InnoDB has a deadlock detector — when two transactions wait for each other in a cycle, one is automatically rolled back with ERROR 1213: Deadlock found.
The application must retry — deadlocks are normal; production code wraps DML in a retry loop with exponential backoff.
Reduce deadlock frequency by accessing rows in a consistent order across all code paths.

autocommit — the silent default.

MySQL has autocommit = 1 by default — every statement is its own transaction.
Disable — SET autocommit = 0 or START TRANSACTION explicitly.
Common bug — running multiple statements expecting them to be one transaction without START TRANSACTION (or BEGIN).

Transactions on DDL — the implicit-commit trap.

DDL statements (CREATE, ALTER, DROP, TRUNCATE) issue an implicit commit before and after — wrapping them in START TRANSACTION doesn't make them rollback-able.
TRUNCATE cannot be rolled back in MySQL InnoDB.
The fix — back up before destructive DDL; use DELETE if rollback safety is required.

SQL
Topic — database
MySQL transactions drills

Practice →

SQL
Topic — data-manipulation
Transactional CRUD patterns

Practice →

7. JSON columns, partitioning, and replication — senior-level themes

`mysql interview questions and answers for experienced` — three themes that mark senior loops

Senior MySQL loops add three themes on top of the basics: JSON column patterns since MySQL 5.7, partitioning for very large tables, and replication topology (read replicas, source-replica, GTIDs, lag).

JSON columns since MySQL 5.7.

JSON type — stores JSON documents; engine validates the syntax on write.
-> and ->> operators — path extraction; -> returns JSON, ->> returns text.
JSON_EXTRACT(col, '$.path') — equivalent function form.
Indexing — create generated columns from JSON paths and index those: created_at_gen TIMESTAMP AS (col->>'$.created_at') STORED, INDEX (created_at_gen).
vs PostgreSQL JSONB — Postgres JSONB is indexed natively (GIN); MySQL JSON requires generated columns.

Partitioning — for very large fact tables.

PARTITION BY RANGE (col) — split a single logical table into multiple physical partitions by a column's range; e.g. one partition per month of created_at.
PARTITION BY LIST (col) — explicit value list per partition.
PARTITION BY HASH (col) / KEY (col) — hash-based partitioning for even distribution.
Use case — pruning during queries (WHERE created_at >= '2026-05-01' only reads the May partition), faster DROP PARTITION for retention.
Constraint — partitioning columns must be part of every UNIQUE / PRIMARY KEY index.

Replication — async source-replica.

Statement-based, row-based, or mixed — binlog_format setting; row-based is the modern default.
GTIDs (Global Transaction IDs) — since MySQL 5.6; uniquely identify each transaction across the cluster.
Read replicas — async; the lag is the source of "I just wrote this row, why isn't it visible?" bugs.
Failover — promote a replica to source; needs careful coordination.

Replication lag — the silent reader bug.

The bug — application writes to source, immediately reads from replica, replica hasn't caught up.
Symptom — "I just inserted this row, why don't I see it?".
Fix — read from source for the user that just wrote (read-your-writes); or wait until the replica catches up (SHOW SLAVE STATUS / SHOW REPLICA STATUS).

MySQL 8.0 — the modern feature set.

Window functions — ROW_NUMBER(), RANK(), LAG(), LEAD(); finally arrived in 8.0 (see Blog70).
CTEs — WITH cte AS (…) and recursive CTEs.
Invisible indexes — ALTER TABLE t ALTER INDEX idx INVISIBLE lets you test dropping an index without actually dropping it.
JSON path schema validation — CHECK (JSON_SCHEMA_VALID('{…}', col)).
SET PERSIST — dynamic config changes that survive restart.

Performance tuning — the canonical menu.

innodb_buffer_pool_size — the InnoDB cache; set to ~70% of system RAM on dedicated DB servers.
innodb_log_file_size — redo log size; larger = fewer checkpoints, faster writes, slower recovery.
query_cache — removed in MySQL 8.0; was a scalability bottleneck.
slow_query_log — log queries above a threshold; the first stop for "why is the app slow?".

SQL
Topic — database
Senior MySQL drills

Practice →

SQL
Topic — etl
ETL + replication patterns

Practice →

Choosing the right MySQL pattern (cheat sheet)

A one-screen cheat sheet for the most-asked mysql interview questions and answers patterns.

You want to …	MySQL pattern	Notes
Pick a storage engine	`InnoDB`	Default since 5.5; transactional; row-level locks; FK support
Add a composite index	`CREATE INDEX idx ON t (a, b)`	Equality first, range second
Index a long VARCHAR	`INDEX (col(20))` (prefix index)	Within the 767-byte / 3072-byte limit
Diagnose a slow query	`EXPLAIN SELECT …`	Check `type`, `key`, `rows`, `Extra`
Paginate results	`LIMIT 10 OFFSET 20`	Beware deep-offset cost; use keyset pagination
Replace NULL with a fallback	`IFNULL(col, 0)`	Or `COALESCE(col, 0)` for portability
Comma-join values per group	`GROUP_CONCAT(col ORDER BY col SEPARATOR ', ')`	Raise `group_concat_max_len` if needed
Idempotent insert (upsert)	`INSERT … ON DUPLICATE KEY UPDATE …`	Or `INSERT IGNORE` for skip-on-conflict
Add 7 days to a date	`DATE_ADD(d, INTERVAL 7 DAY)`	Or `d + INTERVAL 7 DAY`
Format a date	`DATE_FORMAT(d, '%Y-%m-%d')`	%Y year, %m month, %d day
Parse a date string	`STR_TO_DATE(s, '%d/%m/%Y')`	Pattern matches the input
Lock a row for update	`SELECT … FROM t WHERE id = ? FOR UPDATE`	Inside a transaction
Avoid gap locks	`SET TRANSACTION ISOLATION LEVEL READ COMMITTED`	Drop from default `REPEATABLE READ`
JSON path extract	`col->>'$.field'`	`->>` returns text; `->` returns JSON

Frequently asked questions

What's the difference between InnoDB and MyISAM in MySQL?

InnoDB (the default since MySQL 5.5) is a fully transactional, ACID-compliant engine with row-level locking, foreign-key enforcement, MVCC, and crash recovery via a write-ahead log. MyISAM is a legacy non-transactional engine with table-level locking, no foreign-key support, and minimal crash recovery — it was the default before 5.5 and persists in legacy schemas. The canonical interview answer to innodb vs myisam: choose InnoDB for every new table; only keep MyISAM for legacy read-only / read-mostly schemas where migration is risky or for tables that genuinely don't need transactions. Modern MySQL (8.0) deprecates MyISAM for system tables and continues to invest only in InnoDB.

How do I write an UPSERT in MySQL?

MySQL uses INSERT … ON DUPLICATE KEY UPDATE. The legacy syntax is INSERT INTO t (col1, col2) VALUES (v1, v2) ON DUPLICATE KEY UPDATE col2 = VALUES(col2); since MySQL 8.0.20 the modern form is INSERT INTO t (col1, col2) VALUES (v1, v2) AS new ON DUPLICATE KEY UPDATE col2 = new.col2. The clause fires whenever any UNIQUE or PRIMARY KEY constraint would be violated, performing the UPDATE instead of erroring. For "insert if new, skip if exists" use INSERT IGNORE (silent skip on any constraint violation, including errors you might not want to ignore — use with care). The portable cross-dialect equivalents are INSERT … ON CONFLICT DO UPDATE (PostgreSQL) and MERGE INTO (SQL Server, Oracle, Snowflake) — see Blog79 §5 for the full upsert comparison.

How do I optimize a slow MySQL query?

Start with EXPLAIN SELECT … — read the type, key, rows, and Extra columns. The most-common fixes: (1) type = ALL means a full table scan — add an index on the predicate column; (2) Using filesort means an extra sort step — add an index ordered to match the ORDER BY; (3) Using temporary means the planner materializes an intermediate result — usually for GROUP BY / DISTINCT; consider an index that supports the grouping; (4) rows much larger than the actual result size — stale statistics; run ANALYZE TABLE t. For composite-index queries, confirm the column order matches the leftmost-prefix rule: equality first, range second. Use EXPLAIN ANALYZE (8.0+) to see actual vs estimated rows and node-by-node timing.

What's the difference between WHERE and HAVING in MySQL?

This is one of the most-asked mysql interview questions of all. WHERE filters rows before GROUP BY collapses them — it cannot reference aggregates (SUM, COUNT, AVG) because they don't exist yet at that stage of execution. HAVING filters groups after GROUP BY — it's the only place you can predicate on aggregates of the current grouping. The canonical pattern is WHERE for row-level filters (cheap pushdown, indexes usable) and HAVING for aggregate filters (HAVING SUM(amount) > 1000). MySQL's lenient GROUP BY semantics pre-ONLY_FULL_GROUP_BY allowed non-aggregate columns in SELECT even when not in GROUP BY, but modern MySQL (5.7.5+) enforces the ANSI rule — see Blog73 §4 for the full breakdown.

What's MySQL's default isolation level and why does it matter?

MySQL InnoDB's default transaction isolation level is REPEATABLE READ — different from PostgreSQL, SQL Server, and Oracle, which default to READ COMMITTED. At REPEATABLE READ, the same query inside a transaction always returns the same rows (no phantom reads), achieved via gap locks — InnoDB locks not just the matching rows but also the gaps between index entries. The trade-off: stronger consistency, but more concurrent-write deadlocks because gap locks block inserts into the locked range. For workloads where phantom reads aren't a concern (most analytics-style apps), lower the isolation to READ COMMITTED (SET TRANSACTION ISOLATION LEVEL READ COMMITTED) — gap locks disappear, concurrent inserts run free, deadlock frequency drops. The choice between REPEATABLE READ and READ COMMITTED is a senior mysql interview questions and answers for experienced topic that interviewers love to probe.

Practice on PipeCode

PipeCode ships 450+ data-engineering interview problems — including MySQL-flavoured SQL practice keyed to InnoDB indexing, EXPLAIN-based query optimization, LIMIT / IFNULL / GROUP_CONCAT / ON DUPLICATE KEY UPDATE, transactional patterns, and the dialect quirks that fail candidates who only know one engine. Whether you're drilling mysql interview questions for freshers or grinding mysql interview questions and answers for experienced, the practice library mirrors the same six-theme mental model this guide teaches.

Kick off via Explore practice →; drill the SQL practice lane →; fan out into the database lane →; reinforce indexing SQL drills →; rehearse aggregation SQL drills →; widen coverage on the full SQL practice library →.

DEV Community

MySQL Interview Questions & Answers: Top Patterns for Data Engineers

1. Why MySQL interview questions test the same six themes

MySQL interviews stay focused — and the six-theme map covers 90% of the prompts

Worked example — a single MySQL query that hits four of the six themes

2. InnoDB vs MyISAM — storage engines that reviewers compare

`innodb vs myisam` — the #1 most-asked MySQL interview question

3. MySQL indexing strategy — B-tree, composite, covering, prefix

`mysql indexing` — four index shapes that cover every common query pattern

4. Query optimization with EXPLAIN — reading the MySQL plan

`mysql explain` — read the access type and you've found the bottleneck

MySQL interview question — diagnose and fix a slow query with EXPLAIN

Solution Using composite index + EXPLAIN verification

5. MySQL-specific syntax — LIMIT, IFNULL, GROUP_CONCAT, ON DUPLICATE KEY

`mysql syntax` — the operators that mark out a MySQL-fluent candidate

6. Transactions and the InnoDB lock manager

`mysql transactions` — REPEATABLE READ is the default, and gap locks are real

7. JSON columns, partitioning, and replication — senior-level themes

`mysql interview questions and answers for experienced` — three themes that mark senior loops

Choosing the right MySQL pattern (cheat sheet)

Frequently asked questions

What's the difference between InnoDB and MyISAM in MySQL?

How do I write an UPSERT in MySQL?

How do I optimize a slow MySQL query?

What's the difference between WHERE and HAVING in MySQL?

What's MySQL's default isolation level and why does it matter?

Practice on PipeCode

Top comments (0)

1. Why MySQL interview questions test the same six themes

MySQL interviews stay focused — and the six-theme map covers 90% of the prompts

Worked example — a single MySQL query that hits four of the six themes

2. InnoDB vs MyISAM — storage engines that reviewers compare

innodb vs myisam — the #1 most-asked MySQL interview question

3. MySQL indexing strategy — B-tree, composite, covering, prefix

mysql indexing — four index shapes that cover every common query pattern

4. Query optimization with EXPLAIN — reading the MySQL plan

mysql explain — read the access type and you've found the bottleneck

MySQL interview question — diagnose and fix a slow query with EXPLAIN

Solution Using composite index + EXPLAIN verification

5. MySQL-specific syntax — LIMIT, IFNULL, GROUP_CONCAT, ON DUPLICATE KEY

mysql syntax — the operators that mark out a MySQL-fluent candidate

6. Transactions and the InnoDB lock manager

mysql transactions — REPEATABLE READ is the default, and gap locks are real

7. JSON columns, partitioning, and replication — senior-level themes

mysql interview questions and answers for experienced — three themes that mark senior loops

Choosing the right MySQL pattern (cheat sheet)

Frequently asked questions

What's the difference between InnoDB and MyISAM in MySQL?

How do I write an UPSERT in MySQL?

How do I optimize a slow MySQL query?

What's the difference between WHERE and HAVING in MySQL?

What's MySQL's default isolation level and why does it matter?

Practice on PipeCode

`innodb vs myisam` — the #1 most-asked MySQL interview question

`mysql indexing` — four index shapes that cover every common query pattern

`mysql explain` — read the access type and you've found the bottleneck

`mysql syntax` — the operators that mark out a MySQL-fluent candidate

`mysql transactions` — REPEATABLE READ is the default, and gap locks are real

`mysql interview questions and answers for experienced` — three themes that mark senior loops