This article was originally published on bmf-tech.com.
Overview
This post summarizes transaction anomalies in MySQL. The MySQL version assumed is 8 series.
Test Environment
The environment used for testing is prepared with docker-compose. (Although it's just one container, so you don't necessarily need to use compose...)
.
├── docker-compose.yml
└── initdb.d
└── 1_schema.sql
docker-compose.yml
version: '3'
services:
mysql:
image: mysql:8.0.33
environment:
MYSQL_ROOT_PASSWORD: password
MYSQL_DATABASE: example
TZ: "Asia/Tokyo"
command: mysqld
ports:
- 3306:3306
volumes:
- ./initdb.d:/docker-entrypoint-initdb.d
1_schema.sql
CREATE DATABASE IF NOT EXISTS example;
DROP TABLE IF EXISTS `users`;
CREATE TABLE `users` (
`id` int(11) UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
`name` varchar(255) NOT NULL UNIQUE
) ENGINE = InnoDB DEFAULT CHARSET = utf8;
You can prepare a MySQL 8 series container with docker compose up.
Transaction Isolation Levels
MySQL's InnoDB provides four transaction isolation levels as defined by the ANSI/ISO SQL standard.
| Isolation Level | Dirty Read | Inconsistent Read | Lost Update | Phantom Read |
|---|---|---|---|---|
| READ UNCOMMITTED | ○ | ○ | ○ | ○ |
| READ COMMITTED | × | ○ | ○ | ○ |
| REPEATABLE READ※1 | × | × | ○ | ○※ |
| SERIALIZABLE | × | × | × | × |
※1 REPEATABLE READ is the default in MySQL.
※2 Although marked as ○ above, MySQL is designed to prevent phantom reads in REPEATABLE READ.
The transaction isolation level ranges from READ UNCOMMITTED, the lowest, to SERIALIZABLE, the highest. The above table is ordered from lowest to highest. Generally, the higher the isolation, the lower the performance tends to be.
For more on transactions, see Transaction Overview.
Anomalies
Let's reproduce transaction anomalies in MySQL.
An anomaly refers to "unexpected results or inconsistencies arising from transaction isolation levels or processing order."
There are anomalies defined by ANSI SQL standards or ISO/IEC 9075, and there are various others besides those discussed here.
Inconsistent read is not defined by those standards. (I couldn't find where it is defined...)
Transactions are denoted as TX. Numbers are used to distinguish multiple transactions (e.g., TX1, TX2).
Dirty Read
Dirty read is a phenomenon where TX1 reads data from TX2 before TX2 commits.
Test
All sessions are conducted with READ UNCOMMITTED.
mysql> SET SESSION TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
- Start transactions in TX1 and TX2
// TX1
mysql> START TRANSACTION;
// TX2
mysql> START TRANSACTION;
- Add data in TX2
// TX2
mysql> INSERT INTO users(name) VALUES('foo');
Data is added in TX2, but not committed.
- Read data again in TX1
// TX1
mysql> SELECT * FROM users; // 1 row in set
TX1 reads data from TX2 before TX2 commits.
Inconsistent Read
Inconsistent read is a phenomenon where the data being read lacks consistency.
Refer to Various Anomalies#Inconsistent Read Anomaly.
I wasn't sure about the exact definition, so my understanding might be questionable...
Since it's about inconsistency after commit, inconsistent read seems like a higher concept than fuzzy read or phantom read?? But strictly, it should be different...
Test
All sessions are conducted with READ UNCOMMITTED.
mysql> SET SESSION TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
- Start transaction and read data in TX1
// TX1
mysql> START TRANSACTION;
mysql> SELECT * FROM users; // Empty set
- Start transaction and add data in TX2
// TX2
mysql> START TRANSACTION;
mysql> INSERT INTO users(name) VALUES('foo');
mysql> COMMIT;
- Read data again in TX1
// TX1
mysql> SELECT * FROM users; // 1 row in set
The result differs from the initial read, confirming inconsistency due to TX2's actions.
Fuzzy Read (Non-repeatable Read)
Fuzzy read is a phenomenon where TX1 can reference data updated by another TX2.
All sessions are conducted with READ COMMITTED.
mysql> SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
Test
- Start transaction and read data in TX1
// TX1
mysql> INSERT INTO users(name) VALUES('foo'); // Initial data entry
mysql> START TRANSACTION;
mysql> SELECT * FROM users; // 1 row in set
Initial data entry result.
+-----+------+
| id | name |
+-----+------+
| 1 | foo |
+-----+------+
- Start transaction and read data in TX2
// TX2
mysql> START TRANSACTION;
mysql> UPDATE users SET name = 'bar' WHERE id = 1;
mysql> COMMIT;
mysql> SELECT * FROM users; // 1 row in set
Update is complete.
+-----+------+
| id | name |
+-----+------+
| 1 | bar |
+-----+------+
- Read data again in TX1
// TX1
mysql> SELECT * FROM users; // 1 row in set
TX1's read result changes due to TX2's commit.
+-----+------+
| id | name |
+-----+------+
| 1 | bar |
+-----+------+
Phantom Read
Phantom read is a phenomenon where data read by TX1 changes if TX2 commits an addition or deletion. Fuzzy read involves updates, while phantom read involves additions or deletions.
Test
All sessions are conducted with READ COMMITTED.
mysql> SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
- Start transaction and read data in TX1
// TX1
mysql> START TRANSACTION;
mysql> SELECT * FROM users; // Empty set
- Add data and commit in TX2
// TX2
mysql> START TRANSACTION;
mysql> INSERT INTO users(name) VALUES('foo');
mysql> COMMIT;
mysql> SELECT * FROM users;
Addition is complete.
+-----+------+
| id | name |
+-----+------+
| 1 | foo |
+-----+------+
- Read data again in TX1
// TX1
mysql> SELECT * FROM users; // 1 row in set
TX1's read result changes due to TX2's commit.
+-----+------+
| id | name |
+-----+------+
| 1 | foo |
+-----+------+
Lost Update
Lost update is a phenomenon where a conflict occurs when TX1 and TX2 update the same data, resulting in some updates being lost.
Test
All sessions are conducted with REPEATABLE READ.
mysql> SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;
- Start transaction and read data in TX1
// TX1
mysql> INSERT INTO users(name) VALUES('foo'); // Initial data entry
mysql> START TRANSACTION;
mysql> SELECT * FROM users; // 1 row set
- Start transaction and read data in TX2
// TX2
mysql> START TRANSACTION;
mysql> SELECT * FROM users; // 1 row set
- Update data in TX1 and TX2
// TX1
mysql> UPDATE users SET name = 'tx1' WHERE id = 1;
// TX2
mysql> UPDATE users SET name = 'tx2' WHERE id = 1;
- Commit TX1 and TX2
// TX1
mysql> COMMIT;
// TX2
mysql> COMMIT;
- Read data
mysql> SELECT * FROM users; 1 row set
TX1's commit is lost, and TX2's commit is reflected.
+-----+------+
| id | name |
+-----+------+
| 1 | tx2 |
+-----+------+
Summary
The anomalies that occur vary depending on the transaction isolation level.
Anomalies are patterns where data reading and consistency change before and after commit.
To learn more about transaction anomalies, it might be better to refer to a book or something related to transactions.
Top comments (0)