DEV Community

Satej Bidvai
Satej Bidvai

Posted on

Why ACID properties matter in database transactions

👋 Hey! You might have come across the term 'ACID properties' in your studies. While it's commonly regarded as a topic to mug up for interviews 😶, there's actually a lot more to understand.

I have tried to provide an overview and touch upon some key concepts that will help you understand the significance of these properties for a database system ⚡️.


What is a Transaction? 💰

  • A transaction is a group of queries that together make up a unit of work in a database system.
  • Eg. Transferring funds from one bank account to another.

    (Query 1) Deduct money from the source account
    (Query 2) Add money to the destination account


Atomicity 🎯

  • A transaction should be all or nothing - if any part of the transaction fails, the entire transaction should fail and roll back 🚫

Consistency 🤝

  • The correctness of data should be maintained both before and after a transaction is completed ✅
  • There are different types of consistency in database systems. Some of them are:
    1. Data Consistency: This is the correctness of the actual data stored in the database. The data should follow a set of rules and constraints. This can be achieved via foreign key constraints, unique constraints, data type constraints, etc.
    2. Application Consistency: This is the consistency for the application using the database. It refers to the data displayed to the users, data passed between different systems, etc. This ensures that the application’s output is always correct and reliable.
    3. Eventual Consistency: Data stored across multiple nodes and servers will eventually have the same value, but may experience temporary inconsistencies. This is because when data is modified, the change needs to be communicated to all copies of the data.

Isolation 🚧

  • Transactions should be executed in isolation from one another, so that each transaction appears to be the only one executing on the database at a given time 👥
  • Depending on the isolation level, a transaction may read uncommitted changes which may lead to inconsistencies. Some of these read phenomena are:
    1. Dirty Reads: When a transaction reads data that has not been committed yet. This creates a problem when the uncommitted data is rolled back.
    2. Non-Repeatable Read: When a transaction reads the same piece of data multiple times and gets different results each time. This may occur due to modification of data between the reads (Unlike dirty reads, the data has been committed).
    3. Phantom Read: When a transaction reads rows satisfying a condition (e.g. 100 < balance < 200), but a second transaction inserts or deletes a row that also meets the condition, the result may be inconsistent when the same set of rows is read again.
    4. Lost Updates: When multiple transactions attempt to update the same piece of data concurrently, the updates may be overridden and the data may be lost.
  • To fix these read phenomena, various isolation levels are used in database systems:
    1. Read Uncommitted: Transactions can read all changes, even those that have not been committed yet.
    2. Read Committed: Transactions can only read committed changes. This solves the “dirty read” phenomena.
    3. Repeatable Read: Transactions can read the same piece of data multiple times and get the same results each time. This solves the “non-repeatable read” phenomena, but “phantom reads” may still occur.
    4. Snapshot: Transactions read data from a snapshot of the database taken at the start of each transaction. This prevents any modifications by other transactions to be visible in the current transaction. This solves all read phenomena.
    5. Serializable: Concurrent transactions are executed in a manner as if they were executed serially, one after the other. This is the highest isolation level.

Durability 💪

  • The changes made to the database after a successfull transaction should be persisted even in case of system failures 📉
  • Durability can be achieved by WAL (Write Ahead Logging). In this method, the changes to the database are written into a log file before writing to the disk.
  • When data needs to be written to a disk, the Operating System (OS) stores it in cache first, then does a bulk write to reduce the number of I/O operations. However, this approach is risky for database systems. To guarantee durability, databases use the fsync() system call, which forces a write to the disk instead of the OS cache.

Top comments (0)