Database Transactions
A database transaction is a logical unit of work comprising one or more operations. It ensures that all statements are executed completely; in the event of a failure, any partial changes are rolled back to maintain data integrity.
Transactions are not limited to write operations; they can also be read only, where only data retrieval (e.g., SELECT queries) is performed. Regardless of whether the operation is read or write, SQL Server treats each statement as its own transaction by default, committing it if successful or rolling it back if it fails.
-- *A basic MSSQL transaction example that ensures an order is fully created, or rolls back all changes if any system or user generated error occurs.*
-- READ COMMITTED is the default isolation level in SQL Server; setting it is optional and shown for clarity
-- Higher isolation levels are not required for this single-row atomic UPDATE (no range or read-before-write)
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
-- Transaction with TRY...CATCH
BEGIN TRY
BEGIN TRANSACTION;
-- To prevent "hangs" during simultaneous checkouts, you can use WITH (ROWLOCK, READPAST)
-- to skip rows currently locked by other users instead of waiting.
-- Note: Production logic may need to handle inventory shortages in a single row by iterating across multiple rows to fulfill the total quantity requested.
UPDATE TOP (1) Inventory
SET Quantity = Quantity - 1
WHERE ProductId = 101 AND Quantity > 0
ORDER BY ExpiryDate ASC; -- Deduct from oldest stock first
-- Check if update happened
IF @@ROWCOUNT = 0
THROW 50001, 'Out of stock', 1;
-- Create order
INSERT INTO Orders (ProductId, OrderDate)
VALUES (101, GETDATE());
COMMIT TRANSACTION;
END TRY
BEGIN CATCH
IF @@TRANCOUNT > 0
ROLLBACK TRANSACTION;
PRINT 'Error: ' + ERROR_MESSAGE();
END CATCH;
ACID Compliance
ACID compliance means that a database follows a set of rules to ensure transactions are processed reliably and safely. An ACID-compliant database guarantees that transactions behave correctly, data remains accurate, and failures or crashes do not corrupt data.
The 4 ACID properties as are follows -
Atomicity
Atomicity Ensures that a transaction does not leave the database in a partial state. Either all operations are successfully applied, or none are, and any intermediate changes are rolled back in case of failure.
The core of Atomicity is the Transaction Log (or Write-Ahead Log). Before any data is actually changed in the main database files, the system records the intended changes in a log.
Commit: If every step in the transaction block finishes without error, the database "commits," making the changes permanent.
Rollback: If a single step fails (due to a power outage, a crash, or a constraint violation), the database uses the log to "undo" any partial changes, returning the data to its exact state before the transaction started.
Example -
Imagine a customer placing an order for 5 units of a specific product. To complete this transaction accurately and prevent overselling, the system must perform two distinct operations in a strict sequence:
Inventory Deduction: Subtract 5 units from the
Productstable to secure the stock for this specific customer.Order Creation: Insert a new record into the
Orderstable to officially document the purchase and customer details.
The absence of Atomicity in this scenario can lead to critical data discrepancies, such as "Missing Stock," where a system crash occurring after the Inventory Deduction but before the Order Creation results in 5 units being removed from the available inventory without a corresponding order record to account for them. This failure leaves the database in an inconsistent state where your digital records reflect a lower stock level than what is physically present in the warehouse.
Consistency
Consistency ensures that a transaction brings the database from one valid state to another. All data written must follow the rules and constraints defined by the schema, such as data types, domain constraints, and referential integrity. If any operation violates these rules, the transaction is rejected or rolled back to prevent invalid data.
Examples:
Unique Constraints: A transaction must fail if it attempts to register a user with an email address that already exists in the database.
Foreign Keys: An order cannot be saved if it references a
CustomerIDthat does not exist in the Customers table.Data Types: An update will be rejected if it tries to insert a string of text into a column defined strictly for integers.
Balance Minimums: A withdrawal transaction must roll back if the resulting account balance would drop below the allowed minimum of $0. While databases can enforce simple checks, ensuring a withdrawal doesn't drop a balance below $0 is typically a matter of Application Logic enforced by the developer.
Apart from constraints, for a transaction to be consistent, it must not leave the database in an an inconsistent or corrupted state. To ensure this, transactions must be atomic—meaning all parts of the transaction succeed or the entire operation is rolled back—and isolated from each other, so that concurrent operations do not interfere with each other.
Isolation
Isolation refers to the ability to concurrently process multiple transactions in a way that one does not affect another. Changes occurring in a particular transaction will not be visible to any other transaction until that particular change in that transaction is written to memory or has been committed.
Database Transaction Phenomena
Database phenomena occur when concurrent transactions—multiple actions happening at once—interact in ways that lead to inconsistent or "incorrect" data. The degree to which a database prevents these phenomena is what defines its Isolation Level.
The four primary phenomena are:
- Dirty Read : A Dirty read is a situation when a transaction reads data that has not yet been committed.
- Non-Repeatable Read : A Non-Repeatable Read Occurs when a transaction reads the same row multiple times and gets different values. Even though each read returns only committed data, if another transaction modifies and commits the data between the reads, the subsequent read will return a different value.
- Phantom Read: A phantom read occurs when a transaction executes a query that reads rows matching a specific condition. If, between subsequent reads, another transaction inserts or deletes rows that meet that same condition, the result set changes. As a result, the transaction encounters "phantom" rows that were not present—or have disappeared—since the initial read.
- Lost Update: A lost update occurs when two concurrent transactions read the same record and then both attempt to update it based on the value they originally read. Because the second transaction is unaware of the first transaction's changes, it overwrites the first update, causing that data to be permanently lost.
Isolation Levels
Based on these phenomena, The SQL standard defines four isolation levels :
- Read Uncommitted : Read Uncommitted is the lowest level of transaction isolation. In this level, a transaction can read data modifications made by other transactions even before they are committed, leading to Dirty Reads. At this level, transactions are effectively not isolated from one another. Consequently, it provides no protection against Dirty Reads, Non-Repeatable Reads, or Phantom Reads, and it remains highly susceptible to Lost Updates.
- Read Committed : Read Committed is the default isolation level for many popular database systems, such as PostgreSQL and SQL Server. It guarantees that any data read by a transaction has been fully committed at the moment it is read, effectively preventing Dirty Reads. Under this level, the system ensures that a transaction never sees "in-progress" or uncommitted changes from others. However, because Read Committed uses short-lived read (shared) locks that are released as soon as the data is read, other transactions are free to update or delete those rows immediately after the read operation completes. Consequently, this level does not prevent Non-Repeatable Reads or Phantom Reads. It also remains susceptible to Lost Updates because a transaction can read a value and later overwrite it without realizing another transaction modified it in between. A standard way to prevent lost updates without needing higher isolation levels is through Atomic Updates, which handle the logic in a single SQL statement.
-- Atomic update that prevents lost updates and enforces business logic in one step
UPDATE Inventory
SET Quantity = Quantity - 5
WHERE ProductId = 101
AND Quantity >= 5; -- The WHERE clause ensures Application Consistency (no negative stock)
- Repeatable Read is a highly restrictive isolation level that ensures any data you read once will remain exactly the same if you read it again within the same transaction. To achieve this, the transaction typically holds Long-lived Shared Locks on all referenced rows until the transaction commits. This prevents other transactions from updating or deleting those rows until the transaction completes. This effectively eliminates Dirty Reads and Non-Repeatable Reads and helps prevent Lost Updates. However, because it does not lock all possible relevant rows, Phantom Reads are still possible.
-- This locks the existing rows (ProductIDs 1, 2, 3) for CategoryID 5, preventing updates or deletes.
-- It does NOT lock the "index range," so others can still insert a new ProductID 4 into that category.
SELECT * FROM Products WHERE CategoryID = 5;
- Serializable : Serializable is the highest and most restrictive isolation level. It guarantees that the final result of concurrent transactions is exactly the same as if they had been executed serially (one after another). However, in practice, the database does not literally stop all other queries to run them one by one. Instead, the database uses Range Locking (or Predicate Locking) to "lock the territory" of a query. This doesn't just protect the rows you can see; it extends protection to the entire range defined by your query's criteria. This prevents any other transaction from inserting, updating, or deleting any data, including data that doesn't exist yet that would fall within that range until your transaction is complete. While it provides the strongest guarantee of data integrity, it comes with a significant performance cost because it locks entire ranges of data.
- Snapshot Isolation : Snapshot Isolation ensures that each query in a transaction sees only the data that was committed at the start of that transaction. It functions like a consistent "snapshot" or version of the database captured at a specific point in time. Snapshot isolation is frequently used to critique the ANSI SQL-92 standard's definition of isolation levels as it exhibits none of the three primary 'anomalies' prohibited by the SQL standard (Dirty Reads, Non-Repeatable Reads, and Phantom Reads), without incurring the performance cost of a strictly Serializable implementation. While snapshot isolation prevents most common concurrency issues, it's worth noting that it differs from true serializable isolation in that it remains vulnerable to a phenomenon called write skew. A technical deep-dive into write skew reveals how business rules can be violated even when no direct row-level conflict occurs. In a scenario where two doctors are on call and the rule is "at least one must be on call," Doctor A may see that Doctor B is on call and sign off. Simultaneously, Doctor B sees that Doctor A is on call and signs off. Because both transactions read from a "snapshot" where both were present, the final state leaves zero doctors on call—violating the rule despite neither transaction directly modifying the same record at the same time. In this case, the records being modified are the individual availability statuses for Doctor A and Doctor B; because they update different rows, the database does not detect a conflict, even though their combined actions break the business logic. To prevent this issue, you can use the highest isolation level, Serializable, which ensures the final result of concurrent transactions is exactly the same as if they had been executed one after another, effectively eliminating the possibility of a "phantom" change or a logical conflict. Alternatively, you can use Pessimistic Locking through explicit locks to block other transactions from modifying the relevant rows until your transaction is complete. Finally, you may Materialize Conflicts by forcing both transactions to update a shared "parent" record, such as a Shift or Department row, which triggers the database's internal conflict detection mechanisms and forces one of the transactions to wait or fail.
It is worth noting that isolation levels do differ from database to database, as each DBMS may implement them differently and even choose different default levels, resulting in variations in how concurrency issues like dirty reads or phantom reads are handled.
Durability
Durability ensures that once a transaction is committed, its changes are permanently recorded in non-volatile storage (such as an HDD or SSD). This guarantees that data survives system failures, including software crashes or power outages. Once the system restarts, all committed data remains intact and accessible.
Implementation Methods
Database systems typically use one or a combination of the following techniques to ensure durability:
Write-Ahead Logging (WAL): Before any changes are applied to the actual database files, they are recorded in a dedicated log on disk. If a crash occurs, the system "replays" this log to restore the database to its last consistent state.
Snapshotting (Checkpointing): The entire state of the database is periodically saved to disk as a point-in-time image. While this allows for fast recovery, any data modified between snapshots may be lost unless it is paired with a logging mechanism.
Append-Only File (AOF): Every write operation is appended to a continuous log file. To recover, the database simply re-executes the sequence of operations in the file to reconstruct the data set.
Replication: Data is copied across multiple physical machines to ensure that a hardware failure on one node does not result in total data loss.
| Isolation Level | Read Strategy / Locking | Write (Exclusive) Locks | Scope of Lock | Dirty Reads | Non-Repeatable Reads | Phantoms | Lost Updates |
|---|---|---|---|---|---|---|---|
| Read Uncommitted | No Locks. Reads "live" uncommitted data. | Held until Commit. | Individual rows. | May occur | May occur | May occur | May occur |
| Read Committed | Short-lived. Locks released immediately (or versioned). | Held until Commit. | Individual rows. | No | May occur | May occur | May occur |
| Repeatable Read | Long-lived. Read locks held on rows until Commit. | Held until Commit. | Individual referenced rows. | No | No | May occur | No |
| Serializable | Range Locking. Locks all rows and gaps. | Held until Commit. | Entire range of the query. | No | No | No | No |
| Snapshot | Versioned. Reads from a "snapshot" at start. | Held until Commit. | Individual rows. | No | No | No | No |
Concurrency Control
Concurrency Control is the management procedure required to coordinate the simultaneous execution of transactions in a database. Its primary goal is to ensure data integrity and consistency when multiple users access the same data at the same time. This is typically achieved through two main strategies: Optimistic and Pessimistic concurrency.
Optimistic Concurrency Control
Optimistic Concurrency Control works by tracking the version of the data being modified rather than applying immediate locks. Instead of blocking other users, the system allows transactions to proceed independently and only checks for conflicts right before the changes are committed to the database. If it is discovered that another transaction changed the data while the current one was still processing, a conflict is detected, the transaction fails by rolling back, and the application must then retry the operation. This approach scales significantly better for applications with many users and high read-to-write ratios because it completely avoids the performance overhead associated with managing complex locks.
A common implementation of these optimistic principles is Snapshot Isolation, which allows a transaction to see a consistent "snapshot" of the data as it existed at the moment the transaction started. This mechanism enables reads and writes to occur simultaneously without blocking one another, typically utilizing internal row-versioning to detect and resolve any write-write conflicts during the final merge.
Pessimistic Concurrency Control
Pessimistic Concurrency Control assumes that conflicts between transactions are likely to happen and therefore blocks data records as soon as a user starts an update. Under this model, other users are unable to update or, in some cases, even read the data until the original lock is released. While this approach prevents the "wasted work" of a transaction failing at the very end, it can limit scalability and lead to performance bottlenecks if many users are competing for the same records. To manage these interactions, the system uses specific lock modes:
Shared Lock (S): Also known as a Read Lock, it allows multiple transactions to read a resource simultaneously but prevents any writing.
Exclusive Lock (X): Also known as a Write Lock, it is held by only one transaction and blocks all others from both reading and writing.
Update Lock (U): Acts as a precursor to an exclusive lock; it allows others to read the data but ensures only one transaction is "queued" to perform an update, which helps prevent deadlocks.
Song of the day: Dream Theater - Pull Me Under
Top comments (0)