A distributed transaction is a set of operations on data that is performed across two or more databases. It is typically coordinated across separate nodes connected by a network, but may also span multiple databases on a single server.
Why do we need distributed transactions?
Unlike an ACID transaction on a single database, a distributed transaction involves altering data on multiple databases. Consequently, distributed transaction processing is more complicated, because the database must coordinate the committing or rollback of the changes in a transaction as a self-contained unit.
In other words, all the nodes must commit, or all must abort and the entire transaction rolls back. This is why we need distributed transactions.
Now, let's look at some popular solutions for distributed transactions:
Two-Phase commit
The two-phase commit (2PC) protocol is a distributed algorithm that coordinates all the processes that participate in a distributed transaction on whether to commit or abort (roll back) the transaction.
This protocol achieves its goal even in many cases of temporary system failure and is thus widely used. However, it is not resilient to all possible failure configurations, and in rare cases, manual intervention is needed to remedy an outcome.
This protocol requires a coordinator node, which basically coordinates and oversees the transaction across different nodes. The coordinator tries to establish the consensus among a set of processes in two phases, hence the name.
Phases
Two-phase commit consists of the following phases:
Prepare phase
The prepare phase involves the coordinator node collecting consensus from each of the participant nodes. The transaction will be aborted unless each of the nodes responds that they're prepared.
Commit phase
If all participants respond to the coordinator that they are prepared, then the coordinator asks all the nodes to commit the transaction. If a failure occurs, the transaction will be rolled back.
Problems
Following problems may arise in the two-phase commit protocol:
- What if one of the nodes crashes?
- What if the coordinator itself crashes?
- It is a blocking protocol.
Three-phase commit
Three-phase commit (3PC) is an extension of the two-phase commit where the commit phase is split into two phases. This helps with the blocking problem that occurs in the two-phase commit protocol.
Phases
Three-phase commit consists of the following phases:
Prepare phase
This phase is the same as the two-phase commit.
Pre-commit phase
Coordinator issues the pre-commit message and all the participating nodes must acknowledge it. If a participant fails to receive this message in time, then the transaction is aborted.
Commit phase
This step is also similar to the two-phase commit protocol.
Why is the Pre-commit phase helpful?
The pre-commit phase accomplishes the following:
- If the participant nodes are found in this phase, that means that every participant has completed the first phase. The completion of prepare phase is guaranteed.
- Every phase can now time out and avoid indefinite waits.
Sagas
A saga is a sequence of local transactions. Each local transaction updates the database and publishes a message or event to trigger the next local transaction in the saga. If a local transaction fails because it violates a business rule then the saga executes a series of compensating transactions that undo the changes that were made by the preceding local transactions.
Coordination
There are two common implementation approaches:
- Choreography: Each local transaction publishes domain events that trigger local transactions in other services.
- Orchestration: An orchestrator tells the participants what local transactions to execute.
Problems
- The Saga pattern is particularly hard to debug.
- There's a risk of cyclic dependency between saga participants.
- Lack of participant data isolation imposes durability challenges.
- Testing is difficult because all services must be running to simulate a transaction.
This article is part of my open source System Design Course available on Github.
karanpratapsingh / system-design
Learn how to design systems at scale and prepare for system design interviews
System Design
Hey, welcome to the course. I hope this course provides a great learning experience.
This course is also available on my website and as an ebook on leanpub. Please leave a ⭐ as motivation if this was helpful!
Table of contents
-
Getting Started
-
Chapter I
-
Chapter II
-
Chapter III
- N-tier architecture
- Message Brokers
- Message Queues
- Publish-Subscribe
- Enterprise Service Bus (ESB)
- Monoliths and Microservices
- Event-Driven Architecture (EDA)
- Event Sourcing
- Command and Query Responsibility Segregation (CQRS)
- API Gateway
- REST, GraphQL, gRPC
- Long polling, WebSockets, Server-Sent Events (SSE)
-
Chapter IV
- …
Top comments (0)