DEV Community

Shredded Mustard

SAGA Pattern

When dealing with scalable applications, things can get messy quickly. If you don’t follow established programming principles and structured design patterns, adding new features can become hectic and time-consuming. You may not realize exactly when things went wrong, and you may be uncertain about the steps needed to fix them. This is especially true in distributed systems, where following well-structured design patterns is essential.

In microservice architectures, a common practice is the Database per Service pattern. Each microservice has its own exclusive database, for which it alone is responsible, and no other microservice can read or modify it directly. This brings design benefits such as stronger encapsulation, loose coupling, separation of concerns, and adherence to the single responsibility principle. However, it also makes transactions that span several services hard to manage.

In traditional single-database systems, we often perform multiple database operations transactionally. A transaction has a beginning and an end, and as long as it remains uncommitted, everything done within it can be rolled back. All operations within the transaction either succeed together or fail together, and once we commit, there is no going back. This all-or-nothing property is atomicity, one of the four guarantees of an ACID transaction (atomicity, consistency, isolation, durability). It keeps the system consistent by ensuring that all tables or entities are updated together, so the data never ends up half-updated. If we want to undo a committed transaction, we must run another transaction that reverses it.
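As a minimal sketch of that all-or-nothing behavior, the following uses Python's built-in sqlite3 module with an in-memory database. The table names, columns, and the `return_ticket` function are hypothetical, chosen only for illustration.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create a throwaway in-memory database with one sold ticket (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tickets (id INTEGER PRIMARY KEY, status TEXT NOT NULL)")
    conn.execute("CREATE TABLE user_tickets (user_id INTEGER NOT NULL, ticket_id INTEGER NOT NULL)")
    conn.execute("INSERT INTO tickets VALUES (1, 'Sold')")
    conn.execute("INSERT INTO user_tickets VALUES (42, 1)")
    conn.commit()
    return conn


def ticket_status(conn: sqlite3.Connection, ticket_id: int) -> str:
    row = conn.execute("SELECT status FROM tickets WHERE id = ?", (ticket_id,)).fetchone()
    return row[0]


def return_ticket(conn: sqlite3.Connection, user_id: int, ticket_id: int,
                  fail_midway: bool = False) -> bool:
    """Apply both writes atomically; return False if the transaction was rolled back."""
    try:
        # Used as a context manager, the connection commits if the block
        # succeeds and rolls back if the block raises.
        with conn:
            conn.execute("UPDATE tickets SET status = 'Available' WHERE id = ?", (ticket_id,))
            if fail_midway:
                raise RuntimeError("simulated failure between the two writes")
            conn.execute(
                "DELETE FROM user_tickets WHERE user_id = ? AND ticket_id = ?",
                (user_id, ticket_id),
            )
        return True
    except RuntimeError:
        return False
```

Calling `return_ticket(conn, 42, 1, fail_midway=True)` leaves the ticket in its original state, because the first write is rolled back together with the failed transaction.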

In distributed systems, things become more complex because each service owns its own database, so a single database transaction usually cannot span multiple microservices. The simplest solution is to avoid cross-service transactions altogether, though designing a system so that no operation needs one can be complex and demands critical thinking. Complete avoidance is not always feasible.

There are mechanisms for maintaining a single transaction across multiple microservices, such as two-phase commit (2PC), but they are often slow because every participant must wait for the coordinator’s decision, they are heavy to implement, and they can become complex and impractical.
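To make that cost concrete, here is a minimal in-process sketch of a two-phase commit. The `Participant` class and `two_phase_commit` function are hypothetical names for illustration; a real implementation would also need durable logs, timeouts, and coordinator recovery, all of which are omitted here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Participant:
    """Stand-in for one service taking part in the transaction (hypothetical)."""
    name: str
    can_commit: bool = True
    state: str = "idle"  # idle -> prepared -> committed, or aborted

    def prepare(self) -> bool:
        # Phase 1: vote. After voting "yes" the participant must keep its
        # locks and wait until the coordinator announces the outcome.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self) -> None:
        self.state = "committed"

    def abort(self) -> None:
        self.state = "aborted"


def two_phase_commit(participants: List[Participant]) -> bool:
    # Phase 1: collect a vote from every participant.
    votes = [p.prepare() for p in participants]
    # Phase 2: commit only if all voted "yes", otherwise abort everyone.
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.abort()
    return False
```

Every participant is blocked between its vote and the coordinator's decision, which is one reason the approach tends to be slow across network boundaries.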

If your application needs strong consistency, the operations must stay inside a single atomic transaction. If you can accept eventual consistency, where data becomes consistent over time rather than immediately, you have more flexibility.

A popular design pattern for handling this is the SAGA Pattern.

Consider a ticket reservation web application. A user logs in and wants to return a ticket they’ve already purchased. To process the return, the server must perform three operations:

  • Update the status of the ticket to "Available"

  • Remove the ticket from the user’s acquired tickets list

  • Issue a refund request

If our application is monolithic, all three operations are performed in a single transactional context, so they either all succeed or all fail together. In a microservices setup, each operation is committed by a different service, so we must wait for each step to finish to learn whether it succeeded. If a later step fails, the earlier steps have already been committed and we need to undo them explicitly.

Let’s first visualize our scenario in a distributed system.
[Diagram: ticket return flow through the API Gateway and the Tickets, User, and Payment services]

The user requests a refund through the API Gateway, which sends a request to the Tickets microservice. The Tickets service changes the ticket status and sends a request to the User Service. The User Service updates the user’s ticket list and sends a request to the Payment Service to issue the refund. The Payment Service communicates with a third-party payment gateway provider to transfer the funds back to the user and updates its own database. While the flow may vary depending on the implementation, the issue of transaction management remains the same.

If any request fails, we need to roll back previous operations to maintain consistency.

The SAGA Pattern

The SAGA pattern addresses this by defining a rollback mechanism across microservices. It is commonly used in distributed systems where a business transaction spans multiple microservices. The pattern breaks the transaction down into a series of smaller local transactions, each managed by a single service. Together, these steps form a distributed transaction. If a step fails at any point in the SAGA, compensating actions (rollback steps) are executed to undo the steps that already completed, restoring data consistency. Because earlier steps are already committed when a later one fails, the system is only eventually consistent: other services may briefly observe the intermediate state. This approach applies to transactions that follow a sequence of operations.

There are two types of SAGA patterns.

  1. Orchestration-Based

In an orchestration-based SAGA, a dedicated orchestrator service manages the transaction. The orchestrator knows which services to call and in what order, and it executes each step sequentially. If a step succeeds, it moves on to the next. If a step fails, it rolls back all previously completed steps and informs the user of the reason for the failure. To adapt our example to an orchestration-based SAGA, we add an Orchestrator service to control the flow.

Here is how it goes:
[Diagram: orchestration-based SAGA flow]

The Orchestrator service now has control over the entire process and knows how to handle any failure, allowing the user to see exactly what went wrong. However, this pattern introduces complexity and makes the orchestrator a single point of failure. Additionally, any new operation must be configured in the Orchestrator, including its rollback call.
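A minimal in-process sketch of such an orchestrator, assuming each step is a pair of plain Python callables: an action and its compensating action. The `SagaStep` and `SagaOrchestrator` names and the stand-in service functions are hypothetical; in a real system each callable would be a network call to the corresponding service.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class SagaStep:
    name: str
    action: Callable[[], None]      # performs the step in one service
    compensate: Callable[[], None]  # undoes the step if a later one fails


class SagaOrchestrator:
    def __init__(self, steps: List[SagaStep]) -> None:
        self.steps = steps
        self.failed_step: Optional[str] = None

    def run(self) -> bool:
        completed: List[SagaStep] = []
        for step in self.steps:
            try:
                step.action()
            except Exception:
                self.failed_step = step.name
                # Undo the completed steps in reverse order. Failures inside
                # a compensation are not handled in this sketch.
                for done in reversed(completed):
                    done.compensate()
                return False
            completed.append(step)
        return True


# Hypothetical stand-ins for the three services in the ticket return example.
log: List[str] = []


def refund_fails() -> None:
    raise RuntimeError("payment gateway unavailable")


saga = SagaOrchestrator([
    SagaStep("tickets",
             lambda: log.append("ticket -> Available"),
             lambda: log.append("ticket -> Sold")),
    SagaStep("users",
             lambda: log.append("ticket removed from user"),
             lambda: log.append("ticket restored to user")),
    SagaStep("payment",
             refund_fails,
             lambda: log.append("refund cancelled")),
])
succeeded = saga.run()
```

Here the payment step fails, so the orchestrator records `failed_step` and compensates the users step and then the tickets step. Note that adding a new operation means registering both its action and its compensation with the orchestrator.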

To reduce this coupling, we can use the Choreography-based SAGA pattern with event-driven architecture.

  2. Choreography-Based

As the name suggests, the Choreography-based SAGA pattern lets multiple services work together in a choreographed manner, with no central coordinator. Each service involved in the transaction triggers the next step and listens for events to decide its next action.

If an operation fails, the failing service publishes an event that tells the preceding services to roll back. For example, if the Users service fails to update the user’s ticket list, it publishes a "User Ticket List Update Failed" event, which only the Tickets Service listens to and uses to perform its rollback. If the Payment Service fails, it publishes a "Payment Refund Failed" event, which both the Tickets and Users services listen to, each performing its own rollback. To deliver these events, we need a Message Broker.

Now our happy flow looks like this:
[Diagram: choreography-based happy flow through the Message Broker]

Now to add the SAGA pattern mechanism:
[Diagram: choreography-based flow with failure events and rollbacks]

Now, in the successful flow, each service listens for the preceding service’s success event. For instance, the Users service listens for a "Ticket Return Successful" event, while the Payment Service listens for a "User Ticket List Update Success" event. After the Payment Service completes processing, it notifies the user by publishing to the notification service.

In the failure flow, each service listens for the failure event of the preceding service, triggering a chain of rollback events as needed.
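The two flows can be sketched with a synchronous, in-memory stand-in for the message broker. The class names, the "Ticket Return Requested" and "Payment Refund Successful" events, and the `gateway_up` flag are assumptions added for illustration; the other event names follow the text. The notification service is left out to keep the sketch short.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List, Set

Handler = Callable[[dict], None]


class MessageBroker:
    """In-memory stand-in for a real broker; delivers events synchronously."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)
        self.history: List[str] = []

    def subscribe(self, event: str, handler: Handler) -> None:
        self._subscribers[event].append(handler)

    def publish(self, event: str, payload: dict) -> None:
        self.history.append(event)
        for handler in list(self._subscribers[event]):
            handler(payload)


class TicketsService:
    def __init__(self, broker: MessageBroker) -> None:
        self.broker = broker
        self.status: Dict[int, str] = {1: "Sold"}
        broker.subscribe("Ticket Return Requested", self.on_return_requested)
        # The Tickets service must undo its step if either later service fails.
        broker.subscribe("User Ticket List Update Failed", self.rollback)
        broker.subscribe("Payment Refund Failed", self.rollback)

    def on_return_requested(self, p: dict) -> None:
        self.status[p["ticket_id"]] = "Available"
        self.broker.publish("Ticket Return Successful", p)

    def rollback(self, p: dict) -> None:
        self.status[p["ticket_id"]] = "Sold"


class UsersService:
    def __init__(self, broker: MessageBroker) -> None:
        self.broker = broker
        self.tickets: Dict[int, Set[int]] = {42: {1}}
        broker.subscribe("Ticket Return Successful", self.on_ticket_returned)
        broker.subscribe("Payment Refund Failed", self.rollback)

    def on_ticket_returned(self, p: dict) -> None:
        self.tickets[p["user_id"]].discard(p["ticket_id"])
        self.broker.publish("User Ticket List Update Success", p)

    def rollback(self, p: dict) -> None:
        self.tickets[p["user_id"]].add(p["ticket_id"])


class PaymentService:
    def __init__(self, broker: MessageBroker, gateway_up: bool = True) -> None:
        self.broker = broker
        self.gateway_up = gateway_up  # simulates the third-party gateway
        self.refunds: List[int] = []
        broker.subscribe("User Ticket List Update Success", self.on_user_updated)

    def on_user_updated(self, p: dict) -> None:
        if self.gateway_up:
            self.refunds.append(p["ticket_id"])
            self.broker.publish("Payment Refund Successful", p)
        else:
            self.broker.publish("Payment Refund Failed", p)


def run_return(gateway_up: bool):
    """Wire up the services and kick off one ticket return."""
    broker = MessageBroker()
    tickets = TicketsService(broker)
    users = UsersService(broker)
    payments = PaymentService(broker, gateway_up)
    broker.publish("Ticket Return Requested", {"user_id": 42, "ticket_id": 1})
    return broker, tickets, users, payments
```

With `run_return(gateway_up=False)`, the "Payment Refund Failed" event reaches both the Tickets and Users services, and each restores its own state. No service knows about the others; they only know which events to listen for.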

SAGA Pattern review

The SAGA pattern divides a transaction into distributed steps, where a failure in one service triggers compensating actions that roll back the previous steps, ultimately restoring consistency across the system.

The SAGA pattern is ideal for microservices architectures requiring eventual consistency and flexibility in managing long-running, distributed transactions, especially in domains like e-commerce, travel, banking, and other transactional applications. However, the added complexity in designing and monitoring SAGA workflows should be considered based on your application’s requirements.

When dealing with distributed systems, cross-microservice transactions should generally be avoided. While design solutions exist, they are often complex and add application overhead. Unlike in monolithic transactions, rolling back a distributed transaction is not straightforward, so whenever you apply such a pattern, implement it carefully and test it thoroughly, including the failure and rollback paths.

The next pattern we will review is CQRS.
