📦 Data consistency, outbox pattern and idempotency in a microservice architecture
CAP Theorem
In the late 1990s, the computer scientist Eric Brewer first presented the CAP theorem. The theorem expresses a "two out of three" idea: any distributed system can provide at most two of the following guarantees:
- Consistency: every request receives the most recent data or an error;
- Availability: every request receives a response, without the guarantee that it contains the most recent data;
- Partition tolerance: the system continues to operate despite an arbitrary number of messages being dropped (or delayed) by the network between nodes.
Consider a web app with a network connection between a database and a back-end application, or between different services in a microservice architecture. Because the network can always fail, such an app must be partition tolerant: the system has to keep working correctly even after the network is partitioned. Once a partition occurs, the only decision left is whether to cancel the operation to preserve consistency, or to proceed with it, providing availability at the risk of inconsistency.
Here is an example to clarify the CAP theorem. Imagine an e-commerce system in which two services are involved in completing an order: order and catalog. Before completing the order, the order service has to call the catalog API to check whether the products are available. If the catalog API is unavailable at that moment for any reason (a partition happened), the order service can behave in two different ways:
- [Consistency] Choose strong consistency and return an error;
- [Availability] Give up strong consistency and tell the client that the request will eventually be processed.
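The two behaviours above can be illustrated with a minimal sketch. Everything in it is hypothetical and only serves the example: the `OrderService` class, the `prefer_availability` flag, and the `check_stock` function, which stands in for the remote call to the catalog API.

```python
from dataclasses import dataclass, field


class CatalogUnavailable(Exception):
    """Raised when the (hypothetical) catalog API cannot be reached."""


def check_stock(product_id: str, catalog_up: bool) -> bool:
    # Stand-in for a remote call to the catalog API.
    if not catalog_up:
        raise CatalogUnavailable(product_id)
    return True


@dataclass
class OrderService:
    prefer_availability: bool
    # Orders accepted during a partition, waiting for a later stock check.
    pending: list = field(default_factory=list)

    def place_order(self, product_id: str, catalog_up: bool) -> str:
        try:
            in_stock = check_stock(product_id, catalog_up)
        except CatalogUnavailable:
            if not self.prefer_availability:
                # [Consistency] refuse to answer rather than risk a wrong answer.
                return "error: catalog unavailable"
            # [Availability] accept now and validate once the partition heals.
            self.pending.append(product_id)
            return "accepted: will be processed eventually"
        return "confirmed" if in_stock else "rejected: out of stock"
```

With `prefer_availability=False` the client gets an error during a partition. With `prefer_availability=True` the order is queued and the client is told it will be processed later, which is the trade-off the outbox pattern makes.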
The pattern discussed in this article is an eventual consistency pattern: the outbox pattern gives up strong consistency in order to focus on availability.
Outbox pattern
The outbox pattern only makes sense for distributed systems; discussing it in a monolithic scenario makes no sense. The problem it solves is: how do you reliably (atomically) update the database and send messages/events?
The way the pattern solves this problem is relatively easy to understand. It can be described in the following steps:
- A service that persists data in a database also inserts messages/events into a table (called the outbox table) as part of the local transaction;
- The service appends the messages/events to an attribute of the record being updated;
- Another process, called the Message Relay, publishes the events inserted into the database to a message broker;
- If something goes wrong, the Message Relay process retries sending the event a few times until the configured limit is reached;
- The messages/events are stored on the consumer side too.
Edited version from: https://github.com/dotnetcore/CAP
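The producer-side steps above can be sketched with Python's built-in `sqlite3` module. This is a minimal illustration under stated assumptions, not how CAP or any other library implements it: the table layout, the `MAX_ATTEMPTS` limit, the `OrderPlaced` event and the `publish` callback (a stand-in for a real broker client) are all hypothetical.

```python
import json
import sqlite3
import uuid

MAX_ATTEMPTS = 3  # assumed retry limit for the relay


def create_schema(conn):
    conn.executescript(
        """
        CREATE TABLE orders (id TEXT PRIMARY KEY, product TEXT NOT NULL);
        CREATE TABLE outbox (
            id TEXT PRIMARY KEY,
            payload TEXT NOT NULL,
            attempts INTEGER NOT NULL DEFAULT 0,
            published INTEGER NOT NULL DEFAULT 0
        );
        """
    )


def place_order(conn, product):
    """Insert the order and its event in the same local transaction."""
    order_id = str(uuid.uuid4())
    event = {"type": "OrderPlaced", "order_id": order_id, "product": product}
    with conn:  # commits both inserts together, or rolls both back on error
        conn.execute(
            "INSERT INTO orders (id, product) VALUES (?, ?)", (order_id, product)
        )
        conn.execute(
            "INSERT INTO outbox (id, payload) VALUES (?, ?)",
            (str(uuid.uuid4()), json.dumps(event)),
        )
    return order_id


def relay_once(conn, publish):
    """One pass of the message relay over the events still pending."""
    rows = conn.execute(
        "SELECT id, payload FROM outbox WHERE published = 0 AND attempts < ?",
        (MAX_ATTEMPTS,),
    ).fetchall()
    for message_id, payload in rows:
        try:
            publish(message_id, payload)  # hand the event to the broker
        except Exception:
            # Publishing failed: count the attempt and retry on a later pass.
            with conn:
                conn.execute(
                    "UPDATE outbox SET attempts = attempts + 1 WHERE id = ?",
                    (message_id,),
                )
        else:
            # Publishing succeeded: mark the row so it is not sent again.
            with conn:
                conn.execute(
                    "UPDATE outbox SET published = 1 WHERE id = ?", (message_id,)
                )
```

In a real service `relay_once` would run periodically in a separate process. Note that publishing and marking the row as published are two separate operations, so a failure between them leaves the event eligible to be sent again.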
So the outbox pattern would guarantee data consistency between the services, but what if the events are consumed twice? This is where idempotency comes in.
Idempotency
With a broker that offers at-least-once delivery, a message can be persisted more than once in two different situations:
- The producer produces a message and sends it to the broker; the consumer stores the data in the database but doesn't return an ack in a timely manner. The broker then concludes that the message was not processed and sends it again;
- In the outbox scenario, the producer stores the message in the outbox table and sends it to the broker but, for some reason, is unable to update the outbox table to record that the message was published. It will therefore keep sending the message until the outbox table has been updated.
NOTE: this can be even worse in a multiprocessing scenario.
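The second situation can be reproduced with a small in-memory simulation. The names here (`outbox`, `broker`, `relay_pass`, `mark_published_works`) are hypothetical; the flag simply simulates a relay that publishes successfully but fails before it can update the outbox table.

```python
outbox = [{"id": "evt-1", "published": False}]
broker = []  # IDs of the messages the broker has received


def relay_pass(mark_published_works: bool) -> None:
    for message in outbox:
        if message["published"]:
            continue
        broker.append(message["id"])  # step 1: publishing succeeds
        if mark_published_works:
            message["published"] = True  # step 2: record that it was sent
        # If step 2 fails (crash, lost connection), the row still looks
        # unpublished, so the next pass sends the same message again.


relay_pass(mark_published_works=False)  # published, but not marked
relay_pass(mark_published_works=True)   # published again, now marked
relay_pass(mark_published_works=True)   # nothing left to send
print(broker)  # ['evt-1', 'evt-1']
```

The broker ends up with the same event twice, so the consumer has to cope with the duplicate.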
To make your consumer idempotent, you can record in the database the ID of each message/event that has been successfully processed. When the consumer receives a new message, it is then able to detect and discard duplicates.
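A minimal sketch of such a consumer, again using `sqlite3`, could look like the following. The `processed_messages` and `stock_reservations` tables and the `handle` function are hypothetical names chosen for the example. The sketch assumes the message ID is recorded in the same local transaction as the business change and relies on a primary key to reject duplicates.

```python
import sqlite3


def create_schema(conn):
    conn.executescript(
        """
        CREATE TABLE processed_messages (message_id TEXT PRIMARY KEY);
        CREATE TABLE stock_reservations (order_id TEXT NOT NULL);
        """
    )


def handle(conn, message_id, order_id):
    """Process a message once; return False when it is a duplicate."""
    try:
        with conn:  # one transaction: record the ID and apply the effect
            conn.execute(
                "INSERT INTO processed_messages (message_id) VALUES (?)",
                (message_id,),
            )
            conn.execute(
                "INSERT INTO stock_reservations (order_id) VALUES (?)",
                (order_id,),
            )
    except sqlite3.IntegrityError:
        # The primary key rejected the ID, so this message was already handled.
        return False
    return True
```

Calling `handle` twice with the same message ID applies the reservation only once; the second call is rolled back and reported as a duplicate. Because the uniqueness check is enforced by the database rather than by a prior lookup, it also holds when several consumer instances race on the same message.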
Conclusion
The outbox pattern is an eventual consistency pattern that prioritizes the system's availability, but it is not a silver bullet. When using it, be careful about duplicate message consumption, for example by adopting an idempotent consumer approach.
There are many .NET libraries that help you implement the outbox pattern, such as MassTransit, NServiceBus and CAP. On the idempotency side, a special mention goes to a library written by a good friend that runs on top of CAP, called Ziggurat.
If you made it this far and liked the article, let me know by reacting to this post. You can also open a discussion below, and I'll try to answer ASAP. In the next article, I'll show the code and everything you need to build a system using the outbox pattern and idempotency with .NET, CAP and Ziggurat. Hope you like it!
References
CAP Playground, 📤 Just playing a bit with CAP and outbox pattern
[PT-BR] JS+, Data consistency, outbox pattern and idempotency in a microservice architecture with .NET; JS+ TechTalks #22 - Edição Lisboa
Richardson, Chris; Pattern: Idempotent Consumer
Richardson, Chris; Pattern: Transactional outbox



