Recently, I have been helping a client implement an event-sourced system. In the process, I put together a very simple demo app, which is available on GitHub.
This demo app uses the banking example where a user can:
- create an account
- check their balance
- withdraw money
- credit the account
DynamoDB is the datastore.
Events
Every time the account holder withdraws from or credits the account, I record an event.
This means that whenever I need to work out the current balance of the account, I have to rebuild its current state from these events.
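To make this concrete, here is a sketch of what the items in the table might look like. The attribute names are my own illustration, not necessarily what the demo app uses:

```typescript
// Hypothetical shape of a row in the account events table.
// AccountId is the HASH key and Version the RANGE key (more on the
// RANGE key in the optimistic locking section below).
type AccountRow = {
  AccountId: string
  Version: number   // monotonically increasing, one per row
  Type: 'AccountCreated' | 'BalanceWithdrawn' | 'BalanceCredited' | 'Snapshot'
  Amount?: number   // present on withdraw/credit events
  Balance?: number  // present on snapshots (and AccountCreated)
}
```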
Snapshots
A common question people ask about event-sourced systems is “how do you avoid reading lots of data on every request?”
The solution is to create snapshots from time to time. In this demo app, I ensure there are regular snapshots of the current state: one snapshot for every 10 rows in the table, to be precise.
These snapshots limit the number of rows I need to fetch on every request; in this case, to a constant cost of at most 10 items per request.
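Here is a minimal sketch of how those regular snapshots could be written, assuming the snapshot is just another row in the same table and occupies its own Version slot (table and attribute names are illustrative):

```typescript
import DynamoDB from 'aws-sdk/clients/dynamodb'

const docClient = new DynamoDB.DocumentClient()

// after appending the event at `version`, write a snapshot as the
// next row once every 10 rows
async function maybeSnapshot(accountId: string, version: number, balance: number) {
  if ((version + 1) % 10 !== 0) {
    return
  }

  await docClient.put({
    TableName: process.env.TABLE_NAME!,
    Item: {
      AccountId: accountId,
      Version: version + 1,
      Type: 'Snapshot',
      Balance: balance,
    },
  }).promise()
}
```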
Rebuilding the current state
To rebuild the current state, I find the most recent snapshot and apply the events since the snapshot was taken.
For example, suppose the most recent snapshot is Version 22, with a Balance of 60, and there have been 3 events since then: two withdrawals of 10 and a credit of 10. The current balance is therefore 60 - 10 - 10 + 10 = 50.
Here’s what it looks like in code:
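The listing below is a minimal sketch of that logic rather than the demo app’s exact code: query the latest rows in descending Version order (at most 10, thanks to the snapshots), find the most recent snapshot, and replay the newer events on top. Table and attribute names are my assumptions:

```typescript
import DynamoDB from 'aws-sdk/clients/dynamodb'

const docClient = new DynamoDB.DocumentClient()

async function getCurrentBalance(accountId: string): Promise<number> {
  // with a snapshot every 10 rows, the latest 10 rows always contain
  // either a snapshot or the AccountCreated event
  const res = await docClient.query({
    TableName: process.env.TABLE_NAME!,
    KeyConditionExpression: 'AccountId = :accountId',
    ExpressionAttributeValues: { ':accountId': accountId },
    ScanIndexForward: false, // newest first
    Limit: 10,
  }).promise()

  const rows = res.Items || []

  // walk from newest to oldest, collecting events until we reach the
  // most recent snapshot (or the AccountCreated event)
  let balance = 0
  const eventsSinceSnapshot: any[] = []
  for (const row of rows) {
    if (row.Type === 'Snapshot' || row.Type === 'AccountCreated') {
      balance = row.Balance
      break
    }
    eventsSinceSnapshot.push(row)
  }

  // replay the events on top of the snapshot, in chronological order
  for (const evt of eventsSinceSnapshot.reverse()) {
    balance += evt.Type === 'BalanceCredited' ? evt.Amount : -evt.Amount
  }

  return balance
}
```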
Optimistic locking
To protect against concurrent updates to the account, the Version attribute is configured as the RANGE key. Whenever I add an event to the DynamoDB table, I check that the version doesn’t already exist.
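In DynamoDB terms, that check is a ConditionExpression on the put. A sketch, using the same assumed names as before:

```typescript
import DynamoDB from 'aws-sdk/clients/dynamodb'

const docClient = new DynamoDB.DocumentClient()

async function appendEvent(
  accountId: string,
  version: number,
  event: Record<string, unknown>
) {
  // the put is rejected with a ConditionalCheckFailedException if a row
  // with this Version already exists, i.e. another writer got there first
  await docClient.put({
    TableName: process.env.TABLE_NAME!,
    Item: { AccountId: accountId, Version: version, ...event },
    ConditionExpression: 'attribute_not_exists(Version)',
  }).promise()
}
```

The caller can catch the ConditionalCheckFailedException, re-read the latest state and retry with the next version.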
Optimizations
To bring down the cold start time as well as improve the warm performance of the endpoints, I applied a number of basic optimizations:
- enable HTTP keep-alive for the AWS SDK
- don’t reference the full AWS SDK
- use webpack to bundle the functions
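For example, with the v2 AWS SDK for JavaScript, the first two optimizations look something like this (in SDK v3, keep-alive is on by default and the clients are already modular):

```typescript
// reference only the DynamoDB client rather than the full aws-sdk package
import DynamoDB from 'aws-sdk/clients/dynamodb'
import https from 'https'

// reuse TCP connections across requests; alternatively, set the
// AWS_NODEJS_CONNECTION_REUSE_ENABLED environment variable to 1
const agent = new https.Agent({ keepAlive: true })

export const docClient = new DynamoDB.DocumentClient({
  httpOptions: { agent },
})
```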
Streaming events to other consumers
It wasn’t included in the demo app, but you can also stream these events to other systems by:
a) letting other services subscribe to the DynamoDB table’s stream
b) creating another Kinesis stream, and converting these DynamoDB INSERT events into domain events such as AccountCreated and BalanceWithdrawn.
My personal preference would be option b. It lets other consumers work with domain events and decouples them from implementation details in your service.
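Here is a sketch of option b: a Lambda function subscribed to the table’s stream, converting INSERT records into domain events on a Kinesis stream. The stream name and event shape are my assumptions, and for brevity it ignores the 500-record limit of putRecords:

```typescript
import { DynamoDBStreamEvent } from 'aws-lambda'
import DynamoDB from 'aws-sdk/clients/dynamodb'
import Kinesis from 'aws-sdk/clients/kinesis'

const kinesis = new Kinesis()

export async function handler(event: DynamoDBStreamEvent) {
  const rows = event.Records
    .filter(r => r.eventName === 'INSERT' && r.dynamodb?.NewImage)
    // convert the raw DynamoDB image into a plain object
    .map(r => DynamoDB.Converter.unmarshall(r.dynamodb!.NewImage as any))
    // snapshots are an implementation detail; don't leak them downstream
    .filter(row => row.Type !== 'Snapshot')

  if (rows.length === 0) {
    return
  }

  await kinesis.putRecords({
    StreamName: process.env.DOMAIN_EVENTS_STREAM!,
    Records: rows.map(row => ({
      // publish a domain event, e.g. AccountCreated or BalanceWithdrawn
      Data: JSON.stringify({
        type: row.Type,
        accountId: row.AccountId,
        version: row.Version,
        amount: row.Amount,
      }),
      PartitionKey: row.AccountId, // preserve per-account ordering
    })),
  }).promise()
}
```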
From here, you can also connect the Kinesis stream to Kinesis Firehose to persist the data to S3 as a data lake. You can then use Athena to run complex, ad-hoc queries over ALL the historical data, generate daily reports, or feed a BI dashboard hosted in QuickSight.
Further reading
If you want to learn more about event-sourcing in the real world (and at scale!), I recommend following this series by Rob Gruhl. Part 2 has some delightful patterns that you can use. You should also check out their Hello-Retail demo app.
Hi, my name is Yan Cui. I’m an AWS Serverless Hero and the author of Production-Ready Serverless. I have run production workloads at scale on AWS for nearly 10 years, and I have been an architect or principal engineer in a variety of industries, from banking and e-commerce to sports streaming and mobile gaming. I currently work as an independent consultant focused on AWS and serverless.
You can contact me via Email, Twitter and LinkedIn.