Event sourcing is a software architecture concept based on the idea that instead of persisting the current state of your application, you persist the stream of events that got it into its current state. The classic example is a bank ledger. Rather than storing the value of each account at the current moment and updating those values, you store each transaction (event), and the value in the account is just a projection of those events.
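To make the ledger example concrete, here is a minimal sketch in Python. The `Transaction` type, the `project_balances` function, and the use of integer cents are assumptions made for illustration, not part of any particular framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    """One immutable event in the ledger (hypothetical shape)."""
    account: str
    amount_cents: int  # positive = deposit, negative = withdrawal


def project_balances(events):
    """Fold the event stream into the current balance of each account."""
    balances = {}
    for event in events:
        balances[event.account] = balances.get(event.account, 0) + event.amount_cents
    return balances


# The append-only log is the source of truth; balances are derived from it.
events = [
    Transaction("alice", 10_000),
    Transaction("bob", 5_000),
    Transaction("alice", -3_000),
]
balances = project_balances(events)
```

Nothing in the log is ever updated in place. The balance of an account is simply whatever the fold over its transactions produces.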
Auditability is the most obvious benefit of event sourcing, but it also gives you a lot of flexibility. You can go back and "query" the event stream to build new projections of the original data that you never thought of at design time. Imagine a banking system where the value of each account was just stored as a mutable entry in a database. Even ignoring the lack of an audit trail, you could never go back and ask questions like "what day do most of our transactions occur on?". By storing the event stream, you can answer these questions even if you never thought to ask them when you designed the system.
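The "what day do most of our transactions occur on" question can be sketched as just another projection over the same events. This assumes each event carries the date it occurred on; the `Transaction` fields and the `busiest_weekday` function are hypothetical names chosen for the example.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date

WEEKDAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
                 "Friday", "Saturday", "Sunday"]


@dataclass(frozen=True)
class Transaction:
    """One immutable ledger event, including when it happened (assumed field)."""
    account: str
    amount_cents: int
    occurred_on: date


def busiest_weekday(events):
    """A projection nobody planned for: count transactions per weekday."""
    counts = Counter(WEEKDAY_NAMES[e.occurred_on.weekday()] for e in events)
    day, _count = counts.most_common(1)[0]
    return day


events = [
    Transaction("alice", 10_000, date(2024, 1, 1)),  # a Monday
    Transaction("bob", 5_000, date(2024, 1, 2)),     # a Tuesday
    Transaction("alice", -3_000, date(2024, 1, 8)),  # a Monday
]
busiest = busiest_weekday(events)
```

A system that only stored mutable balances would have thrown away the information this query depends on.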
Of course, these benefits don't come without downsides. An obvious one is storage. Since each event must be added to an append-only data store, the data will grow forever. Snapshots and retention policies can mitigate this, but at the cost of losing the ability to query back to the beginning of time.
The next challenge is that you must "replay" each event to build up the state of the application. Typically this happens on application startup. As the event store grows, this can become a performance concern. Again, snapshots of the state can easily reduce the pain here by limiting the number of events which must be replayed.
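As a rough sketch of how a snapshot limits replay, the code below restores state from a snapshot plus only the events recorded after it. The `Snapshot` type, its `version` field (the number of events already applied), and the `restore` function are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    account: str
    amount_cents: int


@dataclass(frozen=True)
class Snapshot:
    """State captured after the first `version` events were applied."""
    balances: dict
    version: int


def apply(balances, event):
    """Return a new state with one event applied."""
    updated = dict(balances)
    updated[event.account] = updated.get(event.account, 0) + event.amount_cents
    return updated


def restore(log, snapshot=None):
    """Rebuild state, replaying only the events the snapshot has not seen."""
    balances = dict(snapshot.balances) if snapshot else {}
    start = snapshot.version if snapshot else 0
    for event in log[start:]:
        balances = apply(balances, event)
    return balances


log = [
    Transaction("alice", 10_000),
    Transaction("bob", 5_000),
    Transaction("alice", -3_000),
    Transaction("bob", 2_000),
]
snapshot = Snapshot(balances=restore(log[:2]), version=2)
restored = restore(log, snapshot)  # replays only the last two events
```

Restoring from the snapshot gives the same result as a full replay, as long as the code that produced the snapshot and the code doing the replay agree on how events are applied.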
Possibly the biggest concern, though, is that only the events are stored in the log. The validity of the system is based on the idea that you should always be able to replay the events from the beginning of time and reach the same state your application was in before. But this means that the code which projects those events into their current state must be effectively immutable. If a feature is deprecated, its code cannot be removed because it is still part of the history of the application and is needed in order to return the application to its current state. Even a simple bug fix that affects the replay of events can cause a butterfly effect as it runs against every transaction since the beginning of time. This challenge is more fundamental to the concept of event sourcing and is more difficult to overcome.
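A small, contrived sketch of that butterfly effect: two versions of the same projection, where the second includes a hypothetical "bug fix" that ignores withdrawals which would overdraw the account. Both function names and the overdraft rule are invented for this example.

```python
def project_v1(amounts):
    """Original projection: apply every amount as recorded."""
    balance = 0
    for amount in amounts:
        balance += amount
    return balance


def project_v2(amounts):
    """'Fixed' projection: skip any withdrawal that would overdraw the account."""
    balance = 0
    for amount in amounts:
        if balance + amount < 0:
            continue  # the fix now applies retroactively to old events too
        balance += amount
    return balance


history = [100, -150, 50, -120]   # the same stored events in both cases
state_before_fix = project_v1(history)
state_after_fix = project_v2(history)
```

The events did not change, yet replaying them with the new code produces a different present. One skipped withdrawal early in the history shifts every balance that follows it.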
Learning from the 15th Century
Going back to the ledger, which is usually the prime example of event sourcing, these problems aren't new. In the 15th century, Venice found itself the financial capital of the western world. The world was moving beyond simple face-to-face business transactions, but lacked a means of effective accounting. International trade relies on credit, and credit relies on accounting.
In 1494, Luca Pacioli, a friend of Leonardo da Vinci, published Summa de Arithmetica, which described the concept of double-entry bookkeeping. The idea was fairly simple. Each transaction would be logged along with the new balance. This was essentially the concept of showing your work while doing a math problem in school. The idea revolutionized accounting, and Pacioli is known to this day as the father of modern bookkeeping. Venetian merchants could now track inventory, balances over time, and credits and debits.
(Source: Planet Money: A Mathematician, The Last Supper And The Birth Of Accounting, which is based on "Double Entry: How the Merchants of Venice Created Modern Finance" by Jane Gleeson-White)
Double Entry Accounting and Event Sourcing
Effectively, Pacioli invented the idea of event sourcing. By storing the transactions, Venetians could audit their data and use their transactions to answer questions to help them run their business. This is the same idea behind using event sourcing to power business intelligence in the digital world. Double-entry bookkeeping remains a cornerstone of modern accounting.
But Pacioli's double-entry system had one feature that most event sourcing systems today lack — the double part of double entry. In event sourcing, events are persisted, and state is just a projection of those events. But in double-entry bookkeeping, transactions (events) are stored alongside the new balance (state). From any point, you can check the math and pinpoint any mistakes. If a mistake is found, the record of that mistake persists and a later transaction to reverse the mistake can be added. If the only record of a mistake was that the ending balance was wrong at the end of a long stream of transactions, correcting that mistake accurately would be almost impossible.
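Here is a minimal sketch of what "checking the math" could look like when each entry records the new balance next to the transaction. The `LedgerEntry` type and the `first_mismatch` function are hypothetical names for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LedgerEntry:
    amount_cents: int   # the transaction (event)
    balance_cents: int  # the balance written down at the time (state)


def first_mismatch(entries, opening_balance=0):
    """Return the index of the first entry whose recorded balance does not
    follow from the previous recorded balance, or None if the books add up."""
    previous = opening_balance
    for index, entry in enumerate(entries):
        if previous + entry.amount_cents != entry.balance_cents:
            return index
        previous = entry.balance_cents
    return None


ledger = [
    LedgerEntry(10_000, 10_000),
    LedgerEntry(-2_500, 7_500),
    LedgerEntry(4_000, 11_000),   # arithmetic slip: should have been 11_500
    LedgerEntry(-1_000, 10_000),  # consistent with the (wrong) line above
]
bad_line = first_mismatch(ledger)

# The mistake is corrected by appending a new entry, not by rewriting history.
ledger.append(LedgerEntry(500, 10_500))
```

Because every line carries its own balance, the audit points at the exact entry where the numbers stopped agreeing, and the correcting entry leaves the original mistake visible in the record.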
One of the most common event sourcing frameworks you use every day takes this approach. A git commit stores a message which describes the intended changes, and a SHA-1 hash which acts as a pointer to the new state of the repository. These are essentially parallel to the transaction and balance lines in a transaction ledger. This design makes git fast, resilient, and auditable. You can go back through and audit any commit to pinpoint when a bug was added, and quickly view the repository's state at any point. Looking through and seeing that a line of code doesn't match the commit message is no different than going through a company's accounting books and checking that a transaction was correctly applied. Like any event-sourced system, you can also go through and ask new questions of the original events, such as who contributed the most patches during a given period.
Event sourcing has the power to revolutionize software development much as the advent of modern accounting revolutionized the economy of the 15th century. Event sourcing has challenges, but most of the challenges it faces are not new, or even unique to software. Double-entry bookkeeping can be an important concept in event sourcing. By tracking the new state alongside each transaction, bug fixes can become less dangerous, the need to replay events to restore state on startup vanishes, and you retain the ability to audit your system and glean new information from the event stream. Snapshots in event sourcing systems are usually an afterthought for startup performance, but there can be significant benefits to treating them as a first-class citizen alongside the events.