Event Sourcing: What it is and why it's awesome

Barry O Sullivan on August 29, 2017

At the last PHPDublin meetup I was asked "What do you do?" and as usual the answer boiled down to "I design and build event sourced applications"...
 

My current project is event sourced. I agree it is awesome for all the reasons you describe. I would also add that you can start small and progressively optimize. Small teams don't have to opt into eventual consistency, process managers, snapshots, etc. from the get-go. These can be evolved later as needs arise.

Additional challenge for ES: set validation -- e.g. a unique email. Saying an email is unique creates a relationship between that email and every other email in the set (mutual exclusivity). Log storage is very inefficient at this type of verification, since you have to replay the entire log every time you validate.

The party line on set validation has been to keep the set data with the other eventually-consistent read models. Then in the rare case a duplicate email address slipped in before the eventually consistent data got updated, detect the failure (could not add an email to the set), and compensate (notify an admin).

There is also a school of thought that set-based data should just be modeled as relational data, not event sourced. The main loss there is the audit log. Temporal tables could be used to get an audit log, but they are generally more of a pain to work with. That's the tricky part of being an architect: weighing the trade-offs and picking ones which are best for your product.

 

Agreed, the jury seems to be out on set validation. For that we're using immediately consistent read models, i.e. as soon as a UserCreated event is fired we update the projection. This is not an optimal solution though: we have to force all CreateUser use cases to run sequentially. We can't have two running at the same time, otherwise we can't enforce the constraint.

If you go with this solution you have to have reporting in place to warn you when you're hitting the limit of sequential requests, otherwise requests will start failing.

Once we get these warnings, we're gonna move to the eventually consistent model you mentioned above, and just accept that there could be duplicates.

This isn't a problem unique to event sourcing either. Standard table-driven web apps have the exact same problem when it comes to uniqueness across a set, it's just not as obvious that the issue exists. I think that's because DDD and ES force you to model your constraints explicitly, while in standard web dev it's not as obvious that you're not actually enforcing your set-wide constraints.

 

So I have a line of thought here. Most event sourced systems already have fully-consistent read models in addition to the log itself. Namely, the tables used to track events by aggregate or by event type. They could be reasonably reconstructed from the event log, but they are there as a convenience (an index really) and updated in a fully consistent manner with the event log.

I wonder if it is necessary then that all my read models be eventually consistent. Maybe some of them (especially set-based data) could be fully consistent write models that also get propagated to the read side (if separate databases are used for scaling reads... which is the whole reason for eventual consistency anyway). Then you could validate the set constraint at write time, and even guarantee that duplicates cannot be written (unique index will raise an error if violated while the use case is running) even if the use case checked first, but missed the duplicate due to concurrency. Then there would be no need to make the use cases sequential.

You can definitely do that, and it will work. A unique index will still force sequential operations, but only on that SQL call rather than the entire use case, so the likelihood of failure is much smaller.

The only issue is that you have to handle failing use cases. Say the email was written to the read model at the start of the use case, but some business operation failed and the event was never actually stored in the log. Now you have a read model with invalid data.

If your event log and constraint projections are stored in the same SQL DB, you can solve this with a transaction around the entire use case (we do this).
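To make that concrete, here's a minimal sketch of the approach in PHP, assuming a PDO connection, an `event_log` table, and a `user_emails` projection table with a UNIQUE index on `email` (all names are illustrative, not anyone's actual schema):

```php
<?php
// Sketch: one transaction covers both the uniqueness projection and the
// event append, so a failed use case can't leave the read model with
// invalid data. A concurrent duplicate insert fails on the UNIQUE index.

function createUser(PDO $db, string $userId, string $email): void
{
    $db->beginTransaction();
    try {
        // Constraint projection: the UNIQUE index on `email` rejects duplicates,
        // even ones that slip past an earlier "does it exist?" check.
        $db->prepare('INSERT INTO user_emails (user_id, email) VALUES (?, ?)')
           ->execute([$userId, $email]);

        // Append the event to the log in the same transaction.
        $event = json_encode(['type' => 'UserCreated', 'userId' => $userId, 'email' => $email]);
        $db->prepare('INSERT INTO event_log (aggregate_type, aggregate_id, payload) VALUES (?, ?, ?)')
           ->execute(['User', $userId, $event]);

        $db->commit();
    } catch (PDOException $e) {
        $db->rollBack(); // duplicate email or any other failure: nothing is persisted
        throw $e;
    }
}
```

The index only serialises the conflicting inserts themselves, so unrelated registrations still run in parallel.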

There is still the issue of potential failure, i.e. too many requests hitting the DB, but it's not one you need to deal with at the start. You'll only face this when you have massive numbers of people registering at once. We're currently adding monitoring for this, just to warn us if we're reaching that threshold. It's unlikely we'll reach it, we're not building Facebook/Uber, but it's reassuring to know it's there.

Yes, I was assuming same database as it is the only convenient way to get full consistency. With separate databases, you pretty much have to use eventual consistency. (Well, distributed transactions exist too, but become problematic under load.)

 

"Validation" is not an issue on its own. This is just a subset of eventual consistency issue. Generally speaking, CQRS is suffering from this, not necessarily the event-sourcing.

 

I disagree and don't think this is a problem specific to those patterns. I think it is a human-nature problem: you get a new data hammer and then every bit of data looks like a nail. It's part of how we learn the appropriate uses of our tools, but it's nice when somebody can save you the trouble of learning the hard way.

This is not what I meant. Essentially, the "validation" issue comes up each time you have any bit of eventual consistency, where you have to check the system state in a potentially stale store. CQRS is more often eventually consistent, and the question of validation is the perennial winner on the DDD-CQRS-ES mailing list and on StackOverflow. However, an event-sourced system can have a fully-consistent index on anything that needs to be validated, and therefore this problem can be removed.

I understand what you are saying, but I didn't want to conflate the issue with the universe of DDD/CQRS things. I just wanted to point out that sets are one of the things that log based storage does not do well. Go ahead and look at relational tables for that. Either as the source of truth for set data or -- and I like how you put it -- as an index from log storage.

 

If your model fits ES+CQRS then go for it. If your model isn't easily described around mutations then don't do it. You can get a lot of the power described above by implementing DDD and CQRS without ES. Load an aggregate, call actions to mutate the aggregate and save. You don't lose consistency, you don't need the extra complexity, and you still have a single point to generate projections from. For many small systems, ES is not required.

PS: I've designed and built insurance systems using EventStore. Insurance policies do run on events. However, for other systems it would be a complex sledgehammer to crack a nut.

 

Hi Rob, you've clearly got experience working with event sourced systems, though I'd disagree about stopping at CQRS. CQRS was conceived as a stepping stone to full ES, rather than a destination in and of itself. In CQRS you now have two write models, the aggregate state and the events, leading to the question: which one is the source of truth? How do you ensure that the aggregate state is not out of date with the event stream and vice versa? Even in a small system this can get confusing. I prefer embracing full ES on the command side; it's simpler and forces you to think temporally.

You do bring up an excellent point around the complexity of reading events to build projections; for small systems this can be a killer. I've implemented a similar version of your suggestion: when new events are stored, you also store the aggregate state. It is treated purely as a read model, the aggregate never loads or uses this model, it just writes to it. This gives you something quick to query and to join on for small projections. It's a best-of-all-worlds approach.
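A rough sketch of that write-only state model, with invented table names (the aggregate still rehydrates from the log, never from this table):

```php
<?php
// Sketch: whenever an aggregate's new events are appended, also upsert its
// current state as a read model. The aggregate never reads this back.

function storeAggregate(PDO $db, string $aggregateId, array $newEvents, array $state): void
{
    $db->beginTransaction();

    $append = $db->prepare('INSERT INTO event_log (aggregate_id, payload) VALUES (?, ?)');
    foreach ($newEvents as $event) {
        $append->execute([$aggregateId, json_encode($event)]);
    }

    // Write-only snapshot of current state: quick to query and join on
    // for small projections.
    $db->prepare(
        'INSERT INTO aggregate_state (aggregate_id, state) VALUES (?, ?)
         ON DUPLICATE KEY UPDATE state = VALUES(state)'
    )->execute([$aggregateId, json_encode($state)]);

    $db->commit();
}
```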

 

I'm sorry but I don't agree. CQRS, as defined by Martin Fowler (originally by Greg Young, but his job now is to sell Event Store, so accept his bias), is a separation of the read and write models, nothing more. You do not have two write models; you have one write model (the source of truth) and one read model. The write model can be a document or relational model (whatever fits best). It doesn't need to be an event system at all. There is no event stream to keep "in check with".

When you update the aggregate, you can have synchronous or asynchronous pub/sub that updates the read-side projections. A command, after all, is just an object with the changes. The aggregate then consumes the command however your framework decides. That could be to put it into an event stream, straight into an eventing database, or into the loaded aggregate, which in turn knows how to mutate itself before saving.

There are plenty of .NET libraries that show a perfectly good CQRS system that does not need events. For example, Brighter: brightercommand.github.io/Brighter/

Link to Fowler: martinfowler.com/bliki/CQRS.html

I see your point now. I made the assumption that we were talking about a CQRS event-driven system, as opposed to a CQRS system where the query side reads the aggregate state and then updates itself based on that.

I suppose our point of difference is that I prefer modelling through events rather than structure, whereas you feel this is overkill for simple systems.

For me, even in seemingly simple* systems, events allow me to see the actual process, instead of a structural model that implicitly contains that process. I can still model it in a mutable tree structure, but why bother, when the events do it better and more explicitly?

*Caveat, if the simple system is a supporting domain, then ES and even CQRS are probably overkill, I'm not a fanatic (well, only a little bit) :)

No, sorry, you don't understand. The query loads a projection, which is built from the aggregate. You shouldn't load the whole aggregate to query; if you do, it's not really CQRS. You denormalise your whole data model, storing projections to minimise impedance mismatch and maximise query performance. Quite often that means an RDBMS for the command side and a document DB for the query side. Your infrastructure updates the projections in the document DB (a)synchronously, using a publisher/subscriber model on the aggregate root.

Secondly, it's not about simplicity, it's about the right data model for the right problem. CQRS+ES is just another tool that fits certain problems. I've seen ridiculously complex calculator APIs written in C that run pretty much in memory, dumping usage stats out to Postgres. Highly available and enterprise, but not simple and not CQRS-worthy either.

Don't fixate on one architecture. If you do, you'll find yourself trying to force everything into that form, even when it doesn't fit.

Commands model the process, not the events. The events are the data as stored. Commands create events. You don't model your domain around events but around commands. If you asked a client to write down user stories to work out your aggregates, you'd end up with commands first (what they are trying to do), then the events to hold the data come next. Group commands together and you get aggregates.

How the commands then persist the data isn't important. Whether you hydrate your aggregate through events and handlers or through a document/RDBMS load is up to the best-fit model. Event Store, for example, assumes that your events are immutable. Fine for some applications but not all.
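For illustration, a bare-bones version of that non-event-sourced CQRS flow might look like this (all class and method names invented for the example):

```php
<?php
// Sketch: command -> load aggregate -> mutate -> save -> notify projections.
// No event stream; the saved aggregate is the single source of truth.

final class Policy
{
    public function __construct(public readonly string $id, private string $name) {}
    public function rename(string $name): void { $this->name = $name; }
}

interface PolicyRepository
{
    public function load(string $id): Policy;   // hydrate from document/RDBMS
    public function save(Policy $policy): void; // persist the mutated state
}

final class RenamePolicy
{
    public function __construct(public readonly string $policyId, public readonly string $newName) {}
}

final class RenamePolicyHandler
{
    /** @param callable(RenamePolicy): void $publish notifies read-side projectors */
    public function __construct(private PolicyRepository $policies, private $publish) {}

    public function handle(RenamePolicy $command): void
    {
        $policy = $this->policies->load($command->policyId);
        $policy->rename($command->newName);
        $this->policies->save($policy);

        ($this->publish)($command); // (a)synchronous pub/sub to update projections
    }
}
```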

I hope that makes sense.

Hi Rob, thank you for that. I hadn't thought of modelling a CQRS system like that, it's a really nice solution that solves a lot of problems. There's a lot of food for thought above and I appreciate you taking the time to lay it all out for me, and those reading.

The problem that I have is with the statement that events, and I assume you mean commands also, are not necessarily immutable. Events happen in the past tense. They are by definition immutable. A command is issued; whether or not anything happens because of it is irrelevant. The command was issued. Immutable.

 

Super interesting writeup. This is the first time I've heard of Event Sourcing and it would be really interesting to see a sample project that uses this, like an actual cart as in your example. Do you know of any?

 

Hi Mikael,

I don't have any examples to hand, though there are plenty of them online. The thing about ES is that everyone has a different way of applying it and structuring their code, so there isn't an example I can point to and say "this is how you should do it".

I may write up an example project in future; if I do, I'll post a link here.

 

Hi Barry,

Is your writing of the O'Reilly Event Sourcing Cookbook progressing well? :-)

Learning how to think in event sourcing, what pitfalls to avoid, the smart tricks to keep, etc.

That'd be very valuable.

Event sourcing to the noob sounds like sex to the graduate: it is too much fun not to try it :-)

Stephane

 

Thanks for the reply!

I checked out geteventstore.com/ and got a good grasp of it. It seems very similar to the worker queues I'm currently working with (via kr.github.io/beanstalkd/).

Though, no one seems to recommend building a full CRUD app using ES.

Oh yeah, there are times where CRUD is a better fit than ES. If the solution is simple, once-off and not core to your business, building a CRUD app is fine. So CRUD is a workable implementation for a todo list app that's only used by a handful of people internally in the business.

If it's anything more complex than that (and most things are), or it's something that is crucial to the success of your business, ES is a better fit. It forces you to understand your domain and its language, rather than throwing an extra column into a table to hack a solution in.

To give another example, if you need a blog for your business, for some basic marketing, CRUD is usually fine. If your business is about blogs, understanding how they work and how people use them, then ES is a better fit.

Hope that helps.

 
 

Actually, I just remembered, a friend of mine, @lyonscf , has written a really solid ES example of a Shopping Cart.

github.com/boundedcontext/bounded-...

It's in PHP and written using a framework we co-authored. Look at "Aggregate.php" in Aggregate/Cart to see the events being applied, and the invariants (a synonym for constraints) being checked/enforced. Hope you find it helpful!

 

This is exactly what I was hoping for and more, thanks! I'll look through it and test it out.

 

I found this example at Microsoft docs.microsoft.com/en-us/azure/arc..., but sadly no code. Maybe it is of use for you nevertheless. ;)

 

I find Event Sourcing really interesting, but there are a lot of questions that come into my mind as a newbie in this world. :)

Using this approach requires you (the developer) to model your database using tables that store "events" instead of "entities". Suppose you want to maintain a long list of customer data and retrieve the full list of customers: is it not too slow to scan the entire event table to rebuild the current status of all the available customers?

I know you can do snapshots, but does this approach require you to take snapshots too often in order to keep everything fast?

 

Hey Marco,

Glad to answer. In Event Sourcing you wouldn't have tables per model; you would have one table that stores all the events for your system. This is called the event log.

If you're using MySQL, you'd typically encode the event itself as JSON and just store that. You'd also have columns for the aggregate type (root entity type) and the aggregate ID (root entity ID), so you can easily fetch the subset you need when validating business operations.
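As a rough illustration, such a table might look something like this (schema and names are my assumptions for the example, not an actual setup):

```php
<?php
// Sketch of a single-table MySQL event log, plus fetching one aggregate's
// stream. Connection details and column names are placeholders.

$db = new PDO('mysql:host=localhost;dbname=app', 'user', 'secret');

$db->exec(<<<SQL
CREATE TABLE IF NOT EXISTS event_log (
    sequence       BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    aggregate_type VARCHAR(100) NOT NULL,
    aggregate_id   CHAR(36)     NOT NULL,
    event_type     VARCHAR(100) NOT NULL,
    payload        JSON         NOT NULL,
    recorded_at    TIMESTAMP    NOT NULL DEFAULT CURRENT_TIMESTAMP,
    INDEX idx_aggregate (aggregate_type, aggregate_id)
) ENGINE = InnoDB
SQL);

// Fetch the subset needed to validate a business operation:
$stmt = $db->prepare(
    'SELECT payload FROM event_log
     WHERE aggregate_type = ? AND aggregate_id = ?
     ORDER BY sequence'
);
$stmt->execute(['Customer', 'some-customer-uuid']);
$events = array_map(fn ($json) => json_decode($json, true), $stmt->fetchAll(PDO::FETCH_COLUMN));
```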

Now, this is not optimal for your proposed use case, "retrieve the full list of customers". This is where projections come in. You would create a read model (projection) that is built from all the "customer" events. Every time a "customer" event is fired, this read model is listening and updates itself.
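A toy projector along those lines, with invented event and table names:

```php
<?php
// Sketch: a projection that listens for customer events and keeps a flat
// `customers` table up to date, ready for "give me all customers" queries.

final class CustomerListProjector
{
    public function __construct(private PDO $db) {}

    public function handle(string $eventType, array $payload): void
    {
        match ($eventType) {
            'CustomerRegistered' => $this->db
                ->prepare('INSERT INTO customers (id, name, email) VALUES (?, ?, ?)')
                ->execute([$payload['id'], $payload['name'], $payload['email']]),
            'CustomerEmailChanged' => $this->db
                ->prepare('UPDATE customers SET email = ? WHERE id = ?')
                ->execute([$payload['email'], $payload['id']]),
            default => null, // ignore events this projection doesn't care about
        };
    }
}
```

Retrieving the full list of customers is then a plain `SELECT * FROM customers`, with no event replay involved.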

Hope that answers your question.

 

Thanks for your really clear answers, Barry.

It seems that for any "trouble" you might encounter applying an ES-based model, there is a workaround to balance disadvantages with benefits.

I will dig deeper into the subject, since the greatest obstacle (for me) is more a matter of "mind shifting" than any technical shortcoming. :)

Thanks again!

 

So the aggregate is not the data in the projection? What is this aggregate and its aggregate ID then?

 

Would I be right in assuming that if you want to do a query on anything that isn't the aggregate ID, you'd have to project the entire set for the type in question first?

That's exactly how you'd do it. There are lots of strategies for this; it depends on your use case.

 

Great article, Barry. Thank you.

I've been looking at ES for a while now. We tend to do DDD in PHP and actually have our aggregates release events to act as our audit log and for event-driven messaging between bounded contexts. However, your article raised a couple of questions for me.

First, re. projections you said:

"you build it in the background, storing the intermediate results in a database"

How do you do this in your PHP applications? The only two options I can think of are:

  • to have some watcher PHP process constantly polling your event stream and checking for new events (e.g. it internally stores "I last processed event 7481" and passes any new events on to one or more registered projectors)
  • to have projections happen synchronously when the event is generated i.e. a request comes into the application, PHP takes the request and calls methods on your aggregate which creates events, those events are persisted (in a transaction, in case any of the projections fail) and then that event is passed on to one or more registered projectors.

Do you do either of these or do you do something else (maybe involving a message bus of some kind)?

Second, you said:

"You load the events and replay them in memory, "projecting" them, to build up your dataset"

What do you mean by "data set"? Do you mean an instance of a PHP class (i.e. take a set of events for a given aggregate, pass them all to the apply(Event $e) (or similar) method on that aggregate's class and then work with that object) or do you mean something else?

 

Hi Harrison,

Good questions, glad to answer.

On the first one, we have a background PHP process that listens for events and then passes them to the appropriate projection whenever they are received. When an aggregate is stored, its events are pushed to a queue (Beanstalkd for the local env, SQS for staging/production). The queue's PHP client waits for new messages, so it doesn't have to constantly poll. It will time out eventually, but then you just reconnect and try again.

We use Supervisord to keep the process alive and ensure there's only one instance running.

For queues, we're planning to switch to Kafka in the near future, as it allows each projection to listen to the event queue and keep track of its own position, allowing them to update independently.

You could easily make these projections immediately consistent, and I'd actually recommend that at the start, while it's a single monolith; it's easier to manage.

On the second: that's exactly what I mean. You get the aggregate to replay its events, building up its internal dataset; you then use this dataset to ensure your aggregate is making valid state transitions, e.g. you can't log in a user if they were never registered in the first place.
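Something like this, as a deliberately tiny sketch (a real aggregate would track far more state):

```php
<?php
// Sketch: an aggregate rebuilds its internal state by replaying its events,
// then uses that state to guard transitions.

final class User
{
    private bool $registered = false;

    /** @param list<array{type: string}> $events the aggregate's stored event stream */
    public static function fromEvents(array $events): self
    {
        $user = new self();
        foreach ($events as $event) {
            $user->apply($event); // replay history to rebuild state
        }
        return $user;
    }

    private function apply(array $event): void
    {
        if ($event['type'] === 'UserRegistered') {
            $this->registered = true;
        }
    }

    public function login(): array
    {
        // Invariant: can't log in a user that was never registered.
        if (!$this->registered) {
            throw new DomainException('Cannot log in an unregistered user');
        }
        return ['type' => 'UserLoggedIn']; // the new event to append
    }
}
```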

Hope the above is useful.

 

Thanks for your detailed reply, Barry. A few follow-up questions (my apologies for the lack of basic understanding they betray!) ↓

“When an aggregate is stored, its events are pushed to a queue”

Do you also store those events in a local database? If so, how do you ensure that events are both persisted and received by the queue? I'd imagine it's problematic if events end up in your database but not in the queue, and the UI could show the user an error after the events were persisted locally but before they got onto the queue (leading the user to believe their operation failed despite that being only half true).

“we have a background PHP process that listens for events and then passes them to the appropriate projection whenever they are received”

Could you speak a bit more about this? Is the process taking events from your local database and pushing them onto a queue, or do you mean it's receiving from your queue and routing them to projections?

If it's the former, is this something similar to Laravel's queue worker?

If the latter, why does that need to be a process that 'listens'? Couldn't you just have your queues invoke the application for each event?

Hey Harrison,

Glad to answer, let's give this a shot.

"When an aggregate is stored, it's events are pushed to a queue”

"Do you also store those events in a local database?"

Yes, this is a two-phase operation. First the events are written to the database (or whatever storage you use for events), then they're pushed to the queue.

Now, you raise a valid point: what happens if the event is pushed to the datastore, but not the queue, or vice versa (equally possible)? This is a long-standing problem in any event-driven system, and there are a couple of solutions.

In our implementation, writing to the datastore is transactional; once that completes, the messages are sent to the queue. The messages are used to tell other systems that something has happened; they're just there to broadcast that a change has occurred. Other systems will read a message, then query the datastore to see what's actually happened. In other words, the "projectors" don't trust the events on the queue, they just use them as triggers to read events from the source of truth, the DB.

This still has the problem of "what if messages don't appear on the queue?", but it becomes a problem that sorts itself out once another event appears on the queue: it'll trigger the projectors and they'll update as normal.

BTW, we've never had this kind of failure. It's possible, but very unlikely. And even if it did happen, the system would handle it.

“we have a background PHP process that listens for events and then passes them to the appropriate projection whenever they are received”

"Could you speak a bit more about this?"

Yeah, you pretty much got this in your exploration of an answer. It's receiving these events from a queue. In our current implementation, we have a single queue per service. Each service has a process that pulls events off the queue, in the same way as Laravel's queue workers. When an event is received, it queries the event log for the latest events. It then takes the new events and plays them into each projector.
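A stripped-down sketch of that worker loop, using a stand-in queue interface (any Beanstalkd/SQS client fits behind it; none of these names are from the actual codebase):

```php
<?php
// Sketch: queue messages are only triggers. On each wake-up the worker
// re-reads the source of truth (the event log) past its last position and
// plays the new events into every projector.

interface QueueClient
{
    public function reserve(int $timeoutSeconds): ?string; // blocks until a message or timeout
}

function runWorker(QueueClient $queue, PDO $db, array $projectors): void
{
    $lastSequence = 0; // in practice, load this checkpoint from storage

    while (true) {
        if ($queue->reserve(30) === null) {
            continue; // timed out: loop around and wait again
        }

        // Don't trust the message body; query the log for what actually happened.
        $stmt = $db->prepare(
            'SELECT sequence, event_type, payload FROM event_log
             WHERE sequence > ? ORDER BY sequence'
        );
        $stmt->execute([$lastSequence]);

        foreach ($stmt->fetchAll(PDO::FETCH_ASSOC) as $row) {
            foreach ($projectors as $projector) {
                $projector->handle($row['event_type'], json_decode($row['payload'], true));
            }
            $lastSequence = (int) $row['sequence'];
        }
    }
}
```

This is also why a lost queue message heals itself: the next message triggers a read that picks up everything since the checkpoint.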

If you'd like to discuss this further, drop me a DM on Twitter and we can arrange a Skype call or something. Always glad to discuss.

 

In my experience the biggest problem with an event sourced system is consistency. Events can get dropped, and to make the system robust against such failure modes the design and architecture need to account for what happens when an implicit consistency guarantee is violated because some key event in the pipeline is dropped.

I think event sourcing is a good design when it comes to building systems that need to be audited, but I'm not convinced it is the right design when consistency guarantees are paramount. My experience building event sourced systems was mostly around a massive CI pipeline that needed to integrate with various other workflows, and most of our code ended up being mitigation strategies around dropped/unaccepted events from various sources/sinks.

 

From your description, it sounds like your system was event driven (maybe a downstream event processor? fed from a message bus?), but not event-sourced. In an event sourced system, you don't lose events. It would be equivalent to losing a row in a database -- disaster recovery plans kick in.

Integration between systems is more the realm of Event-Driven Architecture. There it is totally possible to miss events or have them delivered out of order, and that is a large part of the challenge with those integrations. Events are a common concept between EDA and ES, but their uses are different.

I currently have an event sourced system which is fully consistent (between event log and read models). Mainly because I did not have time to implement the necessary extra bits to handle eventual consistency. I will add them later as needs arise. Just to say that consistency level is a choice.

 

I thought fully consistent was "more" or "sooner" consistent than eventually consistent. If you already have full consistency, why and how would you move to eventual consistency?

By fully consistent, I mean that the event log and the read models are updated in the same transaction. Either all changes happen or none do. There's no possibility of inconsistency between them.

Why go eventually consistent? To scale reads. In most systems data is read orders of magnitude more frequently than it is written. In more traditional databases, it is common to see replication employed to make read-only copies of data to handle high read loads. These are eventually consistent (the linked doc says up to 5 minutes!), although not usually called that by name.

How to go eventually consistent with event sourcing? I already have code in place to translate events into SQL statements (required for full consistency). What's still missing to go eventually consistent:

1) Moving event listeners so they are no longer co-located with the write-side API, i.e. hosting them in a different always-running service.
2) Adding checkpointing so each listener can keep track of the last event it saw, in case of restarts.
3) (Optional) A pub/sub mechanism to be notified when new events come in. Alternatively, just poll.

The pattern I use here for read models can be a template for any kind of event listener. Examples: sending emails in response to events, or tracking a process and sending commands back into the system as stages are reached (a Process Manager).
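For point 2 above, a checkpointing sketch (table and function names are illustrative, not Kasey's actual code):

```php
<?php
// Sketch: each listener persists the last event it processed, so after a
// restart it resumes from where it left off instead of replaying everything.

function catchUp(PDO $db, string $listenerName, callable $handle): void
{
    $stmt = $db->prepare('SELECT last_sequence FROM listener_checkpoints WHERE listener = ?');
    $stmt->execute([$listenerName]);
    $last = (int) ($stmt->fetchColumn() ?: 0);

    $events = $db->prepare(
        'SELECT sequence, event_type, payload FROM event_log
         WHERE sequence > ? ORDER BY sequence'
    );
    $events->execute([$last]);

    foreach ($events->fetchAll(PDO::FETCH_ASSOC) as $row) {
        $handle($row['event_type'], json_decode($row['payload'], true));

        // Persist progress after each event, so a crash replays at most
        // the event that was in flight.
        $db->prepare(
            'INSERT INTO listener_checkpoints (listener, last_sequence) VALUES (?, ?)
             ON DUPLICATE KEY UPDATE last_sequence = VALUES(last_sequence)'
        )->execute([$listenerName, (int) $row['sequence']]);
    }
}
```

The same loop works whether the handler updates a read model, sends an email, or drives a process manager.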

Then I just spin up as many copies of the read model service (and a corresponding database) as I need to handle my read load, and put them behind a load balancer. The read load on the event store is relatively low, since each event is read just once per service. As a bonus, because events are immutable, you can employ aggressive caching to avoid hitting the database after the first read of each event.

There exists a product -- Event Store -- which does a lot of this already. However there is no good way to opt into full consistency with relational storage. For our new product/team, full consistency between event log and read models saved some time to market. And I have a path for growing into eventual consistency as the need arises. We may switch to Event Store at some point.

Now that's a carefully crafted answer. Rich and accessible content even. I'll keep it in my text memo. Thanks a lot Kasey !

 

I could not have said that better myself, totally spot on.

 

I would question whether using conventional tables to store events is a good choice. Yes, it seems like one, and you wrote "append only", but indices are there too. We have a conventional event log (not for an event-sourced system) with indices to query it, and it takes a hell of a lot of time to query anything from there, and we often get timeouts writing to it. MS SQL here, well tuned, on powerful machines. So it is just a matter of time until you hit this.

For an event-sourced system, the requirement for your store is at least to have streams. Yes, you can use tables as streams, but that is it. You can probably use views, but they are virtual. I mean you need real streams, like EventStore has. You can partition your events using projections; with linked events you get references to the original events in new streams. This means you can do advanced indexing without paying the cost of a conventional RDBMS index, which is optimised for a different purpose.

I would also question whether having one table for all events is a good choice. Yes, it might make projections easier, but having a stream per aggregate makes much more sense: recovering an aggregate from its events is much easier then. Running projections to a read model would require a per-aggregate-type projection, which in EventStore is elegantly solved by category projections.

 

Hi Alexey,

Thank you for the excellent feedback. You raise some important points.

As you said, MySQL will work well for now (it solves the problem for the near future, 2+ years); after that, we've been told, MySQL will start to struggle with the log, exactly as you described. EventStore is a solid option, and we're looking into it and other technologies better suited to massive event streams.

As for the one table for all events, we've had no issues with it. Now, this doesn't mean there's one event log for ALL events, just for the events produced by a service. We're currently indexing by aggregate ID and aggregate type, so we can easily select the subset we want. We may move to a per-aggregate event store, but I'm not happy with that, as it makes it harder to change aggregate boundaries. We have metrics in place to monitor performance, so once it starts becoming problematic we'll be warned and can prepare a solution.

For projection replaying, rather than connecting to the log, we plan for the projections to connect to a copy of the log, optimised for projection reads. We're thinking of using Kafka for this. It will keep the event log indefinitely (if we want it to) and it will at least ensure ordering. This will give us more life out of our MySQL log and also speed up projection rebuilding.

 

Having the history is very nice. That is very true.

I use the traditional way: Python/Django/PostgreSQL.

Some tables (not all) have a history trigger, which logs modifications of the table to a history table.

Thank you for explaining Event Sourcing.

I love enforced data structures and constraints.

Up to now, I am not convinced.

 

Great post! I would like to hear your take about the following issue:

Suppose a user creates a post. The user probably expects to be redirected to this post after creation. But with eventual consistency you can't just send the user to the post page, because you won't know if it has been created yet. And sending the post as a response to the POST request would infringe the CQRS model, wouldn't it?

According to Daniel Whittaker there are 4 options to solve this issue:
(1) Block user interactions, wait a certain time and try to load the post

seems inefficient to me, and how would you know the post's id?

(2) Just display a confirmation screen

bad user experience, because the user expects to be redirected to the post instantly

(3) Fake the post in the UI

a great idea, but what if the server has to create some of the post model's data?

(4) Asynchronously push data to the client via e.g. web sockets

this seems like the best approach to me, but I am not sure if it will work at scale

So basically I would do something like this:

  • User submits creation via HTTP POST request
  • Api gateway publishes create post command
  • Post service listens to create post command and validates payload
  • Post service publishes post created event
  • WebSocket service listens to post created event and pushes created post to the client via web sockets
  • Browser updates UI (redirects to new post)

I would love to hear your opinion on how to handle the UI with eventual consistency.

 

Hi Barry.

I have quite conflicted feelings about your article; it seems to be a bit opinionated, but that's a different matter.

I have two questions:

  1. You write: "It allows us to talk to the business in their language". I think this statement is badly lacking an example. How exactly does it help? My experience tells me that it's learning the domain that helps to develop a language to speak with the business, not the technology we use.

  2. This one is super-practical. What do you use as an Event Store? I would like to have super-fast writes (obviously), plus at least the easy and fast ability to replay all events, events in a certain stream, and events for a certain object.

 

Hi Victor,

It is definitely opinionated; the goal of the above piece was to sell people on the idea of event sourcing, so they could see the value. My further articles get into more objective discussions of event sourcing and the issues you can encounter, e.g. dev.to/barryosull/immediate-vs-eve... and dev.to/barryosull/event-granularit....

As for your questions:

1. You write: "It allows us to talk to the business in their language".

I think the article explains what I mean quite well. By speaking in terms of events you are actually modelling the domain language. Technology never enters the conversation, only the language used to describe important business state changes, i.e. events. That's how the business owners think about the domain, and ES allows you to express that in a way that both developers and domain experts can conceptualise. ES isn't a technology, it's a technique, and one that is used in many mature domains (e.g. law and accounting).

2. This one is super-practical. What do you use as an Event Store?

This one is easier to answer. I use MySQL as the event store, as RDBMSs are optimised for append-only operations (so super fast writes). I index by the aggregate ID as well, so that it's easy to fetch an aggregate's event stream, and also to lock and ensure only one aggregate instance is updated at a time (aggregates need to be immediately consistent). For projections I would suggest Kafka; it's great for creating and aggregating streams of events, making it easy to spin up projections on the fly (eventually consistent). You could also use Redis streams, but I don't have experience with that technology so I can't give advice. If the idea appeals, though, here's a list of the pros and cons: logz.io/blog/kafka-vs-redis/
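One common way to get that per-aggregate locking with a plain MySQL table is a UNIQUE key on (aggregate_id, version), so two concurrent writers conflict instead of interleaving. A hedged sketch (my schema assumptions, not necessarily Barry's setup):

```php
<?php
// Sketch: optimistic concurrency on append. The caller passes the version
// it loaded the aggregate at; a competing writer's insert hits the UNIQUE
// (aggregate_id, version) key and the transaction rolls back.

function appendEvents(PDO $db, string $aggregateId, int $expectedVersion, array $events): void
{
    $db->beginTransaction();
    try {
        $stmt = $db->prepare('INSERT INTO event_log (aggregate_id, version, payload) VALUES (?, ?, ?)');
        foreach ($events as $i => $event) {
            $stmt->execute([$aggregateId, $expectedVersion + $i + 1, json_encode($event)]);
        }
        $db->commit();
    } catch (PDOException $e) {
        $db->rollBack();
        throw new RuntimeException("Concurrent modification of aggregate {$aggregateId}", 0, $e);
    }
}
```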

 

Thanks for the quick reply, Barry!

I was also thinking about an RDBMS as an event store. MySQL has great performance, and with its support for the JSON datatype it becomes even easier to store arbitrary data. With metadata stored in regular columns (like created_at TIMESTAMP, event_type VARCHAR, aggregate_id INT) you also get lots of ability to SELECT only the required events in the proper order. With indexes on the metadata columns, SELECTs become fast (although that comes at the price of slightly degraded write performance). Having atomicity in place is another big benefit of an RDBMS.

As for business language: I honestly think it's a matter of communication skills and the domain knowledge spread within the team. At least it's much more about that than about the technology in use. I've been talking to the business, leaving tech questions aside, for years (and, well, that means years before I first heard about ES).

 

Hi Barry,

Typos?

for well small apps

Infact,

The is usually

benefits all ready

to be eventually consistency

Thanks a lot for sharing that with us!

I shall look for some examples or books on how to get my hands dirty.

Cheers,

Stephane

 

Good write-up Barry. We did that sort of thing in healthcare: patient gets registered, the patient is admitted, diagnosed, discharged, etc. events. HL7 in healthcare is built on an event model and I love it.

Most enterprise systems are built with the concept that events are generated after data is written to the relational data model, the reason being that they have been using these events primarily for system integration (messaging).

On the other hand, event sourcing builds the data model based on the event series and payload structure.

Do you agree?

 

I definitely do. The status quo in enterprise is to write to a relational model, then broadcast events based on those changes. In effect, the events are projections of the relational data, which in my mind is putting the cart before the horse. This is a flavour of CQRS, and is considered a stepping stone in migrating to an Event Sourced system, rather than the end result.

Greg Young talks about it during this talk.

I didn't know HL7 was message based, I'll have to read up on it more, thanks for that!

 

Very nice article! I was thinking about exploring the whole CQRS system, but I was wondering how you manage a big app evolving that way, needing more and more migrations to keep it "on par" with live releases, from event zero onwards?

 

I'm not sure what you mean, so it's hard to answer.

In terms of migrations, event sourced apps use migrations for the database-backed projections, in the same way regular apps do. If the schema changes shape, you create a migration. So as your system evolves, you'll get more and more migrations, like any standard web app.

Hope that helps. If it's not on the right track, let me know and I'll do my best to answer.

 

This is a nice article, but I think there's a slight mix-up between CQRS and Event Sourcing principles. Of course both work pretty well together, but characteristics like eventual consistency come from CQRS, not Event Sourcing.
Nice article though.

 

Nice write-up. I've a 3-part series of blog posts around Event Sourcing + CQRS with some code samples (in .NET) for anyone keen. Part one covers the introduction, which was pretty much covered in this post. Part 2 is where the code samples are. dasith.me/2016/12/31/event-sourcin...

 

Yes! Great write up! I'm a huge fan of event sourced architectures, especially how well they can be used in serverless architecture. Keep spreading the word!

 
 

Nice job! Really great article.

For those interested in event sourcing I think it's also worth looking at the article by Fowler: martinfowler.com/eaaDev/EventSourc...

Thanks for sharing!

 

Sounds very much like taking a programming style similar to Elm applications to the backend. I've been thinking about it for a while as well, particularly for the possibility of "time-travel debugging" type things.

youtube.com/watch?v=RUeLd7T7Xi4

 

I'm just hearing about this for the first time and would love to give it a shot. What would be your recommended source to learn about Event Sourcing?

 

I'd start with this talk from Greg Young on the concept; it gives some more detail on it.
youtube.com/watch?v=JHGkaShoyNs

Then I'd follow it up with this: a site that goes through all the technical concepts of ES (as well as DDD) and gives a solid breakdown of each concept. I'd start with the FAQ as a primer, then read the rest.
cqrs.nu/Faq

I still visit this site.

 

I'm an editor at InfoQ.com.cn. May I translate your post into Chinese with appropriate credit?

 
 

Very good article. I've read so many about ES, but this one nailed it. So far the best approach!
