Discussion on: A Deconstruction of CQRS

Replies for: Hi Kasey, sorry for being late, this article was bookmarked ages ago and I got around it only now :-) Great explanation. I have a few questions......

Hey rhymes! No worries about being late. Hopefully the ideas I have written here will be relevant for more than just a couple of months. So I won't consider comments late until an obviously better idea comes along. :)

I can definitely see the parallel you mention with REST and GraphQL. It has occurred to me as well. However, I have trouble putting these under the umbrella of CQRS. I think the main difficulty is that those patterns seem to be very entity-focused. In my mind, CQRS is more like modeling an API as an Actor listening for messages, plus adapting Bertrand Meyer's CQS pattern to those messages. Perhaps unintentionally, REST and GraphQL seem to lead devs to organize read and write concerns together under a single structure (an entity) and then expose that implementation detail to the outside. Despite the semantic advantage of "GET does not make changes", this way of organizing still impedes the other benefits I mention below.

Meyer's Command-Query Separation pattern was originally meant to refer to methods on objects, but it has distinct benefits for operations at the API level. As a dev maintaining the API, I find a lot of value in the division of query and command. It helps me to organize read concerns (such as full text indexes, search lists, reports) more appropriately and separately from the write concerns (domain model). For example, I end up using completely different code paths for the query and command sides. It is especially nice to isolate the command side to just be: "Here is the request. Go consult whatever data you need, then tell me your decision." Then let other parties be concerned with saving that decision into specific formats (e.g. SQL), updating full-text search, and so forth.
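As a rough illustration of that split, here is a minimal Python sketch in which the command side only returns a decision and the query side is a separate code path. Every name in it (`PlaceOrder`, `OrderPlaced`, `Rejected`, `decide`, `search_orders`) is hypothetical and chosen for the example, not taken from the article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlaceOrder:
    """Command: a request made of the system. It may be rejected."""
    order_id: str
    customer_id: str
    total_cents: int


@dataclass(frozen=True)
class OrderPlaced:
    """Decision: what the command side concluded should happen."""
    order_id: str
    customer_id: str
    total_cents: int


@dataclass(frozen=True)
class Rejected:
    reason: str


def decide(command: PlaceOrder, credit_limit_cents: int):
    """Command side: look at the request and the data, return a decision.

    Nothing is saved here. Persisting the decision (SQL, search index,
    reports) is left to other parties.
    """
    if command.total_cents <= 0:
        return Rejected("order total must be positive")
    if command.total_cents > credit_limit_cents:
        return Rejected("order exceeds credit limit")
    return OrderPlaced(command.order_id, command.customer_id, command.total_cents)


def search_orders(read_model: list[dict], customer_id: str) -> list[dict]:
    """Query side: a separate code path over a read-optimized structure."""
    return [row for row in read_model if row["customer_id"] == customer_id]
```

Because `decide` has no side effects, it can be tested with plain values, and the storage format can change without touching it.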

It also offers architectural flexibility: you can split the read and write sides into different services to address drastically different performance/distribution characteristics. For example, you could have a highly available, centrally located write service but use geo-replicated Elasticsearch and data replicas, each with its own local query service. Or you could do the inverse if that fits your problem space: fully distributed writes with centralized read aggregations (for example, distributed sensor networks). By contrast, it seems at odds with an entity-based organization to say that GETs should go to a different service than PUT/POST/DELETE. I have only a passing familiarity with GraphQL, so I cannot offer a direct comparison there... only that it still seems to be about entities. (And Datomic too for that matter.)

Regarding auditability. CQRS sort of assumes messaging (Command and Query messages). The great thing about messages is that they are just data. So they can be logged, filtered, routed, stored for later, etc.; whatever you need. However, there is a danger in expecting the saved Commands to reproduce the current state of the system. There is a strong parallel to saving the SQL statements that you ran, then attempting to regenerate the database with them. Even if you ignore rejected SQL statements, columns and relationships can still change over time, so old statements may break or work differently over the life of the system. This is why SQL databases do not treat statements as the source of truth. Instead, they use the transaction log for that purpose. If you saved the SQL statements as they happen, you may not be able to rebuild the database. But if you had the whole transaction log, you could.
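The replay problem can be reproduced in a few lines with Python's built-in `sqlite3` module. This is only a sketch: the `users` table, its columns, and the `replay` helper are made up for the example, and the "schema change" is simulated by creating the table with a renamed column.

```python
import sqlite3

# A statement that was valid when it was logged.
logged_statements = ["INSERT INTO users (name) VALUES ('alice')"]


def replay(schema: str, statements: list[str]) -> str:
    """Rebuild a fresh in-memory database by re-running logged statements."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(schema)
        for statement in statements:
            conn.execute(statement)
        return "ok"
    except sqlite3.OperationalError as error:
        return f"failed: {error}"
    finally:
        conn.close()


# Against the schema that existed when the statement was logged.
original = replay("CREATE TABLE users (name TEXT)", logged_statements)

# Against today's schema, where the column has since been renamed.
current = replay("CREATE TABLE users (full_name TEXT)", logged_statements)
```

Here `original` comes back as `"ok"`, while `current` reports a failure because the old statement refers to a column that no longer exists.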

So if you want to be able to rebuild the state of the system from the audit log, this is where event sourcing comes into prominence. Commands are the requests that were made of the system, like SQL statements. (Some systems also log these for regression testing, to ensure that the same command generates the same events.) But events are the actual changes made to the system, like the SQL transaction log. Command behavior can evolve over time, but events actually happened, so they do not change and must always be handled properly as code changes. Consider this business conversation: "We decided that after X date, users who sign up will not be considered founders anymore. But we still must provide founder features for those who signed up before X." The sign-up events that happen now are different from before, but the old ones still matter.
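The founder example can be sketched in Python as follows. The names (`SignUp`, `UserSignedUp`, `handle_signup`, `founders`) and the shape of the event are assumptions made for illustration, and the old behavior is represented only by an event that was already recorded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SignUp:
    """Command: a request to sign up."""
    user_id: str


@dataclass(frozen=True)
class UserSignedUp:
    """Event: a sign-up that actually happened, with the outcome recorded."""
    user_id: str
    is_founder: bool


def handle_signup(command: SignUp) -> UserSignedUp:
    """Today's command behavior: new sign-ups are no longer founders."""
    return UserSignedUp(command.user_id, is_founder=False)


def founders(events) -> set:
    """Rebuild state by folding over recorded events, not by re-running commands."""
    result = set()
    for event in events:
        if isinstance(event, UserSignedUp) and event.is_founder:
            result.add(event.user_id)
    return result


# alice signed up before the cutoff; her event was recorded under the old rules.
event_log = [UserSignedUp("alice", is_founder=True)]
# bob signs up today, under the new rules.
event_log.append(handle_signup(SignUp("bob")))

# Re-running the saved commands through today's handler loses alice's status.
replayed = [handle_signup(SignUp("alice")), handle_signup(SignUp("bob"))]
```

Folding over `event_log` still yields alice as a founder, while `replayed` yields no founders at all, which is the difference between keeping events and keeping commands.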

Regarding KSUIDs. That was the first time I heard of them. Thanks for the link; I like to learn about new things.

There would be some challenges for me to use them currently. The tools (languages, libraries, databases) I use have support for UUIDs, but do not yet support KSUIDs. And so far, I don't really have a use case which needs the creation date integrated into the ID. (I have date as a separate field if I need to order by that.) I could see it improving write performance because indexes wouldn't have to be reordered so much on insertion vs random UUIDs. I will keep KSUIDs in mind in case a situation arises that fits. Thanks for mentioning them!
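To show why a time prefix makes IDs sort by creation order, here is a small Python sketch. It is not a real KSUID (those have their own layout and encoding); `time_prefixed_id` is a hypothetical helper that just puts a fixed-width timestamp in front of random bytes.

```python
import os
import time
import uuid


def time_prefixed_id(now=None) -> str:
    """Hypothetical sortable ID: 8 hex digits of Unix seconds + 16 random bytes."""
    seconds = int(time.time() if now is None else now)
    return f"{seconds:08x}{os.urandom(16).hex()}"


# IDs created at increasing times compare in creation order as plain strings,
# so an index on them mostly appends at the end.
ordered_ids = [time_prefixed_id(t) for t in (1_000, 2_000, 3_000)]

# Random UUIDs carry no time information, so new values land anywhere in an index.
random_ids = [str(uuid.uuid4()) for _ in range(3)]
```

IDs generated within the same second share a prefix, so their relative order is random. Whether this helps insert performance in a given database is something to measure rather than assume.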

Don't get me started on frameworks. 🤐

rhymes • Nov 6 '18

Thanks for the really detailed explanation.

I dig the explicit separation; frameworks don't go much past MVC in their recommendations.

What I like about KSUIDs is the fact that they are sortable so, as you say, they should have less impact on insertion. Haven't measured it though.