DEV Community

A Deconstruction of CQRS

Kasey Speakman on September 17, 2018

The cover image is copyright Fabian Oefner from the Disintegrating II series. This is one of my favorite cars, the Ford GT40. In earlier days of CQ...
 
Pim • Edited

"Commands are the gatekeepers of change. Queries are the library of knowledge." Just a beautiful summation of CQRS. Would've loved more code samples, but otherwise amazing article.

Kasey Speakman • Edited

Thanks!

Regarding code samples, is there anything you were specifically interested in seeing? I may post a follow-up if there is interest. It is my intention to one day publish a template for our style of API (which also includes event sourcing). I hope there is nothing especially ground-breaking in there. Maybe just organization patterns that people had not considered. (It may just be the way my brain works, but I find organization to be the hardest part of every solution.)

Pim

No sweat! I think these situations through in code, so examples always help me digest. Even if I already understand the situation, like in this case, I still enjoy seeing someone else's perspective. Plus when I saw a notif for a post by you, I was like "dayummmmmm time to see some of that sweet Speakman F#!"

Rafal Pienkowski

Great article. It was a pleasure to read it. I like the translation of the Entity to the DDD context.

One question about the Pre-generated ID adventure paragraph. You wrote:

"An ID was generated (or requested from the server) when the form was loaded, before the user even started typing anything."

So my question is: why not generate this ID on the server side after the form is submitted? You've chosen a different strategy, and that is fine by me because it solved your problem :). I'm just curious whether you considered generating the ID just before sending a command to the external server. Or was it only a thin JavaScript client, where doing that could be overengineering?

Kasey Speakman • Edited

Thanks for the comment. :)

Regarding ID generation, I might not be understanding your questions entirely, but I will try to answer.

We always generate UUIDs on the server currently. I have trust issues with some browser implementations of random in JS. However, for other kinds of apps I would be comfortable with generating UUIDs on the client.

We generate the ID when the form is loaded. For web clients, we normally request data from the server for creation forms anyway -- to populate drop-down lists for example. So the query response from the server will include a newly generated UUID for convenience.

The client knowing the ID before the form is submitted is valuable for a couple of reasons. Retries without duplicates were mentioned in the article. But also when the operation succeeds, the client already has the ID to be able to provide access links to the user (or query a hypermedia endpoint with the ID, if that's your thing). This frees the client from having to interpret the success of the command -- it's basically just a "yes" or "no, and here's why". Generating on the server after submission does not have these benefits.

We have done ID generation (well, fetching an ID from server) just before submitting the creation command. That was our first try at it. We didn't like it because it was a little more awkward. For example, handling a retry is slightly different from a first try. Also if there was a network congestion problem near the time of submission, it is likely to affect fetching an ID as well, so that is an extra error step to handle. These problems would be easily tamed if we were generating the ID locally. But at that point, the difference between generating the ID when the form loads vs just before submission is negligible (with UUIDs anyway). For me it makes more sense to do it up front so the form submission process has fewer decision branches.
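The retry benefit described above can be sketched roughly as follows. This is a minimal Python illustration, not the API discussed in the article: the in-memory `items` store and the names `load_create_form` and `handle_create` are hypothetical, and the command handler answers only "yes" or "no, and here's why".

```python
import uuid

# Hypothetical in-memory store standing in for the real database.
items = {}


def load_create_form():
    """Query side: return form data plus a freshly generated ID."""
    return {"new_id": str(uuid.uuid4()), "categories": ["a", "b"]}


def handle_create(command):
    """Command side: return (ok, reason) and nothing else."""
    if not command.get("name"):
        return (False, "name is required")
    existing = items.get(command["id"])
    if existing is not None:
        # Same ID, same content: a retry. Succeed without a duplicate.
        if existing == command:
            return (True, None)
        return (False, "id already used by a different item")
    items[command["id"]] = dict(command)
    return (True, None)


form = load_create_form()
command = {"id": form["new_id"], "name": "Widget"}
first = handle_create(command)
retry = handle_create(command)  # e.g. the first response was lost
```

Because the client already holds `form["new_id"]`, it can link to the new item as soon as it sees a "yes", and a retry follows exactly the same path as the first attempt.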

Let me know if you have any more questions or I missed something.

Edit: I meant to add that we also use secondary IDs. These are human-friendly (and in our particular app, usually human-entered) IDs. Items can be searched by secondary IDs.

Rafal Pienkowski

Thanks a lot for the explanation. You answered my question 100%. I'll do my best to ask a clearer question next time :)

I've heard that in DDD there could be a dedicated service which is responsible for ID creation.

Thanks a lot for your reply.

Kasey Speakman • Edited

No worries. It seems I did understand your questions after all. :)

There are many strategies for ID generation, including ones where an ID server is employed. With UUIDs, there is no extra infrastructure needed. However, (although it is extremely unlikely) it is still possible to have UUID collisions. For me the trade-off is well worth making, but it depends on the situation.
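To put a rough number on "extremely unlikely", here is a small sketch using the standard birthday-bound approximation. It assumes random (version 4) UUIDs, which carry 122 random bits; the function name is made up for this example.

```python
import math


def collision_probability(n, random_bits=122):
    """Birthday-bound approximation: 1 - exp(-n(n-1) / (2 * 2**bits))."""
    space = 2 ** random_bits
    # expm1 keeps precision when the probability is tiny.
    return -math.expm1(-n * (n - 1) / (2 * space))


# Chance of at least one collision among a billion random UUIDs.
p_billion = collision_probability(10**9)
```

Under these assumptions the probability for a billion IDs comes out below 1e-18, which is why the trade-off is usually acceptable when the random source is trustworthy.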

Rafal Pienkowski

Yes, you did :)

Samuel Francisco

"It is okay for command handling code to call queries."

I think this assertion is not always true. When there is eventual consistency between the command side and the query side, for some use cases it is not acceptable for the command side to use stale information to make decisions.

Kasey Speakman

If you had a strong consistency requirement for some bit of information, then you probably would not make the view on the information eventually consistent anyway. So it still works out fine to call queries from the command side.

In first approaching CQRS/ES, I thought in the same way. I devised plans to keep some fully-consistent state on the command side separate from eventually consistent views. However, after having it in production a bit, it turned out that I was concerning myself (and wasting time) for no reason. Because life and business is eventually consistent (everyone is not immediately informed of everything that happens). So in particularly important areas of the business, they will already have procedures to deal with this reality. If they don't then it probably doesn't really matter. You shoot yourself in the foot asking the customer deeply technical questions like "Full consistency?" when they don't understand the trade-offs. (They are always going to say yes to this question, even if they don't need it.) In any case, modeling the software after the way the business actually works finds the right path.

Samuel Francisco

"If you had a strong consistency requirement for some bit of information, then you probably would not make the view on the information eventually consistent anyway. So it still works out fine to call queries from the command side."

I don't think this statement covers all possibilities. I don't think it is possible to affirm that whenever a system decision requires "strong consistency", every display of the same data requires it too. I think this is a point of attention when deciding to use the query side from the command side: understand whether the information is "eventually consistent" and whether that is acceptable for the use case.

"Because life and business is eventually consistent... In any case, modeling the software after the way the business actually works finds the right path."

I think that's a weak argument. Systems are not only made to mimic real-life and business processes, but to improve them.

Anyway, I am only saying that in some cases the statement "It is okay for command handling code to call queries." is not true. Sometimes a "special view" without eventual consistency must be constructed before the command handler can call the queries.

Kasey Speakman • Edited

Thanks for your comments.

I think there has been a misunderstanding. The phrase "is okay" does not mean "always and only ever use this". Neither is the word "probably" a universal absolute. The text underneath the "is okay" explains that people intuitively view this as a taboo, but it is permissible. I think the reason for the taboo comes from the combination of other patterns such as DDD, ES, and of course EC. But even there it is still permissible to query certain kinds of information from the command side.

Your objection primarily seems to be the specific case of Querying eventually consistent data from the Command side. And in fact, this is still a perfectly valid way (among other possibilities) to get information. I will give you our use cases for example. Configuration data (to control a process) and set validation (unique keys) are our only queries from the command side (each use case also reads from an event stream). For our case, it makes no difference if the configuration data is eventually consistent. When the user makes a configuration change, they are expecting that some processes were executing before they made the change, and some will be executed after. They probably won't notice or care if milliseconds of eventual consistency allowed someone to start a process with the old configuration. Set validation (unique key constraint) is ideally fully consistent, but for us the risk of violation is low enough (they are user-entered with front-end checks and API-side checks) that we are okay with spending admin time to fix it if it ever happens. In our situation it is fine to use these queries with EC. Use your best judgement for your situation.
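The two command-side queries described above (configuration data and set validation) might look something like this sketch. The read models, field names, and the `handle_start_batch` command are all hypothetical; the point is only that the handler decides using whatever the query side currently reports, even if that is slightly stale.

```python
# Hypothetical read models, possibly a few milliseconds behind the writes.
config_view = {"max_batch_size": 10}
used_keys_view = {"INV-001"}


def query_config():
    """Query: current process configuration (eventually consistent)."""
    return dict(config_view)


def query_key_taken(key):
    """Query: set validation against the known unique keys."""
    return key in used_keys_view


def handle_start_batch(command):
    """Command side: consult queries, then return a decision."""
    config = query_config()
    if command["size"] > config["max_batch_size"]:
        return ("rejected", "batch too large")
    if query_key_taken(command["key"]):
        return ("rejected", "key already in use")
    event = {"type": "BatchStarted", "key": command["key"], "size": command["size"]}
    return ("accepted", event)
```

If a duplicate key slips through during the consistency window, this sketch does not catch it; as in the comment above, that case is left to an out-of-band fix.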

Here is Greg Young's 8-year-old post on the subject.

rhymes

Hi Kasey, sorry for being late, this article was bookmarked ages ago and I got around to it only now :-)

Great explanation. I have a few questions...

  • isn't CQRS also a generalization of REST's concepts around separation between read and write methods? GET queries the system, the other ones alter it. Same goes for GraphQL's separation between queries and mutations.

  • you didn't talk about it in this intro but the generalization made by CQRS seems a boon for tracing, logging and repeatability. If queries and commands are correctly separated in theory you could save the state of the system in a "log file"

  • have you considered using KSUIDs instead of UUIDs?

I like the idea of the client knowing the ID of the object in advance, unfortunately many frameworks people use default to primary IDs or database created UUIDs. This is easily solvable though

Kasey Speakman • Edited

Hey rhymes! No worries about being late. Hopefully the ideas I have written here will be relevant for more than just a couple of months. So I won't consider comments late until an obviously better idea comes along. :)

I can definitely see the parallel you mention with REST and GraphQL. It has occurred to me as well. However, I have trouble putting these under the umbrella of CQRS. I think the main difficulty is that those patterns seem to be very entity-focused. In my mind CQRS is more like modeling an API as an Actor listening for messages, plus adapting the CQS pattern from Bertrand Meyer to those messages. Perhaps unintentionally, REST and GraphQL seem to intuitively lead devs to organize read and write concerns together under a single structure (an entity) and then expose that implementation detail to the outside. Despite the semantic advantage from "GET does not make changes", this organization practice still impedes other benefits I will mention below.

Meyer's Command-Query Separation pattern was originally meant to refer to methods on objects, but it has distinct benefits for operations at the API level. As a dev maintaining the API, I find a lot of value in the division of query and command. It helps me to organize read concerns (such as full text indexes, search lists, reports) more appropriately and separately from the write concerns (domain model). For example, I end up using completely different code paths for the query and command sides. It is especially nice to isolate the command side to just be: "Here is the request. Go consult whatever data you need, then tell me your decision." Then let other parties be concerned with saving that decision into specific formats (e.g. SQL), updating full-text search, and so forth.
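As a rough illustration of separate code paths, the sketch below keeps the command side to a pure `decide` function and leaves storage and search to other code. All names here (`RenameItem`, `ItemRenamed`, `project`, `find_by_word`) are invented for the example, and the dictionaries merely stand in for a SQL table and a full-text index.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RenameItem:
    """Command: a request to change something."""
    item_id: str
    new_name: str


@dataclass(frozen=True)
class ItemRenamed:
    """Decision: what the command side concluded."""
    item_id: str
    new_name: str


def decide(current_name, command):
    """Command side: consult data, return a decision, save nothing."""
    if not command.new_name.strip():
        raise ValueError("name must not be blank")
    if command.new_name == current_name:
        return []  # nothing to change
    return [ItemRenamed(command.item_id, command.new_name)]


# Other parties turn decisions into whatever the read side needs.
names_by_id = {"42": "Old name"}  # stands in for a SQL table
search_index = {}                 # stands in for full-text search


def project(event):
    """Save the decision into the read-side formats."""
    names_by_id[event.item_id] = event.new_name
    for word in event.new_name.lower().split():
        search_index.setdefault(word, set()).add(event.item_id)


def find_by_word(word):
    """Query side: reads only, on its own code path."""
    return sorted(search_index.get(word.lower(), set()))


for event in decide(names_by_id["42"], RenameItem("42", "Ford GT40")):
    project(event)
```

Note that `decide` never touches `names_by_id` or `search_index` directly, so either read format could be replaced without changing the decision logic.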

It also offers architectural flexibility -- you can choose to split read and write sides up into different services to address drastically different performance/distribution characteristics. For example, you could have a highly available, centrally-located write service but use geo-replicated Elasticsearch and data replicas, each with their own local query service. Or you could do the inverse if that fit your problem space: fully distributed writes with centralized read aggregations. (For example, distributed sensor networks.) Whereas it seems at odds with an entity-based organization to say that GETs should go to a different service than PUT/POST/DELETE. I have only a passing familiarity with GraphQL, so I cannot offer a direct comparison there... only that it still seems to be about entities. (And Datomic too for that matter.)

Regarding auditability. CQRS sort of assumes messaging (Command and Query messages). The great thing about messages is that they are just data. So they can be logged, filtered, routed, stored for later, etc.; whatever you need. However, there is a danger in expecting the saved Commands to reproduce the current state of the system. There is a strong parallel to saving the SQL statements that you ran, then attempting to regenerate the database with them. Assuming you ignore rejected SQL statements, columns and relationships can still change over time, so old statements may break or work differently over the life of the system. This is why SQL databases do not store statements as the source of truth. Instead, they use the transaction log for that purpose. If you saved the SQL statements as they happen, you may not be able to rebuild the database. But if you had the whole transaction log, you could.

So if you want to be able to rebuild the state of the system from the audit log, this is where event sourcing comes into prominence. Commands are the requests that were made of the system, like SQL statements. (Some systems also log these for regression testing; to ensure that the same command generates the same events.) But events are the actual changes made to the system, like the SQL transaction log. Command behavior can evolve over time, but events actually happened, so they do not change and must always be handled properly as code changes. Consider this business conversation: "We decided that after X date, users who sign up will not be considered founders anymore. But we still must provide founder features for those who signed up before X." The signup events that happen now are different from before, but the old ones still matter.
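The founder example can be sketched as a fold over recorded events. This is only an illustration under assumed event shapes: the `SignedUp` dictionaries, the `founder` field, and both function names are hypothetical. Old events were recorded before the rule changed and have no `founder` field, so the replay code treats a missing field as "founder".

```python
def handle_signup(user):
    """Current command behavior: new signups are no longer founders."""
    return {"type": "SignedUp", "user": user, "founder": False}


def rebuild_founders(events):
    """Rebuild state from events; the commands are never re-run."""
    founders = set()
    for event in events:
        # Events recorded before date X lack the field and count as founders.
        if event["type"] == "SignedUp" and event.get("founder", True):
            founders.add(event["user"])
    return founders


log = [
    {"type": "SignedUp", "user": "ann"},  # recorded before X, old shape
    handle_signup("bob"),                 # recorded after X
]
founders = rebuild_founders(log)
```

Replaying the original signup command for "ann" through today's `handle_signup` would wrongly strip her founder status, which is the danger of treating saved commands as the source of truth.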

Regarding KSUIDs. That was the first time I heard of them. Thanks for the link; I like to learn about new things.

There would be some challenges for me to use them currently. The tools (languages, libraries, databases) I use have support for UUIDs, but do not yet support KSUIDs. And so far, I don't really have a use case which needs the creation date integrated into the ID. (I have date as a separate field if I need to order by that.) I could see it improving write performance because indexes wouldn't have to be reordered so much on insertion vs random UUIDs. I will keep KSUIDs in mind in case a situation arises that fits. Thanks for mentioning them!
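A rough sketch of why time-prefixed IDs sort by creation time follows. This is deliberately not the real KSUID encoding; it is a simplified stand-in (a 4-byte big-endian timestamp followed by 16 random bytes, written as hex), and `time_prefixed_id` is a made-up name.

```python
import os
import uuid


def time_prefixed_id(unix_seconds):
    """Simplified stand-in, NOT the KSUID format: timestamp + random bytes as hex."""
    return unix_seconds.to_bytes(4, "big").hex() + os.urandom(16).hex()


# Three IDs created one second apart.
timestamps = [1537142400, 1537142401, 1537142402]
ordered_ids = [time_prefixed_id(t) for t in timestamps]

# Random UUIDs created in the same order have no such ordering guarantee.
random_ids = [str(uuid.uuid4()) for _ in timestamps]
```

Sorting `ordered_ids` as strings preserves creation order, so new index entries would land near the end rather than at random positions. Whether that yields a measurable write improvement depends on the database and has not been measured here.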

Don't get me started on frameworks. 🤐

rhymes

Thanks for the really detailed explanation.

I dig the explicit separation, frameworks don't go much past the MVC in their recommendations

What I like about KSUIDs is the fact they are sortable so as you say they should have less impact on insertion. Haven't measured it though

James Eastham

"Commands are the gatekeepers of change. Queries are the library of knowledge."

Absolutely love this comment Kasey! And a fantastic all-round article.