Jeremy W. Sherman

stay a while, and listen

Clarified CQRS - Reading Notes

On 19 Dec 2013, I read the article Clarified CQRS published by Udi Dahan on 9 Dec 2009, so four years ago.

(This reading was for the BNR Book Club: It’s open to all, and you should join the group!)

In it, Dahan elaborates their interpretation of CQRS.

Dahan’s new ideas:

  • Data inevitably stales: Exploit this instead of fighting it.
  • Each command can, and should, be processed autonomously from the others.
  • DRY to the max by jettisoning code and data store complexity wherever the command–query system allows it, which is far more than you might think at first blush.

Section-by-section notes follow.

Two appendices include notes from additional articles by other authors on CQRS, to provide context to the discussion:

  • Martin Fowler provides a cogent summary and links to references.
  • Greg Young originated CQRS and distills it to its essence in a single example.

Clarified CQRS

Why CQRS

  • Driven by:
    • Collaboration: Mutable state shared by >1 actor
    • Staleness: Data read by an actor can be invalidated by a subsequent write by another actor.
      • Exacerbated by caching.
      • Leads users inevitably to act based on obsolete data.

Queries

  • Data is going to be stale. Give in and skip DB hits.
  • Cache data in view-model format to avoid unnecessary marshaling.
    • No need for the cache to be a RDBMS - ViewModels don’t require any joins, they’re already denormalized.
  • Views hit the cache rather than the DB for their display data.
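
A minimal sketch of that query side, in Python. The names `CustomerSummaryView` and `ViewModelCache`, and all of the fields, are my own illustrations rather than anything from Dahan's article. The in-memory dict stands in for whatever non-relational cache you would actually use.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class CustomerSummaryView:
    """Denormalized row, shaped exactly as the screen displays it."""
    customer_id: str
    display_name: str
    open_order_count: int
    as_of: str  # shown to the user so the staleness is explicit


class ViewModelCache:
    """Hypothetical stand-in for any key-value store; no joins needed."""

    def __init__(self) -> None:
        self._rows: Dict[str, CustomerSummaryView] = {}

    def put(self, view: CustomerSummaryView) -> None:
        self._rows[view.customer_id] = view

    def get(self, customer_id: str) -> Optional[CustomerSummaryView]:
        # A lookup by key is the only query the view ever issues.
        return self._rows.get(customer_id)


cache = ViewModelCache()
cache.put(CustomerSummaryView("c-1", "Ada Lovelace", 3, "2013-12-19T10:00:00Z"))
view = cache.get("c-1")
```

The view reads `view` as-is; nothing is mapped or joined on the way to the screen.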

Scaling

  • Add multiple caches.
  • Don’t worry about keeping them in sync across each other. Users just encounter different vintages of stale data in different caches.

Data Modification

  • Optimistic concurrency conflicts.
  • Validation is a pain.
  • End up rejecting a whole chunk of modifications because 1 is off, then users must redo their work based on the new data.
  • More users, bigger entities => more frequent and annoying conflicts.
  • Solved by commands:

    If only there was some way for our users to provide us with the right level of granularity and intent when modifying data. That’s what commands are all about.

Commands

  • “Using an Excel-like UI for data changes doesn’t capture intent, as we saw above.”
    • Submit commands instead of a “write these fields” request. (Task-based UI)
    • Capture intention - can even process asynchronously, report progress and failures in UI, let user investigate why failed.
  • “Note that the client sends commands to the server – it doesn’t publish them. Publishing is reserved for events which state a fact – that something has happened, and that the publisher has no concern about what receivers of that event do with it.”
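
The send-versus-publish distinction can be sketched roughly as follows. The `Bus` class and both message types are hypothetical names of mine, and a real system would put a message queue where this in-process dispatch sits. The point is only that a command goes to exactly one handler, while an event fans out to whoever subscribed.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class MakeCustomerPreferred:
    """Command: imperative name, states what the user intends."""
    customer_id: str


@dataclass(frozen=True)
class CustomerMadePreferred:
    """Event: past-tense name, states a fact that already happened."""
    customer_id: str


class Bus:
    """Hypothetical in-process bus, for illustration only."""

    def __init__(self) -> None:
        self._handlers: Dict[type, Callable] = {}
        self._subscribers: Dict[type, List[Callable]] = {}

    def register_handler(self, command_type: type, handler: Callable) -> None:
        if command_type in self._handlers:
            raise ValueError("a command has exactly one handler")
        self._handlers[command_type] = handler

    def subscribe(self, event_type: type, subscriber: Callable) -> None:
        self._subscribers.setdefault(event_type, []).append(subscriber)

    def send(self, command: object) -> None:
        # Sent to the single owner of this command type.
        self._handlers[type(command)](command)

    def publish(self, event: object) -> None:
        # Fanned out to every subscriber; the publisher does not care who.
        for subscriber in self._subscribers.get(type(event), []):
            subscriber(event)


bus = Bus()
log: List[str] = []

bus.register_handler(
    MakeCustomerPreferred,
    lambda cmd: bus.publish(CustomerMadePreferred(cmd.customer_id)),
)
bus.subscribe(CustomerMadePreferred, lambda e: log.append(f"billing saw {e.customer_id}"))
bus.subscribe(CustomerMadePreferred, lambda e: log.append(f"email saw {e.customer_id}"))

bus.send(MakeCustomerPreferred("c-1"))
```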

Commands & Validation

Validation is different from business rules in that it states a context-independent fact about a command. Either a command is valid, or it isn’t. Business rules on the other hand are context dependent.

Returns to the example of a delinquency update arriving before a preferred status application, causing the latter to be rejected; reverse the order, and we would have accepted both changes.

This is basically just pointing out that “a valid command is one that has all necessary, valid data” rather than “a valid command is one that won’t be rejected”.
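
A small sketch of the difference, using the delinquent/preferred example. The function names and the dict-shaped customer state are assumptions of mine, not Dahan's code: `validate` looks only at the command, while the business rule needs the state of the world at processing time.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class MakeCustomerPreferred:
    customer_id: str


def validate(command: MakeCustomerPreferred) -> List[str]:
    """Context-independent: inspects only the command itself."""
    errors = []
    if not command.customer_id.strip():
        errors.append("customer_id is required")
    return errors


def may_become_preferred(command: MakeCustomerPreferred,
                         customers: Dict[str, dict]) -> bool:
    """Business rule: depends on the state of the world when processed."""
    customer = customers.get(command.customer_id, {})
    return not customer.get("delinquent", False)


command = MakeCustomerPreferred("c-1")
is_valid = validate(command) == []

# Same valid command, two different contexts, two different outcomes.
accepted_before = may_become_preferred(command, {"c-1": {"delinquent": False}})
accepted_after = may_become_preferred(command, {"c-1": {"delinquent": True}})
```

The command is valid in both runs; only the second one gets rejected, and only because the delinquency update landed first.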

Rethinking UIs and commands

Can use query store to speed up updates - autocomplete from query store, update sends the ID we already have for the selected value rather than text. Again, less marshaling.

Reasons valid commands fail

The delinquent vs. preferred race is just bad design. Should have same business outcome regardless of which arrives first.

Outcome: Notify the user (email).

No rejection errors are ever returned to the agent submitting changes. They can do nothing but notify the user, anyway.

No need even to show pending commands: Instead, notify users as needed asynchronously out of band.

Commands and Autonomy

  • Command processor should be autonomous.
  • Queue commands for processing, process them at leisure, roll back and retry as needed (for example, when the DB is down).
  • Serving commands and queries from separate stores prevents cache thrashing.
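
The queue-and-retry idea might look something like this. `TransientFailure`, `process_queue`, and the retry limit are all hypothetical, and the sketch leaves out the transaction and rollback a real processor would need around each attempt.

```python
from collections import deque
from typing import Callable, Deque, Dict, List, Tuple


class TransientFailure(Exception):
    """Hypothetical marker for retryable errors, such as the DB being down."""


def process_queue(commands: List[str],
                  handle: Callable[[str], None],
                  max_attempts: int = 3) -> List[str]:
    """Process queued commands; return the ones that never succeeded."""
    pending: Deque[Tuple[str, int]] = deque((c, 1) for c in commands)
    dead_letters: List[str] = []
    while pending:
        command, attempt = pending.popleft()
        try:
            handle(command)
        except TransientFailure:
            if attempt < max_attempts:
                pending.append((command, attempt + 1))  # requeue at the back
            else:
                dead_letters.append(command)  # give up; notify out of band
    return dead_letters


processed: List[str] = []
failures_left: Dict[str, int] = {"b": 1, "c": 5}  # "c" fails more often than we retry


def flaky_handle(command: str) -> None:
    if failures_left.get(command, 0) > 0:
        failures_left[command] -= 1
        raise TransientFailure(command)
    processed.append(command)


dead = process_queue(["a", "b", "c"], flaky_handle)
```

Nothing here talks to the client that submitted the command, which is what lets the processor run on its own schedule.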

Autonomous Components

Acronym: AC = Autonomous Component

Command processor is an AC with its own queue.

Can go even further than that: Can have each command processed by its own AC.

This lets you get detailed queue and processing time metrics, and can scale up ACs on a per-command basis.

Service Layers

Per-command AC means each processor is independent. This is a stark contrast to the rat’s nest at each layer of many layered architectures.

Domain Model

Domain model is no longer used to service queries.

Not really necessary for commands either.

Scarcely need relationships – just precompute (denormalize) for queries, and have commands sent with needed IDs.

Persistence for Command Processing

No need for fancy DB queries.

Commands come in with IDs anyway.

So ORM not strictly necessary; can do key-value, optionally splitting out properties that benefit from uniqueness constraint into their own columns.

Key point here: “How you process the commands is an implementation detail of CQRS.”
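
One way to picture the key-value approach with a uniqueness column split out, using SQLite from the Python standard library. The table layout, the choice of `email` as the unique property, and the helper names are my assumptions; the article does not prescribe a schema or a store.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE customers (
        id    TEXT PRIMARY KEY,      -- commands arrive carrying this ID
        email TEXT NOT NULL UNIQUE,  -- split out only for the constraint
        body  TEXT NOT NULL          -- everything else, as an opaque blob
    )
""")


def insert_customer(customer_id: str, email: str, state: dict) -> None:
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO customers (id, email, body) VALUES (?, ?, ?)",
            (customer_id, email, json.dumps(state)),
        )


def load_customer(customer_id: str):
    row = conn.execute(
        "SELECT body FROM customers WHERE id = ?", (customer_id,)
    ).fetchone()
    return json.loads(row[0]) if row else None


insert_customer("c-1", "ada@example.com", {"name": "Ada", "preferred": False})

try:
    insert_customer("c-2", "ada@example.com", {"name": "Impostor"})
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

loaded = load_customer("c-1")
```

Every access is a lookup by ID, so no ORM or query language beyond this is involved.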

Keeping the Query Store in Sync

  • Apply command and broadcast event in transaction.
  • Per-command events - DoBlah broadcasts DidBlah on success.
  • AC does Event -> Query Store (cache) updates.
    • Can readily do one AC per ViewModel (aka table).
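
A rough sketch of that flow, with in-memory dicts standing in for both stores. `RenameCustomer`, `CustomerRenamed`, and the projector function are names I made up for illustration, and the transaction around "apply and broadcast" is only noted in a comment, not implemented.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class RenameCustomer:      # command ("DoBlah")
    customer_id: str
    new_name: str


@dataclass(frozen=True)
class CustomerRenamed:     # event ("DidBlah")
    customer_id: str
    new_name: str


command_store: Dict[str, dict] = {"c-1": {"name": "Ada"}}
customer_list_view: Dict[str, dict] = {"c-1": {"display_name": "Ada"}}
subscribers: List[Callable[[CustomerRenamed], None]] = []


def handle_rename(command: RenameCustomer) -> None:
    # A real system would wrap the write and the publish in one transaction.
    command_store[command.customer_id]["name"] = command.new_name
    event = CustomerRenamed(command.customer_id, command.new_name)
    for subscriber in subscribers:
        subscriber(event)


def update_customer_list_view(event: CustomerRenamed) -> None:
    # One projector per view model; it only ever reacts to events.
    customer_list_view[event.customer_id]["display_name"] = event.new_name


subscribers.append(update_customer_list_view)
handle_rename(RenameCustomer("c-1", "Ada Lovelace"))
```

The command handler never touches `customer_list_view` directly; the query store changes only because an event arrived.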

Bounded Contexts

“CQRS if used is employed within a bounded context (DDD) or a business component (SOA) – a cohesive piece of the problem domain. The events published by one BC are subscribed to by other BCs, each updating their query and command data stores as needed.”

Mash-up into a single UI as needed.

Summary

CQRS is about coming up with an appropriate architecture for multi-user collaborative applications. It explicitly takes into account factors like data staleness and volatility and exploits those characteristics for creating simpler and more scalable constructs.

One cannot truly enjoy the benefits of CQRS without considering the user-interface, making it capture user intent explicitly. When taking into account client-side validation, command structures may be somewhat adjusted. Thinking through the order in which commands and events are processed can lead to notification patterns which make returning errors unnecessary.

Appendix A: Martin Fowler on CQRS

Never expanded anywhere in the article is the acronym “CQRS”:

CQRS stands for Command Query Responsibility Segregation. It’s a pattern that I first heard described by Greg Young. At its heart is a simple notion that you can use a different model to update information than the model you use to read information. This simple notion leads to some profound consequences for the design of information systems.

[…]

The change that CQRS introduces is to split that conceptual model [integrating various views of the underlying data] into separate models for update and display, which it refers to as Command and Query respectively following the vocabulary of CommandQuerySeparation. The rationale is that for many problems, particularly in more complicated domains, having the same conceptual model for commands and queries leads to a more complex model that does neither well. (Martin Fowler)

Fowler introduces more terms, like Reporting Database and Eager Read Derivation, which can be used independently of CQRS but feature in it as well.

Points out that where CRUD fits, you should likely use it. CQRS should also be deployed on a per-“bounded context” basis - it’s effectively a domain modeling decision.

It is not clear that commands and queries are often separate enough to justify two entirely separate models.

CQRS is nice for high-load apps – you can scale reads and writes independently. But you can still handle this in CRUD by splitting out the really high-volume reads into a ReportingDatabase used to serve just those queries.

Appendix B: Greg Young on CQRS

Greg Young originated CQRS per Fowler.

Fowler links to this summary by Greg Young:

  • Split CustomerService into CustomerReadService and CustomerWriteService. Boom: CQRS.
  • “[No biggie, eh? But!] This separation however enables us to do many interesting things architecturally, the largest is that it forces a break of the mental retardation that because the two use the same data they should also use the same data model.”
  • “There is however one thing that does really require a task based UI… That is Domain Driven Design.”
  • “The Application Service Layer in Domain Driven Design represents the tasks the system can perform. It does not just copy data to domain objects and save them… It should be dealing with behaviors on the objects”
    • Don’t use DDD for areas where CRUD really is the “ubiquitous language”.
  • Conclusion:

    Going through all of these we can see that CQRS itself is actually a fairly trivial pattern. What is interesting around CQRS is not CQRS itself but the architectural properties in the integration of the two services. In other words the interesting stuff is not really the CQRS pattern itself but in the architectural decisions that can be made around it. Don’t get me wrong there are a lot of interesting decisions that can be made around a system that has had CQRS applied … just don’t confuse all of those architectural decisions with CQRS itself.
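
The service split from the first bullet above can be sketched as follows. Only the two class names come from Young's summary as quoted here; the method names and the dict-backed store are illustrative assumptions. Write methods change state and return nothing, read methods return data and change nothing.

```python
from typing import Dict, List


class CustomerWriteService:
    """Commands: change state, return nothing."""

    def __init__(self, customers: Dict[str, dict]) -> None:
        self._customers = customers

    def create_customer(self, customer_id: str, name: str) -> None:
        self._customers[customer_id] = {"name": name, "preferred": False}

    def make_customer_preferred(self, customer_id: str) -> None:
        self._customers[customer_id]["preferred"] = True


class CustomerReadService:
    """Queries: return data, change nothing."""

    def __init__(self, customers: Dict[str, dict]) -> None:
        self._customers = customers

    def get_customer(self, customer_id: str) -> dict:
        return dict(self._customers[customer_id])  # copy, so callers cannot mutate

    def get_preferred_customers(self) -> List[str]:
        return sorted(cid for cid, c in self._customers.items() if c["preferred"])


# Both services share one store here; the split is what later allows
# each side to get its own data model.
store: Dict[str, dict] = {}
writes = CustomerWriteService(store)
reads = CustomerReadService(store)

writes.create_customer("c-1", "Ada")
writes.create_customer("c-2", "Grace")
writes.make_customer_preferred("c-2")
preferred = reads.get_preferred_customers()
```

Everything else in the main notes (separate stores, events, per-command processors) is an architectural choice layered on top of this split.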