CQRS Aggregate and Projection consistency

An Aggregate can use a View; this is described in Vaughn Vernon's book:

Such Read Model Projections are frequently used to expose information to various clients (such as desktop and Web user interfaces), but they are also quite useful for sharing information between Bounded Contexts and their Aggregates. Consider the scenario where an Invoice Aggregate needs some Customer information (for example, name, billing address, and tax ID) in order to calculate and prepare a proper Invoice. We can capture this information in an easy-to-consume form via CustomerBillingProjection, which will create and maintain an exclusive instance of CustomerBillingView. This Read Model is available to the Invoice Aggregate through the Domain Service named IProvideCustomerBillingInformation. Under the covers this Domain Service just queries the document store for the appropriate instance of the CustomerBillingView.

Let's imagine our application should allow creating many users, each with a unique name. The flow of commands and events:

  • CreateUser{Alice} command sent
  • UserAggregate checks UsersListView; since there is no user named Alice, the aggregate decides to create the user and publish an event.
  • UserCreated{Alice} event published // by UserAggregate
  • UsersListProjection processes UserCreated{Alice} // for simplicity, assume UsersListProjection just accumulates user names as it receives UserCreated events.
  • CreateUser{Bob} command sent
  • UserAggregate checks UsersListView; since there is no user named Bob, the aggregate decides to create the user and publish an event.
  • UserCreated{Bob} event published // by UserAggregate
  • CreateUser{Bob} command sent again
  • UserAggregate checks UsersListView; since the view still shows no user named Bob, the aggregate decides to create the user and publish an event.
  • UserCreated{Bob} event published a second time // by UserAggregate
  • UsersListProjection processes the first UserCreated{Bob}.
  • UsersListProjection processes the second UserCreated{Bob}.

The problem is that UsersListProjection did not have time to process the first UserCreated{Bob} event, so it contained stale data, and the aggregate made its decision based on that stale data. As a result, 2 users with the same name were created.
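To make the race concrete, here is a minimal in-memory sketch of the flow above. The class and function names (`UsersListProjection.publish`, `process_pending`, `handle_create_user`) are hypothetical and only illustrate the check-then-act pattern against a lagging view; they are not taken from any framework.

```python
class UsersListProjection:
    """Hypothetical read model: accumulates user names from UserCreated events."""

    def __init__(self):
        self.names = set()   # what the view currently shows
        self.pending = []    # events published but not yet processed

    def publish(self, event):
        self.pending.append(event)

    def process_pending(self):
        for kind, name in self.pending:
            if kind == "UserCreated":
                self.names.add(name)
        self.pending.clear()


def handle_create_user(name, view, log):
    """Check-then-act against the (possibly stale) view."""
    if name in view.names:
        return False                 # duplicate detected, command rejected
    event = ("UserCreated", name)
    log.append(event)                # the aggregate "creates" the user
    view.publish(event)              # the view will see it only later
    return True


view = UsersListProjection()
log = []

handle_create_user("Alice", view, log)
view.process_pending()                           # view catches up with Alice

first = handle_create_user("Bob", view, log)
second = handle_create_user("Bob", view, log)    # view has not seen Bob yet
view.process_pending()

bob_events = [e for e in log if e == ("UserCreated", "Bob")]
print(first, second, len(bob_events))            # True True 2
```

Both CreateUser{Bob} commands succeed because the uniqueness check and the write are separated by the projection's delay.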

How can such situations be avoided? How can aggregates and projections be kept consistent?

There are 2 best solutions below

how to make aggregates and projections consistent?

In the common case, we don't. Projections are consistent with the aggregate at some time in the past, but do not necessarily have all of the latest updates. That's part of the point: we give up "immediate consistency" in exchange for other (higher leverage) benefits.

The duplication that you refer to is usually solved a different way: by using conditional writes to the book of record.

In your example, we would normally design the system so that the second attempt to write Bob to our data store fails because of a conflict. We also prevent duplicates from propagating by ensuring that the write to the data store happens-before any events are made visible.

What this gives us, in effect, is a "first writer wins" write strategy. The writer that loses the data race has to retry/fail/etc.

(As a rule, this depends on the idea that both attempts to create Bob write that information to the same place, using the same locks.)
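As a rough illustration of a conditional write, the sketch below uses an in-memory SQLite database where a PRIMARY KEY constraint on the name plays the role of the shared lock. The table names (`users`, `outbox`) and the idea of recording the event in an outbox table within the same transaction are assumptions made for this example, not something prescribed above.

```python
import sqlite3


def create_user(conn, name):
    """Try to create the user; the PRIMARY KEY constraint decides the winner."""
    try:
        with conn:  # one transaction: committed on success, rolled back on error
            conn.execute("INSERT INTO users(name) VALUES (?)", (name,))
            # The event becomes visible only if the insert above succeeded.
            conn.execute(
                "INSERT INTO outbox(event_type, payload) VALUES (?, ?)",
                ("UserCreated", name),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # lost the race: caller may retry, fail, etc.


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT PRIMARY KEY)")
conn.execute(
    "CREATE TABLE outbox ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, event_type TEXT, payload TEXT)"
)

results = [
    create_user(conn, "Alice"),
    create_user(conn, "Bob"),
    create_user(conn, "Bob"),   # second writer for Bob loses
]
events = conn.execute(
    "SELECT event_type, payload FROM outbox ORDER BY id"
).fetchall()
print(results)  # [True, True, False]
print(events)   # [('UserCreated', 'Alice'), ('UserCreated', 'Bob')]
```

Only one UserCreated{Bob} event is ever recorded, regardless of how far behind any projection is.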

A common design to reduce the probability of conflict is to NOT have the aggregate consult a "read model" of itself, but to instead read its own data directly from the data store. That doesn't necessarily eliminate all data races, but it narrows the window in which they can occur.

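Finally, we fall back on Memories, Guesses and Apologies (Pat Helland's phrase): accept that a decision may occasionally be made on stale information, detect the problem afterwards, and compensate for it.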

It's important to remember in CQRS that every write model is also a read model for the reads that are required to validate a command. Those reads are:

  • checking for the existence of an aggregate with a particular ID
  • loading the latest version of an entire aggregate

In general a CQRS/ES implementation will provide that read model for you. The particulars of how that's implemented will depend on the implementation.

Those are the only reads a command-handler ever needs to perform. If a query can be answered with no more than those reads, the query can be expressed as a command (e.g. GetUserByName{Alice}) which, when handled, does not emit events. The benefit of such read-only commands is that they can be strongly consistent, because they are limited to a single aggregate. Not all queries can be expressed this way, of course, and if a query can tolerate eventual consistency, it may not be worth paying the coordination tax that strong consistency typically costs when you make it a read-only command. (Command handling limited to a single aggregate is generally strongly consistent, but there are cases where even that consistency is loosened, e.g. when the events form a CRDT and an aggregate can live in multiple datacenters.)
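A read-only command might look roughly like the sketch below. It assumes the user name doubles as the aggregate ID and uses a toy in-memory `EventStore`; both are simplifications for illustration, and the handler name `handle_get_user_by_name` is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class EventStore:
    """Toy event store: one list of events per aggregate (stream) ID."""
    streams: dict = field(default_factory=dict)

    def load(self, stream_id):
        return list(self.streams.get(stream_id, []))

    def append(self, stream_id, event):
        self.streams.setdefault(stream_id, []).append(event)


def handle_get_user_by_name(store, name):
    """Read-only command: rebuild one aggregate and answer; emits no events."""
    events = store.load(f"user-{name}")
    if not events:
        return None                      # no aggregate with this ID
    state = {}
    for event in events:                 # fold the events into current state
        if event["type"] == "UserCreated":
            state = {"name": event["name"]}
    return state


store = EventStore()
store.append("user-Alice", {"type": "UserCreated", "name": "Alice"})

found = handle_get_user_by_name(store, "Alice")
missing = handle_get_user_by_name(store, "Bob")
print(found, missing)  # {'name': 'Alice'} None
```

Because the handler touches a single stream, it sees whatever the write side has persisted, with no dependency on a projection having caught up.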

So with that in mind:

  • CreateUser{Alice} received
  • user Alice does not exist
  • persist UserCreated{Alice}
  • CreateUser{Alice} acknowledged (e.g. HTTP 200, ack to *MQ, Kafka offset commit)
  • UserListProjection updated from UserCreated{Alice}
  • CreateUser{Bob} received
  • user Bob does not exist
  • persist UserCreated{Bob}
  • CreateUser{Bob} acknowledged
  • CreateUser{Bob} received
  • user Bob already exists
  • command-handler for an existing user rejects the command and persists no events (it may log that an attempt to create a duplicate user was made)
  • CreateUser{Bob} ack'd with failure (e.g. HTTP 409, ack to *MQ, Kafka offset commit)
  • UserListProjection updated from UserCreated{Bob}
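The sequence above could be sketched as follows. This is a simplified in-memory model, assuming the user name is the aggregate ID and that the event store supports an expected-version check on append; `EventStore`, `ConcurrencyError`, `handle_create_user` and `catch_up` are hypothetical names, not the API of any particular CQRS/ES library.

```python
import threading


class ConcurrencyError(Exception):
    """Raised when a stream is not at the version the writer expected."""


class EventStore:
    def __init__(self):
        self._streams = {}
        self._lock = threading.Lock()
        self.log = []                      # ordered log that projections consume

    def exists(self, stream_id):
        with self._lock:
            return stream_id in self._streams

    def append(self, stream_id, expected_version, event):
        with self._lock:                   # check and write are one atomic step
            current = len(self._streams.get(stream_id, []))
            if current != expected_version:
                raise ConcurrencyError(stream_id)
            self._streams.setdefault(stream_id, []).append(event)
            self.log.append(event)


def handle_create_user(store, name):
    stream_id = f"user-{name}"
    if store.exists(stream_id):            # read against the write side
        return "rejected"
    try:
        store.append(stream_id, 0, {"type": "UserCreated", "name": name})
    except ConcurrencyError:               # another writer got there first
        return "rejected"
    return "acknowledged"


class UserListProjection:
    def __init__(self):
        self.names = []
        self._position = 0                 # how far into the log we have read

    def catch_up(self, store):
        for event in store.log[self._position:]:
            if event["type"] == "UserCreated":
                self.names.append(event["name"])
        self._position = len(store.log)


store = EventStore()
projection = UserListProjection()

outcomes = [handle_create_user(store, "Alice")]
projection.catch_up(store)                 # projection updated for Alice
outcomes.append(handle_create_user(store, "Bob"))
outcomes.append(handle_create_user(store, "Bob"))   # projection still stale
names_while_stale = list(projection.names)
projection.catch_up(store)                 # projection updated for Bob

print(outcomes)            # ['acknowledged', 'acknowledged', 'rejected']
print(names_while_stale)   # ['Alice']
print(projection.names)    # ['Alice', 'Bob']
```

The duplicate is rejected even while the projection still lists only Alice, since the decision never consults the projection.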

Note that while the UserListProjection can answer the question "does this user exist?", the fact that the write-side can also (and more consistently) answer that question does not in and of itself make that projection superfluous. UserListProjection can also answer questions like "who are all of the users?" or "which users have two consecutive vowels in their name?" which the write-side cannot answer.