CQRS: ACID on Read-Side

592 Views Asked by At

I often read that one great advantage of CQRS is having denormalized data on the read-side. E.g. one can store data fields and child objects redundant to avoid joins. But that also means a single event might result in multiple update operations on the read-side, as a state change to an entity must be reflected in multiple read-side places.

Doesn't that imply you need atomic transaction support on the read-side? Many NoSQL databases only support transactions within one collection or one partition.

How do CQRS-based systems deal with that in real life?

  • Do they use SQL databases on the read-side?
  • Do they use non-redundant data only? Then, why having a read-store in the first place?
  • Are there fancy database technologies supporting atomic transactions in the cloud?
2

There are 2 best solutions below

2
On

Doesn't that imply you need atomic transaction support on the read-side?

No, not necessarily but you need to protect against concurrent updates.

Do they use SQL databases on the read-side?

In my latest project I use MongoDB for most of my read-models. This database has no transaction support but I don't need one because I use optimistic locking. I wrote an article here about this, with an example in PHP. The idea is that I detect concurrent updates by using a version (as an unique index) and retry the last one.

Do they use non-redundant data only? Then, why having a read-store in the first place?

I use full data denormalization. That means that I have in one document all the information I need for a particular view.

0
On

CQRS systems tend to be based on the concept of eventual consistency (although this isn't essential - the alternative is to use transactions and an ACID DB, if scalability isn't a major concern). With eventual consistency, after a command has executed, queries to any particular view might see:

  • the pre-command state
  • the post-command state

Eventually, all views will be updated with the post-command state. How long this takes depends on load and scaling, software upgrade deployment approach, etc. Your requirements might need a maximum delay for particular read-sides, which obviously affects the engineering approach you take to software updates, etc.

Eventual consistency has implications for client code. Commonly, each read side 'row' (document/row etc. depending on the kind of DB) will include the latest version numbers of applicable aggregates used to build it. This means a client can poll the read-side query service for the row it expects to be updated until the aggregate version version comes back incremented, then it knows the query result includes changes from the command it sent. (Alternatives to polling include web-sockets - the read side code needs to fire a notification for each update).

Dealing with concurrency is another ACID related concern - options here include:

  • Optimistic concurrency
  • Single thread and process handling updating each read side
  • Partitioning 'rows' on the read side, with each 'row' only updated by a single process + thread
  • Any other way of ensuring only one 'row' is updated at a time