Although I've come across Kafka before, I only recently realized that Kafka could perhaps be used as (the basis of) a CQRS event store.
Some of the main points that Kafka supports:
- Event capturing/storing, all HA of course.
- Pub/sub architecture
- Ability to replay the event log, which allows new subscribers to register with the system after the fact.
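To make the replay point concrete, here is a toy in-memory sketch of an append-only log where a late subscriber first receives everything already stored and then new events. The `EventLog` class and the event names are hypothetical and only illustrate the idea; this is not Kafka's client API.

```python
from typing import Callable, List


class EventLog:
    """Toy append-only log; a stand-in for the idea, not Kafka's API."""

    def __init__(self) -> None:
        self._events: List[str] = []
        self._subscribers: List[Callable[[int, str], None]] = []

    def append(self, event: str) -> int:
        # The offset is simply the position of the event in the log.
        offset = len(self._events)
        self._events.append(event)
        for handler in self._subscribers:
            handler(offset, event)
        return offset

    def subscribe(self, handler: Callable[[int, str], None], from_offset: int = 0) -> None:
        # Replay the retained events first, then register for new ones.
        for offset in range(from_offset, len(self._events)):
            handler(offset, self._events[offset])
        self._subscribers.append(handler)


log = EventLog()
log.append("UserRegistered")
log.append("EmailChanged")

# This subscriber registers after two events were already written.
seen = []
log.subscribe(lambda offset, event: seen.append((offset, event)))
log.append("UserDeactivated")
```

After this runs, `seen` holds all three events in order, including the two written before the subscriber existed.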
Admittedly I'm not 100% versed in CQRS / event sourcing, but this seems pretty close to what an event store should be. The funny thing is: I really can't find that much about Kafka being used as an event store, so perhaps I am missing something.
So, is anything missing from Kafka for it to be a good event store? Would it work? Is anyone using it in production? I'm interested in insights, links, etc.
Basically, with event sourcing, the state of the system is derived from all the transactions/events the system has ever received, instead of saving only the current state/snapshot of the system, which is what is usually done. (Think of it as a general ledger in accounting: all transactions ultimately add up to the final state.) This allows all kinds of cool things, but just read up on the links provided.
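The ledger analogy can be sketched in a few lines: the current balance is never stored, it is computed by folding over every recorded event. The event types `Deposited` and `Withdrawn` and the helper names are made up for illustration.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Iterable, Union


@dataclass(frozen=True)
class Deposited:
    amount: int


@dataclass(frozen=True)
class Withdrawn:
    amount: int


LedgerEvent = Union[Deposited, Withdrawn]


def apply_event(balance: int, event: LedgerEvent) -> int:
    # Each event type describes how it changes the state.
    if isinstance(event, Deposited):
        return balance + event.amount
    if isinstance(event, Withdrawn):
        return balance - event.amount
    raise TypeError(f"unknown event: {event!r}")


def current_balance(events: Iterable[LedgerEvent]) -> int:
    # State is the left fold of all events over an empty starting state.
    return reduce(apply_event, events, 0)


ledger = [Deposited(100), Withdrawn(30), Deposited(5)]
balance = current_balance(ledger)
print(balance)  # 75
```

Because the events are kept, the balance at any earlier point can be recomputed by folding over a prefix of the list.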
Kafka is meant to be a messaging system, which has many similarities to an event store; however, to quote their intro:
So while messages can potentially be retained indefinitely, the expectation is that they will be deleted. This doesn't mean you can't use this as an event store, but it may be better to use something else. Take a look at EventStoreDB for an alternative.
UPDATE
Kafka documentation:
UPDATE 2
One concern with using Kafka for event sourcing is the number of required topics. Typically in event sourcing, there is a stream (topic) of events per entity (such as user, product, etc). This way, the current state of an entity can be reconstituted by re-applying all events in the stream. Each Kafka topic consists of one or more partitions, and each partition is stored as a directory on the file system, so a topic per entity means the number of directories grows with the number of entities. There will also be pressure on ZooKeeper as the number of znodes increases.
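As a rough illustration of the stream-per-entity model, the sketch below groups events by entity ID and rebuilds one entity by re-applying its stream. It is plain Python, not Kafka code; the tuple layout, event names, and helper functions are assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical event shape: (entity_id, event_type, payload)
Event = Tuple[str, str, dict]

all_events: List[Event] = [
    ("user-1", "Registered", {"email": "a@example.com"}),
    ("user-2", "Registered", {"email": "b@example.com"}),
    ("user-1", "EmailChanged", {"email": "a2@example.com"}),
]


def streams_by_entity(events: List[Event]) -> Dict[str, List[Event]]:
    # One stream per entity, preserving the original event order.
    streams: Dict[str, List[Event]] = defaultdict(list)
    for event in events:
        streams[event[0]].append(event)
    return dict(streams)


def rebuild_user(stream: List[Event]) -> dict:
    # Re-apply every event in the stream to reconstitute current state.
    state: dict = {}
    for _entity_id, event_type, payload in stream:
        if event_type in ("Registered", "EmailChanged"):
            state["email"] = payload["email"]
    return state


streams = streams_by_entity(all_events)
user_1 = rebuild_user(streams["user-1"])
stream_count = len(streams)  # one stream per distinct entity
```

Here `stream_count` equals the number of distinct entities, which is the quantity that becomes a problem if every stream is mapped to its own topic.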