Kafka Pub/Sub: How to raise an event once a group of events has been raised


Imagine a simple event-driven system consisting of:

Microservice A, which processes received documents:

  • Receives a multi-page document
  • Processes each page of the document
  • Produces a "Page Processed" event for each page

Microservice B, which runs additional processing on pages:

  • Consumes "Page Processed" events
  • Runs some additional processing on the page if required
  • Produces "Page Processed Some More" events

Microservice C needs to run additional processing after all events related to a document have been successfully produced, ideally by consuming some kind of "Document Processed" event.

The problem is that neither Microservice A nor Microservice B knows when all pages have been fully processed, and in the future more microservices could be created to do additional processing on the pages.

I haven't tried anything in practice yet, as I am still looking at the design of the system. I have thought about a few possible solutions, but I dislike all of them, so I'm looking for a better way, or at least for the recommended approach.

  1. Microservice A could produce another event at the start detailing the number of pages the document has. Microservice C could subscribe to that event and then consume all the "Page Processed" and "Page Processed Some More" events it expects, essentially counting them up.
  2. Microservice D (a new service) could do the event counting as described above and produce the "Document Processed" event.
  3. Process the pages serially, so that each page triggers processing of the next page, and raise a "Document Processed" event at the end when no pages are left.
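To make the counting in solutions 1 and 2 concrete, here is a minimal in-memory sketch of the bookkeeping, independent of any Kafka client. The names `CompletionTracker`, `on_document_received` and `on_page_event`, and the event type strings, are hypothetical and only for illustration. It also assumes that every page eventually produces every required event type, which may not hold if Microservice B only emits its event "if required".

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DocumentProgress:
    # Unknown until the page-count event from Microservice A arrives.
    expected_pages: Optional[int] = None
    # Maps page number -> set of event types seen for that page.
    seen: dict = field(default_factory=dict)


class CompletionTracker:
    """Counts per-page events and reports when a document is complete."""

    def __init__(self, required_events):
        # Event types each page must produce before it counts as done
        # (an assumption: every page yields all of them).
        self.required_events = frozenset(required_events)
        self.documents = {}

    def on_document_received(self, document_id, page_count):
        progress = self.documents.setdefault(document_id, DocumentProgress())
        progress.expected_pages = page_count
        return self._check(document_id)

    def on_page_event(self, document_id, page_number, event_type):
        # Page events may arrive before the page-count event, so the
        # entry is created on first sight of the document.
        progress = self.documents.setdefault(document_id, DocumentProgress())
        # Using a set makes redelivered (duplicate) events harmless.
        progress.seen.setdefault(page_number, set()).add(event_type)
        return self._check(document_id)

    def _check(self, document_id):
        progress = self.documents[document_id]
        if progress.expected_pages is None:
            return None
        done = sum(
            1 for events in progress.seen.values()
            if self.required_events <= events
        )
        if done < progress.expected_pages:
            return None
        # Forget the document; a real service would also need to handle
        # late duplicates arriving after this point.
        del self.documents[document_id]
        return {"type": "DocumentProcessed", "document_id": document_id}
```

Whichever service hosts this logic (C in solution 1, D in solution 2) would produce the returned "Document Processed" event whenever a call returns something other than `None`. Note that the list of required event types is exactly the dependency that has to change when a new page-processing service is added.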

Solution 1 I dislike because it means Microservice C now needs to know about events it shouldn't care about.

Solution 2 I dislike because it adds the overhead of an extra service, when this feels like a common scenario that must have a better solution.

Both solution 1 and solution 2 also mean that if other services are introduced that do additional processing on pages, another service has to be modified, which creates an undesired dependency.

Solution 3 is the simplest, but because it can no longer process the pages in parallel, it is very slow for documents with many pages.
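For comparison, the serial chaining in solution 3 could look roughly like the sketch below. The "ProcessPage" event, its fields, and the function names are hypothetical, and a plain loop stands in for the broker delivering each event to the next handler.

```python
def handle_process_page(event, process_page):
    """Process one page and return the next event to produce."""
    process_page(event["document_id"], event["page_number"])
    if event["page_number"] < event["page_count"]:
        # More pages remain: ask for the next page to be processed.
        return {
            "type": "ProcessPage",
            "document_id": event["document_id"],
            "page_number": event["page_number"] + 1,
            "page_count": event["page_count"],
        }
    # No pages left: the document is finished.
    return {"type": "DocumentProcessed", "document_id": event["document_id"]}


def run_document(document_id, page_count, process_page):
    """Local stand-in for the broker: feed each event back to the handler."""
    event = {
        "type": "ProcessPage",
        "document_id": document_id,
        "page_number": 1,
        "page_count": page_count,
    }
    while event["type"] == "ProcessPage":
        event = handle_process_page(event, process_page)
    return event
```

Because each page is only started after the previous one finishes, total latency grows linearly with the page count, which is the slowness described above.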
