Kafka Topology Design: How to do sliding window join and emit events on timeout? [Hard]

779 Views Asked by At

I have a set of requirements as below:

  1. Message 'T' arrives, must wait for 5 seconds for corresponding message in 'A' to arrive (with same key). If it comes within 5 seconds, then send joined values and send downstream. If it does not come within 5 seconds, send only 'T' message downstream.
  2. Message 'A' arrives, must wait for 5 seconds for corresponding message in 'T' to arrive (with same key). If it comes within 5 seconds, then send joined values and send downstream. If it does not come within 5 seconds, send only 'A' message downstream.

enter image description here

My current thinking was to do a KStream-KStream Sliding Window OUTER join. However, that does not wait for 5 seconds before sending the (T, null) or (null, T) message downstream (that is done instantly).

I need to wait for a timeout to happen, and if a join did not occur, then send the unjoined message through.

I've attached a diagram to help make sense of the cases. I am trying to use DSL as much as possible.

Any help appreciated.

1

There are 1 best solutions below

4
On

Okay I found a fairly hacky solution that i'm still evaluating, but will work for this scenario.

I can simply groupByKey at the end and then suppress until window expires, with an unbounded buffer.