I have a ksqlDB stream that is doing a CREATE STREAM AS SELECT lhs.*, rhs.* FROM lhs LEFT JOIN rhs WITHIN 15 SECONDS GRACE PERIOD 30 SECONDS ON rhs.join_key = lhs.join_key.

Generally in my application, there should always be a RHS message for each LHS message, and the grace period used should be sufficient for expected out-of-order messages. However, if for some reason the RHS message fails to arrive or is much later, I'd like to be able to track that so that a human can be notified to investigate and possibly redrive the messages.

Is there a way to do that apart from having a consumer application consume from the two source streams and determine missing messages by its own logic?

Some approaches I've considered:

  • change the query to a FULL OUTER JOIN, and use a subsequent query to filter for the results stream and a different subsequent query to filter for messages that failed to join. I think in that case, I'd still need a consumer but at least the logic to find the missing messages would be done.
  • same as above but have subsequent queries do an INSERT INTO to try to automatically redrive the messages?
0

There are 0 best solutions below