I have a ksqlDB stream that is doing a CREATE STREAM AS SELECT lhs.*, rhs.* FROM lhs LEFT JOIN rhs WITHIN 15 SECONDS GRACE PERIOD 30 SECONDS ON rhs.join_key = lhs.join_key
.
Generally in my application, there should always be a RHS message for each LHS message, and the grace period used should be sufficient for expected out-of-order messages. However, if for some reason the RHS message fails to arrive or is much later, I'd like to be able to track that so that a human can be notified to investigate and possibly redrive the messages.
Is there a way to do that apart from having a consumer application consume from the two source streams and determine missing messages by its own logic?
Some approaches I've considered:
- change the query to a
FULL OUTER JOIN
, and use a subsequent query to filter for the results stream and a different subsequent query to filter for messages that failed to join. I think in that case, I'd still need a consumer but at least the logic to find the missing messages would be done. - same as above but have subsequent queries do an
INSERT INTO
to try to automatically redrive the messages?