How to keep each customer's first travel in a table at all times


I have Customer and Travel topics. I want an analysis that shows each customer's first travel (its travelid) in a table.

For example:

create table customer_first_travel as
select t.custid as custid, earliest_by_offset(travelid) as travelid
from stream_travel t
join table_customer c on t.custid = c.custid
group by t.custid;

The problem is: once records in the topic pass the retention period, will the earliest travelid change? Since travelid is not the primary key here, how can I tell that a travelid has been deleted?



There is 1 best solution below


"If the topic goes past the retention period, will the earliest travelid change?"

Yes. But this is not a problem unless your consumer is more than 7 days (Kafka's default retention) behind the data being produced. (This also assumes that data is actually deleted on the 7th day; Kafka can retain data longer, because retention only removes closed log segments.)

If you have a low-lag consumer (e.g. a query building a table), the result does not depend on the source topic's retention: the table's data is retained on a completely different, compacted topic.
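
To make the difference concrete, here is a minimal Python sketch of the idea. It is not ksqlDB code. It is a plain in-memory model that assumes a hypothetical list of (custid, travelid) events and treats retention as simply dropping the oldest records from the source. The names `earliest_by_key`, `source_topic`, `live_table` and `late_table` are made up for illustration.

```python
from typing import Dict, List, Optional, Tuple

Event = Tuple[str, str]  # (custid, travelid), in offset order


def earliest_by_key(events: List[Event],
                    table: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Keep the first travelid seen per custid (a rough model of
    EARLIEST_BY_OFFSET combined with GROUP BY custid)."""
    table = {} if table is None else table
    for custid, travelid in events:
        # setdefault only writes when the key is absent, so later
        # events never overwrite the first travelid for a customer.
        table.setdefault(custid, travelid)
    return table


# Hypothetical source topic contents, oldest offset first.
source_topic: List[Event] = [
    ("c1", "t100"), ("c2", "t200"), ("c1", "t101"), ("c2", "t201"),
]

# Low-lag consumer: sees every event and stores its result separately.
live_table = earliest_by_key(source_topic)

# Retention later removes the two oldest records from the source only.
retained = source_topic[2:]

# A consumer that starts (or rebuilds) after retention sees only what is left.
late_table = earliest_by_key(retained)

print(live_table)  # {'c1': 't100', 'c2': 't200'}
print(late_table)  # {'c1': 't101', 'c2': 't201'}
```

In this model the table built while the data was still available keeps the original first travelid, while a table built after retention has run reports a different "earliest" value. That is why consumer lag relative to the retention period matters.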

"How can I tell that the travelid has been deleted?"

If the stream being consumed has a null-value event (a tombstone) for a matching key in the table, that row will automatically be deleted from the table. There will be no "notification" for this, other than the event itself.