Efficient way to delete records from a ksqlDB table


I have the following:

  1. a stream "my_stream" keyed by id

  2. a source table "my_table" with id as its primary key

  3. a stream-table join "my_joined" (shown below)

  4. a stream "my_table_tombstone" used to send tombstone messages that delete outdated data from "my_table"

    CREATE OR REPLACE STREAM `my_joined` AS
        SELECT
            `my_stream`.`id`,
            `my_table`.`price`
        FROM `my_stream`
        JOIN `my_table` ON `my_stream`.`id` = `my_table`.`id`;
    

I have a script that runs every 10 minutes to delete outdated data from "my_table" and save disk space:

  1. select the outdated rows in "my_table" using ROWTIME
  2. send a tombstone message, (id(key), null), for each selected row via the REST API to delete it from "my_table"
Everything works well except performance: it took 1927 seconds to delete 103000 records from the table (about 53 records/s). From my observation, the bottleneck is in ksqlDB. My question is: does anyone have a better approach to deleting data, or to building a TTL-based table?

Here are some useful references I found:

https://groups.google.com/g/ksql-users/c/FaPOc_lyGtM

https://developer.confluent.io/tutorials/schedule-ktable-ttl/kstreams.html
