Refresh Data in Druid

I am using the index_parallel native batch method to ingest data into Druid from S3. I did the initial ingestion from the Tasks tab in the Druid UI. Now I want to schedule another task that does a delta ingestion daily.

I have gone through a lot of documentation, but I didn't find anything about scheduling a task in Druid.

Can someone tell me what the options are for scheduling a native batch ingestion task?

There is 1 solution below

Typically you would use an external scheduler such as Apache Airflow to submit the ingestion task on a regular schedule. See this blog post for an example:

https://www.linkedin.com/pulse/open-source-data-warehousing-druid-apache-airflow-superset-sp%C3%A4ti/

Also take a look at this, to be sure you know how to configure the job to add data rather than replace it:

https://druid.apache.org/docs/latest/ingestion/data-management.html#adding-new-data
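
Whatever scheduler you pick, the scheduled job ends up doing the same thing: it POSTs a task spec to Druid's task endpoint (`/druid/indexer/v1/task`). Here is a minimal sketch of that in plain Python, assuming the Router is reachable at `http://localhost:8888`, that the S3 objects sit under a per-day prefix like `s3://<bucket>/YYYY/MM/DD/`, that the data is JSON, and that the timestamp column is called `timestamp`. The datasource name, bucket layout, and both function names are placeholders for illustration, so swap in the `dataSchema` from your initial ingestion.

```python
import json
import urllib.request
from datetime import date, timedelta


def build_delta_spec(datasource: str, bucket: str, day: date) -> dict:
    """Build an index_parallel spec that appends one day of S3 data.

    The per-day prefix layout and the dataSchema below are assumptions;
    reuse the schema from your initial ingestion spec instead.
    """
    prefix = f"s3://{bucket}/{day:%Y/%m/%d}/"  # hypothetical bucket layout
    interval = f"{day.isoformat()}/{(day + timedelta(days=1)).isoformat()}"
    return {
        "type": "index_parallel",
        "spec": {
            "dataSchema": {
                "dataSource": datasource,
                # Assumed timestamp column name.
                "timestampSpec": {"column": "timestamp", "format": "auto"},
                # Placeholder: list your real dimensions here.
                "dimensionsSpec": {"dimensions": []},
                "granularitySpec": {
                    "segmentGranularity": "day",
                    "queryGranularity": "none",
                    "intervals": [interval],
                },
            },
            "ioConfig": {
                "type": "index_parallel",
                "inputSource": {"type": "s3", "prefixes": [prefix]},
                "inputFormat": {"type": "json"},  # assumed input format
                # Add new segments instead of overwriting the interval.
                "appendToExisting": True,
            },
            "tuningConfig": {
                "type": "index_parallel",
                # Appending requires dynamic partitioning.
                "partitionsSpec": {"type": "dynamic"},
            },
        },
    }


def submit_task(spec: dict, router_url: str = "http://localhost:8888") -> str:
    """POST the spec to the task endpoint and return the task ID."""
    request = urllib.request.Request(
        f"{router_url}/druid/indexer/v1/task",
        data=json.dumps(spec).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["task"]
```

A daily cron entry or an Airflow task could then call `submit_task(build_delta_spec("my_datasource", "my-bucket", date.today() - timedelta(days=1)))` to load yesterday's files. If a run might be retried for the same day, appending would duplicate rows, so in that case you may prefer to leave `appendToExisting` off and overwrite that day's interval instead.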