How to schedule a time for maintenance tasks for Delta Live Tables?

Here is the relevant passage from the Databricks documentation:

Delta Live Tables performs maintenance tasks within 24 hours of a table being updated. Maintenance can improve query performance and reduce cost by removing old versions of tables. By default, the system performs a full OPTIMIZE operation followed by VACUUM. You can disable OPTIMIZE for a table by setting pipelines.autoOptimize.managed = false in the table properties for the table. Maintenance tasks are performed only if a pipeline update has run in the 24 hours before the maintenance tasks are scheduled.

Is there any way I can schedule these commands to run at a particular time?

Currently, the table history shows these commands running at varying times rather than during the pipeline run. Is there any way to schedule them?

[Screenshot of the table history]

In the screenshot above, the DLT maintenance tasks run every 24 hours. Is there a way to schedule these tasks, or to run them multiple times a day?

Answer by Naveen Sharma:

You can create a Databricks Job and schedule it through the Databricks Jobs interface.

The job can run a notebook that issues the `OPTIMIZE` and `VACUUM` commands against your Delta tables, so the maintenance runs at the time you choose.
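As a rough illustration, the notebook that the job runs could look something like the sketch below. The helper names (`maintenance_statements`, `run_maintenance`) and the table name in the usage note are hypothetical. The 168-hour retention is an assumption that matches Delta's default 7-day retention, and `spark` is assumed to be the session object that a Databricks notebook provides.

```python
from typing import Iterable, List


def maintenance_statements(table_name: str, retain_hours: int = 168) -> List[str]:
    """Build the OPTIMIZE and VACUUM statements for one Delta table.

    retain_hours is an assumed parameter; 168 hours equals 7 days.
    """
    if retain_hours < 0:
        raise ValueError("retain_hours must be non-negative")
    return [
        f"OPTIMIZE {table_name}",
        f"VACUUM {table_name} RETAIN {retain_hours} HOURS",
    ]


def run_maintenance(spark, table_names: Iterable[str], retain_hours: int = 168) -> List[str]:
    """Run OPTIMIZE then VACUUM on each table and return the statements executed.

    `spark` only needs a .sql(statement) method, as a SparkSession has.
    """
    executed = []
    for table_name in table_names:
        for statement in maintenance_statements(table_name, retain_hours):
            spark.sql(statement)
            executed.append(statement)
    return executed
```

In the notebook you would call something like `run_maintenance(spark, ["my_catalog.my_schema.my_table"])`, replacing the placeholder with your own table names.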

To set up a schedule for the job, follow these steps:

1. Navigate to the Workflows section in the sidebar.
2. In the Jobs tab, click the job name in the Name column.
3. Within the Job details panel, click "Add trigger" and choose "Scheduled" as the Trigger type.
4. Define the period, starting time, and select the time zone.
5. Optionally, enable the "Show Cron Syntax" checkbox to view and modify the schedule using Quartz Cron Syntax.
6. Save your configuration by clicking the "Save" button.
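If you would rather set the schedule programmatically, or want the job to run several times a day, here is a minimal sketch of building a Quartz cron expression and a matching request body. It assumes the `schedule` settings shape used by the Jobs API 2.1 `jobs/update` endpoint. The helper names and the job ID are hypothetical, and the sketch only builds the payload without sending it.

```python
from typing import Dict, Iterable


def daily_quartz_cron(hours: Iterable[int], minute: int = 0) -> str:
    """Return a Quartz cron expression that fires every day at the given hours.

    Quartz field order: seconds minutes hours day-of-month month day-of-week.
    """
    hour_list = sorted(set(hours))
    if not hour_list or any(h < 0 or h > 23 for h in hour_list):
        raise ValueError("hours must be non-empty and within 0-23")
    if not 0 <= minute <= 59:
        raise ValueError("minute must be within 0-59")
    hour_field = ",".join(str(h) for h in hour_list)
    return f"0 {minute} {hour_field} * * ?"


def schedule_update_payload(job_id: int, cron: str, timezone_id: str = "UTC") -> Dict:
    """Build a request body that sets a job's schedule (assumed jobs/update shape)."""
    return {
        "job_id": job_id,
        "new_settings": {
            "schedule": {
                "quartz_cron_expression": cron,
                "timezone_id": timezone_id,
                "pause_status": "UNPAUSED",
            }
        },
    }


# Example: run at 02:30 and 14:30 every day; 123 is a placeholder job ID.
payload = schedule_update_payload(123, daily_quartz_cron([2, 14], minute=30))
```

The same cron string can be pasted into the UI after enabling "Show Cron Syntax" in step 5.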

For more information, see "Run a continuous job" in the Databricks documentation.