Scheduling Airflow timezone-aware dags based on my local timezone


I have a timezone-aware Airflow DAG, and I want to schedule it at 9 am every day in my local timezone. But what I see is that the ds, next_ds and data_interval_end template variables are all in UTC, which changes the run date for my DAGs.

At 9 am Sydney time today (UTC+11), the UTC clock is still on the previous day (10 pm yesterday). That means I get the wrong date in both the ds and next_ds variables. How do I work around that? I want to be able to schedule DAGs in my local timezone and get the correct date for today, without having to calculate UTC times myself.
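To make the offset mismatch concrete, here is a minimal sketch (using only the standard library's zoneinfo module, with Australia/Sydney assumed as the local zone) of how a 9 am Sydney wall-clock time maps to UTC:

```python
from datetime import datetime
from zoneinfo import ZoneInfo

# A run that fires at 9 am local time in Sydney (AEDT, UTC+11)...
local_run = datetime(2023, 11, 29, 9, 0, tzinfo=ZoneInfo("Australia/Sydney"))

# ...corresponds to 10 pm on the previous calendar day in UTC,
# which is why UTC-based date macros land on the wrong day.
utc_run = local_run.astimezone(ZoneInfo("UTC"))
print(utc_run.isoformat())  # 2023-11-28T22:00:00+00:00
```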

This can still be handled programmatically in Python code, because you can manipulate the dates there, but in SQL templates, if I use {{ ds }} or {{ next_ds }} as the partition for my data, they translate to the wrong dates.
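For the programmatic case, one approach is to convert data_interval_end into the local timezone before taking the date, rather than relying on the UTC-based ds. A sketch, assuming the standard-library zoneinfo and a hypothetical helper name local_ds:

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

def local_ds(data_interval_end: datetime, tz_name: str = "Australia/Sydney") -> str:
    """Return the YYYY-MM-DD date of the interval boundary in the local zone."""
    return data_interval_end.astimezone(ZoneInfo(tz_name)).strftime("%Y-%m-%d")

# The UTC boundary value for the 9 am Sydney run (from the question's logs):
boundary = datetime(2023, 11, 28, 22, 0, tzinfo=timezone.utc)
print(local_ds(boundary))  # 2023-11-29
```

The same conversion can be done inside a PythonOperator callable; the SQL-template case is what needs the Jinja-side fix.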

So if I am running the DAG on the 29th at 9 am Sydney time (UTC+11), this is what I get in the Airflow template variables:

[2023-11-29, 13:08:12 AEDT] {subprocess.py:93} INFO - Next_ds is 2023-11-28.
[2023-11-29, 13:08:12 AEDT] {subprocess.py:93} INFO - ds is 2023-11-27.
[2023-11-29, 13:08:12 AEDT] {subprocess.py:93} INFO - data_interval_end is 2023-11-28T22:00:00+00:00.

The documentation says the date intervals are always calculated in UTC, even if you have set Airflow to your local timezone.

How do you handle that, especially for SQL templates?

1 Answer

You can access the DAG's timezone in Airflow macros. See below.

DECLARE @ds DATE = '{{ dag_run.data_interval_end.astimezone(dag.timezone) | ds }}';
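You can sanity-check what that expression renders to outside of Airflow. The sketch below replicates the astimezone call and the ds filter (which formats a datetime as YYYY-MM-DD) in plain Python, using stand-in values taken from the question; the variable names are illustrative, not Airflow's:

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# Stand-ins for the Airflow template context (values from the question):
data_interval_end = datetime(2023, 11, 28, 22, 0, tzinfo=timezone.utc)
dag_timezone = ZoneInfo("Australia/Sydney")  # assumed DAG timezone

# Equivalent of: {{ dag_run.data_interval_end.astimezone(dag.timezone) | ds }}
rendered = data_interval_end.astimezone(dag_timezone).strftime("%Y-%m-%d")
print(f"DECLARE @ds DATE = '{rendered}';")  # DECLARE @ds DATE = '2023-11-29';
```

With the conversion applied, the rendered partition date matches the Sydney calendar day of the run rather than the UTC day.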