Why is the execution date in the past when running a DAG with Airflow?

There is something I don't understand about the execution date. I have the following DAG:

from airflow import DAG
from airflow.operators.bash_operator import BashOperator
from datetime import datetime

default_args = {
    'owner': 'me',
    'depends_on_past': True,
    'email': '[email protected]',
    'email_on_failure': False,
    'email_on_retry': False,
    'retries': 0,
}

dag = DAG(
    'dag_test',
    default_args=default_args,
    description="DAG test",
    schedule_interval='0 15 * * *',
    concurrency=1,
    catchup=False,
    start_date=datetime(2024, 1, 1)
)

task = BashOperator(
    task_id='task',
    bash_command='echo 1',
    dag=dag,
)

When I activate the DAG, it runs every day at 3 PM, but the execution date is the day before. For example, when the DAG is triggered on 16 February, the execution date is 15 February.

Thanks for your help.

I expected the execution date to be the same as the date on which the run is triggered.

There are 2 best solutions below

vdolez On BEST ANSWER

Have a look at the data interval concept for DAG runs. From the Airflow documentation:

A DAG run is usually scheduled after its associated data interval has ended, to ensure the run is able to collect all the data within the time period. In other words, a run covering the data period of 2020-01-01 generally does not start to run until 2020-01-01 has ended, i.e. after 2020-01-02 00:00:00.

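A DAG run is executed at the end of the period it covers, which helps keep runs idempotent: by then the interval is complete and its data no longer changes. With your schedule of 0 15 * * *, the run triggered on 16 February at 15:00 covers the interval from 15 February 15:00 to 16 February 15:00. The execution date (called the logical date since Airflow 2.2) is the start of that interval, hence 15 February.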
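To make the relationship concrete, here is a minimal sketch in plain Python. It is not Airflow's scheduler code: it hard-codes a daily 15:00 schedule to mirror the DAG in the question, and the function name data_interval_for_trigger is made up for illustration.

```python
from datetime import datetime, timedelta
from typing import Tuple

# Assumption: a daily schedule firing at 15:00, mirroring '0 15 * * *'.
SCHEDULE_HOUR = 15
INTERVAL = timedelta(days=1)


def data_interval_for_trigger(trigger_time: datetime) -> Tuple[datetime, datetime]:
    """Return (interval_start, interval_end) for the run that starts at trigger_time.

    A run starts once its interval has ended, so interval_end is the most
    recent 15:00 tick at or before trigger_time, and interval_start is one
    day earlier. The execution date corresponds to interval_start.
    """
    tick = trigger_time.replace(
        hour=SCHEDULE_HOUR, minute=0, second=0, microsecond=0
    )
    if tick > trigger_time:
        tick -= INTERVAL
    return tick - INTERVAL, tick


start, end = data_interval_for_trigger(datetime(2024, 2, 16, 15, 0))
print("execution date (interval start):", start)   # 2024-02-15 15:00:00
print("trigger time (interval end):", end)         # 2024-02-16 15:00:00
```

If a task needs the date on which the run actually fires, Airflow 2.2 and later expose data_interval_end in the template context, next to data_interval_start and logical_date.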

A best practice when designing DAGs is to process data in time partitions, with each run handling only the data that belongs to its own interval.
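As a rough illustration of time-partitioning, the sketch below writes each run's output to a partition keyed by the start of its data interval and overwrites that partition instead of appending to it, so a rerun gives the same result. The in-memory dict, the event list, and the names partition_key and load_partition are all hypothetical stand-ins, not part of Airflow.

```python
from datetime import datetime
from typing import Dict, List, Tuple

Event = Tuple[datetime, str]  # (timestamp, payload)


def partition_key(interval_start: datetime) -> str:
    """Name the partition after the start of the data interval."""
    return f"dt={interval_start:%Y-%m-%d}"


def load_partition(
    events: List[Event],
    interval_start: datetime,
    interval_end: datetime,
    store: Dict[str, List[str]],
) -> None:
    """Keep only events in [interval_start, interval_end) and overwrite the partition."""
    rows = [payload for ts, payload in events if interval_start <= ts < interval_end]
    store[partition_key(interval_start)] = rows  # overwrite, never append


events = [
    (datetime(2024, 2, 15, 16, 0), "a"),
    (datetime(2024, 2, 16, 9, 30), "b"),
    (datetime(2024, 2, 16, 15, 0), "c"),  # belongs to the next interval
]
store: Dict[str, List[str]] = {}
start, end = datetime(2024, 2, 15, 15, 0), datetime(2024, 2, 16, 15, 0)

load_partition(events, start, end, store)
load_partition(events, start, end, store)  # rerunning does not duplicate rows
print(store)  # {'dt=2024-02-15': ['a', 'b']}
```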

joss On

Think of the trigger as an alarm set to go off at some time in the future. If you set the alarm yesterday for today, then when it goes off it tells you that the alarm you set yesterday has gone off.

That is what happens here: the execution date of the run is the date on which the trigger was set, so yesterday's date.