I've read Airflow's FAQ about "What's the deal with start_date?", but it still isn't clear to me why using a dynamic start_date is discouraged.
To my understanding, a DAG's execution_date is determined by the minimum start_date among all of the DAG's tasks, and subsequent DAG Runs are run at the latest execution_date + schedule_interval.
If I set my DAG's default_args start_date to be for, say, yesterday at 20:00:00, with a schedule_interval of 1 day, how would that break or confuse the scheduler, if at all? If I understand correctly, the scheduler would trigger the DAG with an execution_date of yesterday at 20:00:00, and the next DAG Run would be scheduled for today at 20:00:00.
Is there some concept that I'm missing?
The first run would be at start_date + schedule_interval. Airflow doesn't run the DAG at start_date; it always runs at start_date + schedule_interval.
As the documentation mentions, if you give a dynamic start_date, e.g. datetime.now(), together with some schedule_interval (say 1 hour), that run will never execute: now() moves along with time, so the current time can never reach datetime.now() + 1 hour.
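The reasoning above can be sketched in plain Python. This is only a simplified model of the "first run at start_date + schedule_interval" rule, not Airflow's actual scheduler code; the names first_run_is_due and simulate are hypothetical helpers made up for illustration, and the fixed date is arbitrary.

```python
from datetime import datetime, timedelta


def first_run_is_due(start_date, schedule_interval, now):
    """Simplified model: the first run fires once start_date + schedule_interval has passed."""
    return now >= start_date + schedule_interval


interval = timedelta(hours=1)
static_start = datetime(2024, 1, 1, 20, 0, 0)  # arbitrary fixed start_date


def simulate(get_start_date, checks=5):
    """Re-evaluate start_date at successive hourly check times, as a re-parsed DAG file would."""
    results = []
    clock = static_start
    for _ in range(checks):
        clock += interval
        results.append(first_run_is_due(get_start_date(clock), interval, clock))
    return results


# A fixed start_date: the deadline stays put, so the clock eventually reaches it.
static_results = simulate(lambda now: static_start)

# Models start_date=datetime.now(): the deadline moves forward with every check.
dynamic_results = simulate(lambda now: now)

print(static_results)
print(dynamic_results)
```

With the fixed start_date every check reports the run as due, while with the moving start_date none does, because the deadline is always one interval ahead of the current time. For the case in the question (yesterday at 20:00:00 written as a fixed value), the static branch is the one that applies.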