Airflow Task Possibly Running Twice in Kubernetes - How to Confirm?


I'm using Apache Airflow with the Kubernetes Executor, and I've noticed behavior that makes me suspect a task is running twice in Kubernetes. I'd like to understand whether that is the case and, if so, why it happens.

My main suspicion comes from the following pod listings in my two namespaces:

➜  ~ kubectl get pods -n scheduler
NAME                                                                READY   STATUS    RESTARTS   AGE
da-job-boards-pipeline-jobs-normalisation-task-job-title-492abcde   1/1     Running   0          29h
da-job-boards-pipeline-jobs-normalisation-task-locations-tkaabcde   1/1     Running   0          28h
scheduler-scheduler-58f557f548-abcde                                2/2     Running   0          6d14h
scheduler-statsd-7dd4494d4f-abcde                                   1/1     Running   0          13d
scheduler-triggerer-0                                               2/2     Running   0          8d
scheduler-webserver-5546b8dd66-abcde                                1/1     Running   0          3d6h
➜  ~ kubectl get pods -n airflow
NAME                                    READY   STATUS    RESTARTS   AGE
jobs-normalisation-job-title-2bpsbabc   1/1     Running   0          29h
jobs-normalisation-locations-nv670abc   1/1     Running   0          28h

Also, the logs are identical within each of these pairs of pods:

  • jobs-normalisation-job-title-2bpsbabc and da-job-boards-pipeline-jobs-normalisation-task-job-title-492abcde
  • jobs-normalisation-locations-nv670abc and da-job-boards-pipeline-jobs-normalisation-task-locations-tkaabcde

CONFIGURATIONS

Here's the relevant configuration from my airflow.cfg:

[kubernetes]
airflow_configmap = scheduler-airflow-config
airflow_local_settings_configmap = scheduler-airflow-config
multi_namespace_mode = True
namespace = scheduler
pod_template_file = /opt/airflow/pod_templates/pod_template_file.yaml
worker_container_repository = SECRET.dkr.ecr.eu-west-1.amazonaws.com/airflow
worker_container_tag = efeb_THIS_IS_A_TAG

[kubernetes_executor]
multi_namespace_mode = True
namespace = scheduler
pod_template_file = /opt/airflow/pod_templates/pod_template_file.yaml
worker_container_repository = SECRET.dkr.ecr.eu-west-1.amazonaws.com/airflow
worker_container_tag = efeb_THIS_IS_A_TAG

[logging]
colored_console_log = False
delete_worker_pods = False
encrypt_s3_logs = True
logging_level = INFO
remote_base_log_folder = s3://scheduler-SECRET-eu-west-1/airflow/logs
remote_log_conn_id = aws_conn
remote_logging = True

In my DAG, I'm using the KubernetesPodOperator, and these are the arguments that I suspect might be causing the duplication:

    'node_selector': {"abcd.com/tenant": "scheduler"},
    'tolerations': [k8s.V1Toleration(key="abcd.com/tenant", operator="Equal", value="scheduler")],
    'namespace': "airflow",
    'service_account_name': "airflow",

Has anyone encountered a similar issue, or can anyone say whether these configurations could lead to duplicated task runs in Kubernetes? How can I confirm whether the task is actually running twice?
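
One way to check could be to compare the labels on the pods in both namespaces. With the Kubernetes Executor, each task instance gets its own worker pod, and a KubernetesPodOperator task then launches a second pod for the actual workload, so one pod of each kind per task attempt would be expected rather than a duplicate. The sketch below groups pods by task attempt and reports only attempts that have more than one pod of the same kind. It assumes the label names `kubernetes_executor`, `kubernetes_pod_operator`, `dag_id`, `task_id`, `run_id` and `try_number`, which should be verified with `kubectl get pods --show-labels` for the Airflow version in use. The helper names are hypothetical.

```python
import json
from collections import defaultdict

# ASSUMPTION: these are the labels Airflow puts on the pods it creates.
# Verify them with `kubectl get pods --show-labels` before relying on this.
EXECUTOR_LABEL = "kubernetes_executor"
POD_OPERATOR_LABEL = "kubernetes_pod_operator"
KEY_LABELS = ("dag_id", "task_id", "run_id", "try_number")


def load_pods(kubectl_json_text):
    """Parse the output of `kubectl get pods -n <namespace> -o json`."""
    return json.loads(kubectl_json_text)["items"]


def classify(pod):
    """Return which Airflow component most likely created this pod."""
    labels = pod.get("metadata", {}).get("labels") or {}
    if labels.get(EXECUTOR_LABEL) == "True":
        return "executor_worker"
    if labels.get(POD_OPERATOR_LABEL) == "True":
        return "pod_operator"
    return "other"


def group_pods(pods):
    """Group pod names by task attempt, then by the kind of pod."""
    groups = defaultdict(lambda: defaultdict(list))
    for pod in pods:
        kind = classify(pod)
        if kind == "other":
            continue  # scheduler, webserver, triggerer, statsd, ...
        labels = pod["metadata"]["labels"]
        key = tuple(labels.get(name) for name in KEY_LABELS)
        groups[key][kind].append(pod["metadata"]["name"])
    return groups


def find_duplicates(pods):
    """Return task attempts that have more than one pod of the same kind."""
    duplicates = {}
    for key, kinds in group_pods(pods).items():
        for kind, names in kinds.items():
            if len(names) > 1:
                duplicates[(key, kind)] = sorted(names)
    return duplicates
```

To use it, the `items` from both namespaces (`scheduler` and `airflow`) would be concatenated and passed to `find_duplicates`. An empty result would suggest that each task attempt has at most one worker pod and one operator pod, which would also explain identical logs if the worker pod streams the output of the pod it launched.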
