Running Kubernetes Job using Kubernetes Pod Operator in Airflow


I have code that we want to run in n Pods simultaneously. When running it manually, I used to launch a Kubernetes Job by setting parallelism and completions in the YAML file.

apiVersion: batch/v1
kind: Job
metadata:
  annotations:
    eks.amazonaws.com/role-arn: "arn:aws:iam::{number}:role/{access_name}"
  name: test-job
  namespace: analytics
spec:
  completions: 1000
  parallelism: 1000
  template:

Now I want to automate this process with Airflow. However, Airflow only has KubernetesPodOperator and no JobOperator. Is there any way I can achieve the same thing using KubernetesPodOperator?

Limitations:

  1. We can't use any other library because of very strict restrictions, so we need to get this done using the operators available in Airflow by default.

I have tried creating N KubernetesPodOperator instances, which results in N tasks. However, the parallelism is dynamic, and when the parallelism we want is very large (like 100K), creating that many tasks in Airflow is not feasible. So I am looking for a way to achieve this using a single task.
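
To make the dynamic part concrete: only two integers in the Job spec change from run to run. Below is a minimal sketch, not a tested solution, that builds the manifest from the question as a plain Python dict with the parallelism passed in at runtime. The function name `build_job_manifest`, the container name `worker`, and the image `example/image:latest` are hypothetical placeholders, since the pod template is cut off in the YAML above. `restartPolicy: Never` is set because a Job's pod template must use `Never` or `OnFailure`.

```python
import json


def build_job_manifest(parallelism, completions=None,
                       name="test-job", namespace="analytics"):
    """Build a batch/v1 Job manifest as a plain dict.

    `parallelism` is the value that changes per run. `completions`
    defaults to the same number, mirroring the YAML in the question.
    """
    if parallelism < 1:
        raise ValueError("parallelism must be at least 1")
    if completions is None:
        completions = parallelism
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "completions": completions,
            "parallelism": parallelism,
            "template": {
                "spec": {
                    # Hypothetical container: the real pod template is
                    # not shown in the question.
                    "containers": [
                        {"name": "worker", "image": "example/image:latest"}
                    ],
                    # Jobs only accept Never or OnFailure here.
                    "restartPolicy": "Never",
                }
            },
        },
    }


if __name__ == "__main__":
    # One manifest describes all pods, however large the number is.
    print(json.dumps(build_job_manifest(100_000), indent=2))
```

A single task that submits one such manifest would avoid creating one Airflow task per pod. The Kubernetes Python client exposes `BatchV1Api.create_namespaced_job` for this, but whether that client may be used under the restriction above is an assumption, and the submission step is not shown here.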
