Airflow 2.1.4 Composer V2 GKE kubernetes in custom VPC Subnet returning 404


I have two Composer V2 environments running in the same project. The only difference between them is that one uses the default subnet and the default (autogenerated) values for cluster-ipv4-cidr and services-ipv4-cidr. For the other, I created another subnet in the same (default) VPC and the same region, but with a different IP range. I reference this subnet when creating the Composer environment, and additionally set cluster-ipv4-cidr=xx.44.0.0/17 and services-ipv4-cidr=xx.45.4.0/22.
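One thing that can be ruled out quickly with custom ranges is an accidental overlap between the subnet's primary range and the pod and service ranges. The following is only a sketch using Python's standard `ipaddress` module. The helper name `find_overlaps` is hypothetical, and the addresses are placeholders: the question elides the first octet ("xx") and does not give the subnet's primary range, so `10` and `10.10.0.0/20` are assumptions for illustration only.

```python
from ipaddress import ip_network
from itertools import combinations


def find_overlaps(ranges):
    """Return sorted pairs of range names whose CIDR blocks overlap."""
    networks = {name: ip_network(cidr) for name, cidr in ranges.items()}
    return sorted(
        (a, b)
        for a, b in combinations(sorted(networks), 2)
        if networks[a].overlaps(networks[b])
    )


# Placeholder values (assumptions): the real first octet is elided as "xx"
# in the question, and the subnet's primary range is not stated at all.
example_ranges = {
    "subnet-primary": "10.10.0.0/20",
    "cluster-ipv4-cidr": "10.44.0.0/17",
    "services-ipv4-cidr": "10.45.4.0/22",
}

# An empty list means no two of the supplied ranges overlap.
print(find_overlaps(example_ranges))
```

Substituting the real ranges from both environments shows whether the custom ranges collide with each other or with the subnet. An empty result does not explain the 404 by itself; it only removes one candidate cause.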

Everything else is the same between these two Composer environments. In the environment with the custom subnet I'm not able to run any KubernetesPodOperator jobs; they fail with this error:

ERROR - Exception when attempting to create Namespaced Pod:
Traceback (most recent call last):
  File "/opt/python3.8/lib/python3.8/site-packages/airflow/providers/cncf/kubernetes/utils/pod_manager.py", line 111, in run_pod_async
    resp = self._client.create_namespaced_pod(
  File "/opt/python3.8/lib/python3.8/site-packages/kubernetes/client/api/core_v1_api.py", line 6174, in create_namespaced_pod
    (data) = self.create_namespaced_pod_with_http_info(namespace, body, **kwargs)  # noqa: E501
  File "/opt/python3.8/lib/python3.8/site-packages/kubernetes/client/api/core_v1_api.py", line 6251, in create_namespaced_pod_with_http_info
    return self.api_client.call_api(
  File "/opt/python3.8/lib/python3.8/site-packages/kubernetes/client/api_client.py", line 340, in call_api
    return self.__call_api(resource_path, method,
  File "/opt/python3.8/lib/python3.8/site-packages/kubernetes/client/api_client.py", line 172, in __call_api
    response_data = self.request(
  File "/opt/python3.8/lib/python3.8/site-packages/kubernetes/client/api_client.py", line 382, in request
    return self.rest_client.POST(url,
  File "/opt/python3.8/lib/python3.8/site-packages/kubernetes/client/rest.py", line 272, in POST
    return self.request("POST", url,
  File "/opt/python3.8/lib/python3.8/site-packages/kubernetes/client/rest.py", line 231, in request
    raise ApiException(http_resp=r)
kubernetes.client.rest.ApiException: (404) 

and the pod does not appear when I check the workloads in GKE. Both GKE environments use the same Composer service account, Kubernetes service account, and namespaces, but from my understanding that is not an issue. Jobs that do not use the KubernetesPodOperator work fine. I had a theory that the non-default subnet might need additional permissions, but I haven't been able to confirm or rule this out yet.

From the log I can see that the KubernetesPodOperator can't locate the worker, even though I can find it in the UI, and non-KubernetesPodOperator jobs locate it successfully.

I would appreciate some guidance on what to do or where to look.
