Serverless Spark job throwing an error while using Shared VPC to connect to on-prem storage

I am trying to run a simple serverless Spark (Dataproc batch) job that reads an object from on-prem ECS over a Shared VPC. I have opened an egress firewall rule in the Shared VPC to reach the on-prem storage, but I don't see that firewall rule being hit. There are very few resources available at the moment since this product only recently went GA.
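
For reference, I submitted the batch roughly like this; the script path, project IDs, region, subnet, and bucket names below are placeholders, not my actual values:

gcloud dataproc batches submit pyspark gs://my-bucket/read_ecs_object.py \
  --project=service-project-id \
  --region=us-central1 \
  --subnet=projects/host-project-id/regions/us-central1/subnetworks/shared-subnet \
  --deps-bucket=my-staging-bucket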

Failed to initialize node gdpic-srvls-batch-fxxxx7-cxx6-4xxd-bxx6-6xxxxxx4-m: Timed out waiting for at least 1 worker(s) registered. This is often caused by firewall rules that prevent Spark workers from communicating with the master. Please review your network firewall rules and be sure they allow communication on all ports between all nodes. See https://cloud.google.com/dataproc-serverless/docs/concepts/network for instructions. See output in: gs://gcs-object-xxxx

I tried looking into the URL provided but couldn't find many details. If I have to set up a NAT gateway between the Shared VPC host project and my project, how do I do that? Has anyone solved this problem already?
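
In case it matters, this is the kind of Cloud NAT setup I was thinking of creating in the Shared VPC host project (router/NAT names, project ID, network, and region are placeholders):

# Cloud Router on the shared VPC network in the host project
gcloud compute routers create my-router \
  --project=host-project-id \
  --network=shared-vpc-network \
  --region=us-central1

# Cloud NAT gateway on that router, covering all subnet ranges
gcloud compute routers nats create my-nat \
  --project=host-project-id \
  --router=my-router \
  --region=us-central1 \
  --auto-allocate-nat-external-ips \
  --nat-all-subnet-ip-ranges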

1 Answer

Usually it means that there is no connectivity between internal IP addresses in your VPC. Adding a firewall rule as shown in the linked documentation should help:

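# Allow ingress on all ports between the internal IP ranges of the subnetwork,
# so the Spark driver and workers can reach each other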
gcloud compute firewall-rules create allow-internal-ingress \
  --network="network-name" \
  --source-ranges="subnetwork internal-IP ranges" \
  --direction="ingress" \
  --action="allow" \
  --rules="all"
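
Note that when the subnetwork belongs to a Shared VPC, the firewall rule has to be created in the host project that owns the network, for example by adding a --project flag (the host project ID below is a placeholder):

gcloud compute firewall-rules create allow-internal-ingress \
  --project="host-project-id" \
  --network="network-name" \
  --source-ranges="subnetwork internal-IP ranges" \
  --direction="ingress" \
  --action="allow" \
  --rules="all"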