How to access the kubectl forwarded port on Spark Kubernetes cluster from spark-submit?


I have a Spark cluster running on our in-house Kubernetes cluster (managed with Rancher). Company policy and the cluster configuration don't allow the services to be accessed through a URL of the form:

spark://SERVICE_NAME.namespace.svc.domain.....

We created the cluster using Big Data Europe's YAML file, changing only obvious settings such as resources.

Link to their github:

https://github.com/big-data-europe/docker-spark#kubernetes-deployment

The best thing about this approach is that we don't have to set up anything manually (deployments, services, etc.). We just apply the YAML file and everything is built for us in seconds.

YAML file: https://raw.githubusercontent.com/big-data-europe/docker-spark/master/k8s-spark-cluster.yaml

To access the Spark UI, I simply create an Ingress object, and we can then reach it from outside. Cool!

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: spark-master
  labels:
    app: spark-master
  annotations:
    nginx.ingress.kubernetes.io/ssl-redirect: "false"
    nginx.ingress.kubernetes.io/hsts: "false"
spec:
  rules:
    - host: RANDOM_NAME.NAMESPACE.svc.k8s.CLUSTER.DOMAIN.com
      http:
        paths:
          - path: /
            backend:
              serviceName: spark-master
              servicePort: 8080

What I am trying to do is access the Spark cluster created by BDE's YAML file from the CLI on my workstation. Because the service-based way (the proper way) isn't supported for us yet, I am trying the port-forwarding method.

Some insight:

  • The Spark master is on port 7077
  • The Spark UI is on port 8080 (accessible through the Ingress object)
  • The Spark REST API is on port 6066

kubectl -n <NAMESPACE> port-forward pods/spark-master-64bbbd7877-6vt6w 12345:7077

My kubectl is configured to connect to the cluster (thank you, Rancher, for the ready-to-use config file).

But when I try to submit a job to the cluster via:


spark-submit --class org.apache.spark.examples.SparkPi --master spark://localhost:12345 --deploy-mode cluster \
--conf spark.kubernetes.namespace=NAMESPACE \
--conf spark.kubernetes.authenticate.submission.oauthToken=MY_TOKEN \
--conf spark.kubernetes.file.upload.path=/temp C:\opt\spark\spark-3.0.0-bin-hadoop2.7\examples\jars\spark-examples_2.12-3.0.0.jar 1000

I get this error in the port-forward output:

Forwarding from 127.0.0.1:12345 -> 7077
Forwarding from [::1]:12345 -> 7077
Handling connection for 12345
E1014 13:17:45.039840   13148 portforward.go:400] an error occurred forwarding 12345 -> 7077: error forwarding port 7077 to pod f83c6b40d5af66589976bbaf69537febf79ee317288a42eee31cb307b03a954d, uid : exit status 1: 2020/10/14 11:17:45 socat[5658] E connect(5, AF=2 127.0.0.1:7077, 16): Connection refused

So in short, spark-submit run from my CLI doesn't connect to the deployed Spark cluster.

I can run spark-submit through kubectl, as described in the BDE documentation, but our requirement is to connect from the CLI for various reasons.

Any help in this regard would be highly appreciated. My token and other settings are correct: in k8s mode, I am able to reach the cluster (by URL) easily.
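The failure above can also be observed from the workstation side. The following is a minimal sketch, not part of the original setup: `forward_stays_open` is a hypothetical helper, and it assumes that kubectl closes the local connection shortly after the pod-side connect is refused, while a healthy Spark master keeps the connection open and silent until the client sends data.

```python
import socket


def forward_stays_open(host: str, port: int, wait: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port is still open after `wait` seconds.

    Hypothetical helper: a plain connect is not enough for a port-forward,
    because kubectl accepts the local connection before it tries the pod.
    """
    try:
        with socket.create_connection((host, port), timeout=wait) as conn:
            conn.settimeout(wait)
            try:
                data = conn.recv(1)
            except socket.timeout:
                # Peer stayed silent and did not close: the connection is alive.
                return True
            # An empty read means the peer closed the connection.
            return data != b""
    except OSError:
        # Refused, reset, or unreachable.
        return False
```

With the port-forward from this question running, `forward_stays_open("localhost", 12345)` would be the call to try before spark-submit; a False result points at the forward or the bind address rather than at the spark-submit options.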

EDIT:

I assume that the spark-master process creates a socket that explicitly does NOT bind to the address 0.0.0.0, but only to its primary address. Since port-forwarding uses a loopback address within the pod, connections fail. If so, I need to reconfigure the spark-master process to bind explicitly to 0.0.0.0. Does someone know a way to do that, if that is the issue?
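The assumption in this edit can be illustrated locally without Spark. The sketch below is only a demonstration of TCP bind behaviour; the helper names (`listen_on`, `can_connect`, `reachable_via_loopback`) are made up for illustration. A listener bound to 0.0.0.0 accepts connections arriving on 127.0.0.1, whereas a listener bound to one specific address refuses connections arriving on any other address, which matches the "Connection refused" on 127.0.0.1:7077 in the log.

```python
import socket
from contextlib import closing


def listen_on(bind_addr: str) -> socket.socket:
    """Open a TCP listener on an OS-chosen free port at bind_addr."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind((bind_addr, 0))  # port 0 lets the OS pick a free port
    srv.listen(1)
    return srv


def can_connect(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def reachable_via_loopback(bind_addr: str) -> bool:
    """Bind a listener to bind_addr and try to reach it through 127.0.0.1."""
    with closing(listen_on(bind_addr)) as srv:
        port = srv.getsockname()[1]
        return can_connect("127.0.0.1", port)
```

`reachable_via_loopback("0.0.0.0")` returns True. On Linux, where the whole 127.0.0.0/8 range is loopback, `reachable_via_loopback("127.0.0.2")` should return False, which mimics a master bound only to its pod address while the port-forward connects to 127.0.0.1.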


1 answer below


Thank you for your question, and especially for your edit. It helped me figure out the problem and solve it.

I am using the Bitnami Helm chart to install Spark on my cluster. The problem was that the Spark daemon is started with the "--host" parameter prefilled with the output of "hostname -f", so it does not respond on localhost. I solved the problem for the Bitnami chart by setting the environment variable "SPARK_MASTER_HOST" to 0.0.0.0 for the master pod.

EDIT: This solution still has a problem: the return connection when starting jobs does not work, because the master assumes that the request originated from 127.0.0.1 :(. A VPN tunnel is probably needed to solve this.