HTTP code 302 when getting file from Spark UI on Dataproc cluster


I started up a Dataproc cluster and am having problems using the Web UI on port 4040. First I show the IP and port displayed by spark-shell. Then I show the 302 error code when I wget a URL on the Spark UI port.

wilsonbill522@cluster-db78-m:~$ spark-shell
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
ivysettings.xml file not found in HIVE_HOME or HIVE_CONF_DIR,/etc/hive/conf.dist/ivysettings.xml will be used
Spark context Web UI available at http://10.128.0.2:4040

Using wget against port 4040, I get a response with HTTP code 302 for every URL except the "jobs" URL. For example:

wget http://10.128.0.2:4040/proxy/application_1505052986245_0002/static/timeline-view.js

Unfortunately I can't post the output since stackoverflow decides that I'm posting links. But the result of the above command is an HTTP 302 response code.

The above wget is executed on the master node in a different SSH session. The 302 response redirects to the "jobs" URL (I can't spell out the actual URL here since it will trigger some limit stackoverflow places on links), which doesn't make any sense.


There is 1 best solution below


It looks like the Spark Web UI is formatting all URLs to be relative to the YARN resource manager proxy, but is using the wrong host/port for some parts of its display (specifically what is shown in the spark-shell output). I'm guessing that this is an artifact of running in YARN-client mode, but am not yet certain.

As Dennis Huo alluded to, you can access the URLs that are redirecting via the YARN RM proxy on port 8088.
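
To make that concrete, here is a minimal sketch of rewriting a Spark UI URL so it goes through the RM proxy port instead of 4040. The helper name via_rm_proxy is hypothetical. It assumes the resource manager runs on the same host as the driver (the master node in the question) and that the /proxy/application_... path is the same on both ports, which I have not verified for every page.

```python
from urllib.parse import urlsplit, urlunsplit


def via_rm_proxy(spark_ui_url: str, rm_port: int = 8088) -> str:
    """Return the same URL, but pointed at the YARN RM proxy port.

    Hypothetical helper: assumes the RM listens on the same host as the
    Spark driver and that only the port differs from the 4040 URL.
    """
    parts = urlsplit(spark_ui_url)
    # Keep the host, swap only the port (4040 -> 8088 by default).
    netloc = f"{parts.hostname}:{rm_port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


# URL taken from the question; only the port changes.
url = "http://10.128.0.2:4040/proxy/application_1505052986245_0002/static/timeline-view.js"
print(via_rm_proxy(url))
# → http://10.128.0.2:8088/proxy/application_1505052986245_0002/static/timeline-view.js
```

You can then wget the printed URL from the master node in the same way as before.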