Changing JDK on cluster deployed with ./spark-ec2


I have deployed an Amazon EC2 cluster with Spark like so:

~/spark-ec2 -k spark -i ~/.ssh/spark.pem -s 2 --region=eu-west-1 --spark-version=1.3.1 launch spark-cluster

I first copy a file I need to the master, and then from the master to HDFS using:

ephemeral-hdfs/bin/hadoop fs -put ~/ANTICOR_2_10000.txt ~/user/root/ANTICOR_2_10000.txt

I have a jar I want to run which was compiled with JDK 8 (I am using a lot of Java 8 features) so I copy it over with scp and run it with:

spark/bin/spark-submit --master spark://public_dns_with_port --class package.name.to.Main job.jar -f hdfs://public_dns:~/ANTICOR_2_10000.txt

The problem is that spark-ec2 provisions the cluster with JDK 7, so the job fails with the error "Unsupported major.minor version 52.0".
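For reference, the number in that message is the class-file major version of the compiled jar. A minimal sketch of how it maps to a Java release, assuming the usual rule that the release equals the major version minus 44 for Java 5 and later (the function name class_major_to_java is only illustrative):

```shell
# Map a class-file major version to the Java release that produces it.
# Assumes major >= 49 (Java 5 and later), where release = major - 44.
class_major_to_java() {
    major="$1"
    if [ "$major" -lt 49 ]; then
        echo "unsupported"
        return 1
    fi
    echo $((major - 44))
}

class_major_to_java 52   # version from the error message, prints 8
class_major_to_java 51   # newest version a Java 7 runtime accepts, prints 7
```

So a runtime that stops at 51 (JDK 7) rejects classes stamped with 52 (JDK 8), which matches the error above.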

My question is: in which places do I need to change JDK 7 to JDK 8?

The steps I am doing thus far on master are:

  • Install JDK8 with yum
  • Use sudo alternatives --config java and change the preferred java to java-8
  • export JAVA_HOME=/usr/lib/jvm/openjdk-8

Do I have to do that on all the nodes? Also, do I need to change the Java path that Hadoop uses in ephemeral-hdfs/conf/hadoop-env.sh, or are there any other spots I missed?
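If the steps above do have to be repeated on every node, one way to avoid doing it by hand is to generate one ssh command per worker and review the list before running it. This is only a sketch under assumptions: the worker list is taken to be at ~/spark/conf/slaves, the package name java-1.8.0-openjdk-devel is assumed to be available in the yum repositories, and the function name print_jdk_install_commands is made up for illustration.

```shell
# Hypothetical defaults; adjust both to the actual cluster layout.
SLAVES_FILE="${SLAVES_FILE:-$HOME/spark/conf/slaves}"
JDK_PACKAGE="${JDK_PACKAGE:-java-1.8.0-openjdk-devel}"

# Print one ssh command per worker listed in the given file.
# Blank lines and lines starting with '#' are skipped.
# Nothing is executed on the remote hosts by this function.
print_jdk_install_commands() {
    slaves_file="$1"
    package="$2"
    while IFS= read -r host || [ -n "$host" ]; do
        case "$host" in
            ''|'#'*) continue ;;
        esac
        printf 'ssh -t %s "sudo yum install -y %s"\n' "$host" "$package"
    done < "$slaves_file"
}

# Usage (dry run first, then pipe to sh once the list looks right):
#   print_jdk_install_commands "$SLAVES_FILE" "$JDK_PACKAGE"
#   print_jdk_install_commands "$SLAVES_FILE" "$JDK_PACKAGE" | sh
```

Note that alternatives --config java is interactive, so it does not fit well into a loop like this; selecting the Java 8 binary on each node would need a non-interactive equivalent.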


There are 2 best solutions below


Unfortunately, Amazon doesn't offer Java 8 installations out of the box yet; see the available versions.

Have you seen this post on how to install it on running instances?


Here is what I do for any Java installation that differs from the version provided by the default installation:

  1. Configure the JAVA_HOME environment variable on each machine/node:

    export JAVA_HOME=/home/ec2-user/softwares/jdk1.7.0_25

  2. Modify the default PATH so that the JDK's "bin" directory comes before the rest of the PATH on all nodes/machines:

    export PATH=/home/ec2-user/softwares/jdk1.7.0_25/bin/:$M2:$SCALA_HOME/bin/:$HIVE_HOME/bin/:$PATH

Both steps must be done as the same OS user that runs/owns the Spark master and worker processes.
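To make the two exports survive a new login, they can be appended to a startup or env file of that user. A minimal sketch, assuming a file such as ~/.bash_profile is sourced for that user; the helper name add_java_home is hypothetical, and the JDK path in the usage comment is the one from the question, not a verified location:

```shell
# Append a JAVA_HOME export and a PATH prefix to the given file,
# unless the file already contains a JAVA_HOME export.
add_java_home() {
    target_file="$1"
    java_home="$2"
    if [ -f "$target_file" ] && grep -q '^export JAVA_HOME=' "$target_file"; then
        echo "JAVA_HOME already set in $target_file, leaving it alone"
        return 0
    fi
    {
        printf 'export JAVA_HOME=%s\n' "$java_home"
        # Written literally so it is expanded when the file is sourced.
        printf 'export PATH=$JAVA_HOME/bin:$PATH\n'
    } >> "$target_file"
}

# Usage (paths are assumptions, adjust to the actual install):
#   add_java_home "$HOME/.bash_profile" /usr/lib/jvm/openjdk-8
#   java -version   # run in a fresh login shell to confirm which JDK is picked up
```

The same helper could be pointed at other env files that export JAVA_HOME, but whether Spark or Hadoop on this cluster read such a file is something to check rather than assume.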