spark-ec2 and Tachyon hadoop version disparity


I'm trying to use spark-ec2 to launch an EC2 cluster with Hadoop version 2.x, so I ran:

./spark-ec2 -k spark -i ~/.ssh/spark.pem -s 1 --hadoop-major-version=2 launch my-spark-cluster

Then I found that the Tachyon setup step fails with an error:

Setting up tachyon
RSYNC'ing /root/tachyon to slaves...
ec2-52-1-147-16.compute-1.amazonaws.com
ec2-52-1-147-16.compute-1.amazonaws.com: Formatting Tachyon Worker @ ip-172-31-21-86.ec2.internal
ec2-52-1-147-16.compute-1.amazonaws.com: Removing local data under folder: /mnt/ramdisk/tachyonworker/
Formatting Tachyon Master @ ec2-52-1-14-186.compute-1.amazonaws.com
Formatting JOURNAL_FOLDER: /root/tachyon/libexec/../journal/
Exception in thread "main" java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException: Server IPC version 7 cannot communicate with client version 4
    at tachyon.util.CommonUtils.runtimeException(CommonUtils.java:246)
    at tachyon.UnderFileSystemHdfs.<init>(UnderFileSystemHdfs.java:73)
    at tachyon.UnderFileSystemHdfs.getClient(UnderFileSystemHdfs.java:53)
    at tachyon.UnderFileSystem.get(UnderFileSystem.java:53)
    at tachyon.Format.main(Format.java:54)
Caused by: org.apache.hadoop.ipc.RemoteException: Server IPC version 7 cannot communicate with client version 4
    at org.apache.hadoop.ipc.Client.call(Client.java:1070)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
    at com.sun.proxy.$Proxy1.getProtocolVersion(Unknown Source)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
    at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:119)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:238)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:203)
    at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:187)
    at tachyon.UnderFileSystemHdfs.<init>(UnderFileSystemHdfs.java:69)
    ... 3 more

I've searched for related questions, and it seems that Server IPC version 7 cannot communicate with client version 4 means the server is running Hadoop 2.x while the client is using Hadoop 1.x. However, I built my Spark against Hadoop 2.4.0, and I also tried the official Spark pre-built package for Hadoop 2.4.0 and later; both lead to the same error.
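For reference, I built Spark against Hadoop 2.4.0 roughly like this (assuming the standard hadoop-2.4 Maven profile; adjust if your build differs):

mvn -Phadoop-2.4 -Dhadoop.version=2.4.0 -DskipTests clean package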

By the way, the Hadoop version installed by --hadoop-major-version=2 is Hadoop 2.0.0-cdh4.2.0. Could that be the problem? I also tried passing 2.4 or 2.4.0 for that option, but neither is recognized as a valid Hadoop version.
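In case it helps, this is how I checked the Hadoop version on the cluster (the path is my guess based on where spark-ec2 lays out the ephemeral HDFS install):

# on the cluster master
/root/ephemeral-hdfs/bin/hadoop version
# prints: Hadoop 2.0.0-cdh4.2.0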
