When I use MapReduce as the execution engine behind Hive, I can point a table at an alternate filesystem backend (instead of my default fs.defaultFS implementation) using syntax similar to:
LOCATION 'protocol://address:port/dir';
I would like to use the Tez execution engine instead of MapReduce, but I can't figure out where to add my shim libraries (jar files) so that Tez recognizes my new protocol.
Which directory do these go in? Do I need to add directives to tez-site.xml?
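For reference, the working MapReduce-side setup looks roughly like this (the table name, columns, and NFS endpoint below are hypothetical placeholders, not the actual DDL):

```sql
-- Hypothetical example: an external table whose data lives on the
-- alternate filesystem, addressed by its protocol scheme.
CREATE EXTERNAL TABLE item (
  id   INT,
  name STRING
)
STORED AS ORC
LOCATION 'nfs://nfs-server:2049/data/item';
```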
Additional input:
Vertex failed, vertexName=Map 6, vertexId=vertex_1504790331090_0003_1_01, diagnostics=[Vertex vertex_1504790331090_0003_1_01 [Map 6] killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: item initializer failed, vertex=vertex_1504790331090_0003_1_01 [Map 6],
java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.nfs.NFSv3FileSystem not found
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2241)
    at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2780)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2793)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:99)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2829)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2811)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:390)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
    at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1227)
    at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getSplits(OrcInputFormat.java:1285)
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:307)
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:409)
    at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:155)
    at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:273).....
I've added my NFS Connector jar files to /usr/hdp//hiveserver2/lib and restarted my Hive servers. I also added the aux path to hive-site.xml:
<property>
  <name>hive.aux.jars.path</name>
  <value>file:///netappnfs/hadoop-nfs-connector-2.0.0.jar</value>
</property>
I think I need to load the class, but I'm not sure how to do so in Hive. In generic Hadoop it's loaded with:
<property>
  <name>fs.AbstractFileSystem.nfs.impl</name>
  <value>org.apache.hadoop.fs.nfs.NFSv3AbstractFilesystem</value>
</property>
Is there any equivalent for Hive?
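As a quick sanity check, the jar and the filesystem classes can also be supplied per session from the Hive CLI or Beeline; this is a hedged sketch, since whether SET is allowed for fs.* keys depends on your hive.conf.restricted.list, but it helps confirm the jar itself resolves before changing cluster config:

```sql
-- Session-level test; the jar path is the one used in hive.aux.jars.path above.
ADD JAR file:///netappnfs/hadoop-nfs-connector-2.0.0.jar;
SET fs.nfs.impl=org.apache.hadoop.fs.nfs.NFSv3FileSystem;
SET fs.AbstractFileSystem.nfs.impl=org.apache.hadoop.fs.nfs.NFSv3AbstractFilesystem;
```

With Tez, ADD JAR also ships the jar to the session's DAG as a local resource, so a query that previously failed with the ClassNotFoundException is a reasonable smoke test here.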
Add the jar details to hive-site.xml (in /etc/hive/conf), then restart Hive.
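To spell that out: the ClassNotFoundException is raised on the Tez side during split generation, so the connector jar must be visible to the Tez AM and containers, not just to HiveServer2. A sketch of the relevant entries follows; the HDFS path and the exact set of files you localize are assumptions to adapt to your cluster:

```xml
<!-- hive-site.xml: make the connector visible to Hive itself -->
<property>
  <name>hive.aux.jars.path</name>
  <value>file:///netappnfs/hadoop-nfs-connector-2.0.0.jar</value>
</property>

<!-- tez-site.xml: localize the jar for the Tez AM and containers
     (the hdfs:// path here is hypothetical) -->
<property>
  <name>tez.aux.uris</name>
  <value>hdfs:///apps/tez/aux/hadoop-nfs-connector-2.0.0.jar</value>
</property>

<!-- core-site.xml: register the filesystem implementations for the nfs:// scheme -->
<property>
  <name>fs.nfs.impl</name>
  <value>org.apache.hadoop.fs.nfs.NFSv3FileSystem</value>
</property>
<property>
  <name>fs.AbstractFileSystem.nfs.impl</name>
  <value>org.apache.hadoop.fs.nfs.NFSv3AbstractFilesystem</value>
</property>
```

After copying the jar to the HDFS location referenced by tez.aux.uris, restart HiveServer2 so both the Hive and Tez sides pick up the changes.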