Error when using Pig to load data from HBase in Hue


My CDH version is 5.1.2, my HBase version is 0.98.1, and my Hue version is 3.6.0. I executed this Pig script in Hue to load data from HBase:

c = LOAD 'hbase://analyze_block_v1' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('d:*', '-loadKey true');
dump c;

I got this error:

ERROR org.apache.pig.tools.grunt.Grunt  - ERROR 2998: Unhandled internal error. org/apache/hadoop/hbase/mapreduce/TableInputFormat.

Following a guide I found on Google, I uploaded all the hbase*.jar files from the HBase library folder to /user/oozie/share/lib/lib_20140822104613/pig and added the following statement to the top of the script:

set hbase.zookeeper.quorum 'localhost' 

I still got the same error.

Searching further, I found the guide at GetHue and the guide on the Cloudera website.

Those guides told me that I must add these statements to the top of the Pig script:

register /usr/lib/zookeeper/zookeeper-<ZooKeeper_version>-cdh<CDH_version>.jar

register /usr/lib/hbase/hbase-<HBase_version>-cdh<CDH_version>-security.jar

set hbase.zookeeper.quorum 'localhost'

The problem is that I can't find hbase-0.98.1-cdh5.1.2-security.jar in my HBase folder or library. I also downloaded the package from this link, http://www.cloudera.com/content/cloudera/en/downloads/cdh/cdh-5-1-2.html, and checked it: there is no hbase-0.98.1-cdh5.1.2-security.jar in the hbase folder. I then tried downloading the older version, hbase-0.94.6-cdh4.5.0, and there I can see the file hbase-0.94.6-cdh4.5.0-security.jar in the folder.

It seems the security jar is no longer shipped in the newer version. I guess the first guide I found didn't help because this file is missing.

What should I do to fix the error?


Thanks to Romain's support, I got it working. Here are the details:

First, upload these files from the HBase library folder to HDFS. In my case I uploaded them to /user/oozie/share/lib/lib_20140822104613/pig:

zookeeper.jar

hbase-server-0.98.1-cdh5.1.2.jar
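The upload step could be sketched roughly as below. This is only an illustration: the local folder /usr/lib/hbase/lib is an assumption (check where your HBase library folder actually is), the helper name upload_cmds is made up for this example, and the lib_<timestamp> part of the HDFS path will differ on other clusters. The function only prints the hdfs dfs -put commands so they can be reviewed before anything is copied.

```shell
#!/bin/sh
# Sketch only. LOCAL_LIB is an ASSUMED location of the HBase library folder.
LOCAL_LIB="${LOCAL_LIB:-/usr/lib/hbase/lib}"
# HDFS target as given in the post; the lib_<timestamp> part is cluster-specific.
HDFS_DIR="${HDFS_DIR:-/user/oozie/share/lib/lib_20140822104613/pig}"

# Print one "hdfs dfs -put" command per jar (hypothetical helper, dry run only).
upload_cmds() {
    for jar in zookeeper.jar hbase-server-0.98.1-cdh5.1.2.jar; do
        echo "hdfs dfs -put $LOCAL_LIB/$jar $HDFS_DIR/"
    done
}

upload_cmds
```

Once the printed commands look right, they can be run by piping the output to a shell (upload_cmds | sh) as a user with write access to the Oozie share lib.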

If you use the Hue Pig Editor:

Add the jars as resources under 'Properties' --> 'Resources' of the script, then put these statements at the top of the script:

REGISTER ./hbase-server-0.98.1-cdh5.1.2.jar;

REGISTER ./zookeeper.jar;

Run the script and it will work. In my case it works even without the REGISTER statements; I think adding the jar files under 'Properties' --> 'Resources' is enough.

If you execute the Pig script through a Hue workflow:

Add a File Path for each jar file, then submit the workflow, and it will work. The REGISTER and set statements are not needed either.

Best answer

The doc is indeed not up to date, but could you just try including a jar that contains the missing class?

e.g.

find /usr/lib/hbase/ -name '*.jar' -exec grep -Hls TableInputFormat {} \;
/usr/lib/hbase/hbase-server-tests.jar
/usr/lib/hbase/hbase-server-0.98.6-cdh5.4.0-SNAPSHOT.jar
/usr/lib/hbase/hbase-server-0.98.6-cdh5.4.0-SNAPSHOT-tests.jar
/usr/lib/hbase/hbase-server.jar
/usr/lib/hbase/lib/hbase-server-0.98.6-cdh5.4.0-SNAPSHOT.jar
/usr/lib/hbase/lib/hbase-server-0.98.6-cdh5.4.0-SNAPSHOT-tests.jar
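The same search can be narrowed by using the full class path from the ERROR 2998 message instead of the bare class name. Below is a small sketch; the function name jars_with_class is made up for this example. It relies on the fact that entry names are stored uncompressed in a jar's zip headers, so a fixed-string grep for the .class path can find them.

```shell
#!/bin/sh
# List jars under a directory whose archive index mentions the given class.
#   $1: directory to search
#   $2: class path as printed in the Pig error message (slashes, no ".class")
# jars_with_class is a hypothetical helper name, not part of any tool.
jars_with_class() {
    find "$1" -name '*.jar' -type f -exec grep -l -s -F -- "$2.class" {} +
}

# Example call (the directory is the one used in the answer above):
# jars_with_class /usr/lib/hbase org/apache/hadoop/hbase/mapreduce/TableInputFormat
```

Any jar it prints is a candidate to upload and register; the -tests jars are probably not the ones you want.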