Using Akubra-HDFS with Cloudera CDH4


I am experimenting with Akubra-HDFS as the low-level storage for the Fedora Commons server. I set up Akubra-HDFS following a procedure similar to the iRODS installation. The Fedora server worked with Hadoop (version 1.0.4) as its storage. However, I had trouble using the Akubra-HDFS library with either Cloudera CDH4 or Apache Hadoop 2.0.3-alpha, the High Availability (HA) distributions. I wanted to share my findings.




As Akubra-HDFS is a new, experimental library, there are not many resources about it on the internet. I had to work out the solution by experimenting with different dependency jars.

Follow the instructions at

Instead of the jars listed there for Hadoop 1.x.x, copy the following jar files from the CDH4 installation folder to the Fedora lib folder.

  1. commons-cli-1.2.jar
  2. commons-configuration-1.6.jar
  3. commons-lang-2.4.jar
    • google-collections-1.0-rc2.jar (This has to be removed from the Fedora lib folder.)
  4. guava-11.0.2.jar (This is a newer version of google-collections)
  5. hadoop-auth-2.0.0-cdh4.1.3.jar
  6. hadoop-common-2.0.0-cdh4.1.3.jar
  7. hadoop-hdfs-2.0.0-cdh4.1.3.jar
  8. protobuf-java-2.4.0a.jar
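
The copy-and-remove step above can be scripted. The following is only a minimal sketch: the function name `sync_jars` and both directory arguments are hypothetical, and it assumes the jars may sit in different subfolders of the CDH4 installation, so it searches recursively. The jar names are taken from the list above; adjust the version numbers to match your CDH4 release.

```python
import shutil
from pathlib import Path

# Jar names copied from the list above (CDH 4.1.3 versions).
REQUIRED_JARS = [
    "commons-cli-1.2.jar",
    "commons-configuration-1.6.jar",
    "commons-lang-2.4.jar",
    "guava-11.0.2.jar",
    "hadoop-auth-2.0.0-cdh4.1.3.jar",
    "hadoop-common-2.0.0-cdh4.1.3.jar",
    "hadoop-hdfs-2.0.0-cdh4.1.3.jar",
    "protobuf-java-2.4.0a.jar",
]

# Jar that has to be removed from the Fedora lib folder.
CONFLICTING_JARS = ["google-collections-1.0-rc2.jar"]


def sync_jars(cdh_root, fedora_lib):
    """Copy the required jars into fedora_lib and delete the conflicting one.

    Both paths are supplied by the caller (hypothetical locations).
    Returns three lists of file names: (copied, missing, removed).
    """
    cdh_root, fedora_lib = Path(cdh_root), Path(fedora_lib)
    copied, missing, removed = [], [], []

    for name in REQUIRED_JARS:
        # Search the whole installation tree; take the first match by path order.
        matches = sorted(p for p in cdh_root.rglob(name) if p.is_file())
        if not matches:
            missing.append(name)
            continue
        shutil.copy2(matches[0], fedora_lib / name)
        copied.append(name)

    for name in CONFLICTING_JARS:
        target = fedora_lib / name
        if target.exists():
            target.unlink()
            removed.append(name)

    return copied, missing, removed


# Example call with placeholder paths (replace with your own):
# copied, missing, removed = sync_jars("/path/to/cdh4", "/path/to/fedora/lib")
```

Any name reported in `missing` was not found under the CDH4 folder and needs to be located by hand. Restart the Fedora server after changing the lib folder so the new jars are picked up.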