Can Impala run on top of Alluxio?

I tried to configure Impala to run on top of Alluxio, but it failed.

Here are the Impala configurations:

/etc/impala/conf/core-site.xml (following http://www.alluxio.org/docs/1.6/en/Running-Hadoop-MapReduce-on-Alluxio.html)

<configuration>
<property>
  <name>fs.alluxio.impl</name>
  <value>alluxio.hadoop.FileSystem</value>
  <description>The Alluxio FileSystem (Hadoop 1.x and 2.x)</description>
</property>
<property>
  <name>fs.AbstractFileSystem.alluxio.impl</name>
  <value>alluxio.hadoop.AlluxioFileSystem</value>
  <description>The Alluxio AbstractFileSystem (Hadoop 2.x)</description>
</property>
</configuration>

/etc/impala/conf/hive-site.xml (following http://www.alluxio.org/docs/1.6/en/Running-Hive-with-Alluxio.html)

<property>
   <name>fs.defaultFS</name>
   <value>alluxio://master_hostname:port</value>
</property>

Then I started Impala (impala-server, impala-catalogd, impala-state-store), but found this in the log:

...impala-server.cc:282] Currently configured default file system: FileSystem. fs.defaultFS (alluxio://192.168.1.10:19998/) is not supported.
...impala-server.cc:285] Aborting Impala Server startup due to improper configuration. Impalad exiting.

I have searched a lot on Bing but had no luck; there are very few results for the keywords 'impala on alluxio'. So, can Impala run on top of Alluxio? Any suggestions would be appreciated.

My Impala version: 2.10.0-cdh5.13.0 RELEASE, Alluxio version: alluxio-1.8.0-hadoop-2.7

There are 1 best solutions below

Have you tried running Hive with external tables on Alluxio? Instead of setting Alluxio as fs.defaultFS, remove the following property from hive-site.xml:

<property>
   <name>fs.defaultFS</name>
   <value>alluxio://master_hostname:port</value>
</property>

and use something like the following to create a table on Alluxio:

hive> CREATE TABLE u_user (
userid INT,
age INT,
gender CHAR(1),
occupation STRING,
zipcode STRING)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '|'
LOCATION 'alluxio://master_hostname:port/table_path';

That might help work around Impala's filesystem implementation check. Also, there is a bug in CDH 5.13 and below that prevents Impala from reading data in Alluxio. You might want to upgrade to CDH 5.14, which fixed that issue.
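
A table created through Hive is generally not visible to Impala until Impala reloads its metadata. As a rough sketch of the Impala side, assuming the u_user table above was created in the default database and that your CDH version is able to read from Alluxio, you could try something like this in impala-shell:

```sql
-- Run in impala-shell after the table has been created through Hive.
-- INVALIDATE METADATA makes Impala pick up a table that was created outside of Impala.
INVALIDATE METADATA u_user;

-- A simple scan to check whether Impala can actually read the files stored in Alluxio.
SELECT COUNT(*) FROM u_user;
```

If the count query fails while the same query works in Hive, the problem is more likely on the Impala side (for example the CDH bug mentioned above) than in the table definition.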