I tried to configure Impala to run on top of Alluxio, but it failed.
Here are the Impala configurations:
/etc/impala/conf/core-site.xml (http://www.alluxio.org/docs/1.6/en/Running-Hadoop-MapReduce-on-Alluxio.html):
<configuration>
  <property>
    <name>fs.alluxio.impl</name>
    <value>alluxio.hadoop.FileSystem</value>
    <description>The Alluxio FileSystem (Hadoop 1.x and 2.x)</description>
  </property>
  <property>
    <name>fs.AbstractFileSystem.alluxio.impl</name>
    <value>alluxio.hadoop.AlluxioFileSystem</value>
    <description>The Alluxio AbstractFileSystem (Hadoop 2.x)</description>
  </property>
</configuration>
/etc/impala/conf/hive-site.xml (http://www.alluxio.org/docs/1.6/en/Running-Hive-with-Alluxio.html):
<property>
  <name>fs.defaultFS</name>
  <value>alluxio://master_hostname:port</value>
</property>
Then I started Impala (impala-server, impala-catalogd, impala-state-store), but found this in the log:
...impala-server.cc:282] Currently configured default file system: FileSystem. fs.defaultFS (alluxio://192.168.1.10:19998/) is not supported.
...impala-server.cc:285] Aborting Impala Server startup due to improper configuration. Impalad exiting.
I have searched a lot on Bing but had no luck. There are very few results even for the search keywords 'impala on alluxio'. So can Impala run on top of Alluxio? Any suggestions would be appreciated.
My Impala version: 2.10.0-cdh5.13.0 RELEASE, Alluxio version: alluxio-1.8.0-hadoop-2.7
Have you tried running Hive with external tables on Alluxio? Instead of setting Alluxio as defaultFS, remove that setting
and use something like the following to create a table on Alluxio:
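For illustration, here is a minimal sketch of such a statement. The table name `alluxio_demo`, the columns, the delimiter, and the `/path/to/data` directory are all hypothetical; the master address is the one shown in the log in the question, so adjust it to your cluster.

```sql
-- Hypothetical table name, columns, and data path; replace with your own.
-- The Alluxio master address (192.168.1.10:19998) is taken from the log in the question.
CREATE EXTERNAL TABLE alluxio_demo (
  id INT,
  name STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
-- Only this table points at Alluxio; fs.defaultFS stays on its usual filesystem.
LOCATION 'alluxio://192.168.1.10:19998/path/to/data';
```

If the table is created through Hive rather than Impala, Impala will typically need an `INVALIDATE METADATA alluxio_demo;` before the table becomes visible to it.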
That might help work around Impala's filesystem implementation check. Also, there is a bug in CDH 5.13 and earlier that prevents Impala from reading data in Alluxio. You might want to upgrade to CDH 5.14, which fixed that issue.