Error running tez in hive. Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.tez.TezTask

1k Views Asked by At
  • Hadoop 3.3.5
  • Hive 3.1.3
  • Tez 0.10.2

I follow the instruction in this link to build tez 0.10.2 for hadoop 3.3.5: https://tez.apache.org/install.html

The db is stored on s3 bucket and I am able to run 'select count(*) from m1.t1' using hive.execution.engine=mr.

When I set hive.execution.engine=tez, and run the same query, I got this error immediately:

2023-02-15T21:21:09,208  INFO [a6e2cd1a-b2c9-42d8-9568-8e0b64677f77 main] client.TezClient: App did not succeed. Diagnostics: Application application_1676506240754_0019 failed 2 times due to AM Contai
ner for appattempt_1676506240754_0019_000002 exited with  exitCode: 1
Failing this attempt.Diagnostics: [2023-02-15 21:21:08.730]Exception from container-launch.
Container id: container_1676506240754_0019_02_000001
Exit code: 1
[2023-02-15 21:21:08.732]Container exited with a non-zero exit code 1. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
Error: Could not find or load main class org.apache.tez.dag.app.DAGAppMaster
[2023-02-15 21:21:08.733]Container exited with a non-zero exit code 1. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
Error: Could not find or load main class org.apache.tez.dag.app.DAGAppMaster

If I set tez.use.cluster.hadoop-libs to true in tez-site.xml, I got YARN running but failed with load aws credential error even I have set the fs.s3a credentials in hadoop's core-site.xml, hive's hive-site.xml and .bashrc environment variables.

keys are masked to show sample only:

echo $AWS_ACCESS_KEY_ID
I9U996400005XXXXXXXX
echo $AWS_SECRET_KEY
mPY8GiU6NegNWoVnaODXXXXXXXXXXXXXXXXXXXX

hive> set hive.execution.engine=tez;
hive> select count(*) from m1.t1;
Query ID = hdp-user_20230215210146_62ed9fab-5d4a-42a9-bf54-5fb6f84a9048
Total jobs = 1
Launching Job 1 out of 1
Status: Running (Executing on YARN cluster with App id application_1676506240754_0015)

----------------------------------------------------------------------------------------------
        VERTICES      MODE        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
----------------------------------------------------------------------------------------------
Map 1            container  INITIALIZING     -1          0        0       -1       0       0
Reducer 2        container        INITED      1          0        0        1       0       0
----------------------------------------------------------------------------------------------
VERTICES: 00/02  [>>--------------------------] 0%    ELAPSED TIME: 2.03 s
----------------------------------------------------------------------------------------------
Status: Failed
Vertex failed, vertexName=Map 1, vertexId=vertex_1676506240754_0015_3_00, diagnostics=[Vertex vertex_1676506240754_0015_3_00 [Map 1] killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: t1 initializer failed, vertex=vertex_1676506240754_0015_3_00 [Map 1], java.nio.file.AccessDeniedException: s3a://hadoop-cluster/warehouse/tablespace/managed/hive/m1.db/t1: org.apache.hadoop.fs.s3a.auth.NoAuthWithAWSException: No AWS Credentials provided by TemporaryAWSCredentialsProvider SimpleAWSCredentialsProvider EnvironmentVariableCredentialsProvider IAMInstanceCredentialsProvider : com.amazonaws.SdkClientException: Unable to load AWS credentials from environment variables (AWS_ACCESS_KEY_ID (or AWS_ACCESS_KEY) and AWS_SECRET_KEY (or AWS_SECRET_ACCESS_KEY))

Tried to add all fs.s3a properties from core-site.xml to tez-site.xml and set fs,s3a,access.key and set fs.s3a.secret.key= inside hive session but still get same error.

org.apache.hadoop.fs.s3a.auth.NoAuthWithAWSException: No AWS Credentials provided by TemporaryAWSCredentialsProvider SimpleAWSCredentialsProvider EnvironmentVariableCredentialsProvider IAMInstanceCredentialsProvider : com.amazonaws.SdkClientException: Unable to load AWS credentials from environment variables (AWS_ACCESS_KEY_ID (or AWS_ACCESS_KEY) and AWS_SECRET_KEY (or AWS_SECRET_ACCESS_KEY))

Question: according to tez install instruction

Ensure tez.use.cluster.hadoop-libs is not set in tez-site.xml, or if it is set, the value should be false

But when set to false, tez could not run. When set to true, I got aws credential error even though I set them in every possible location or environment variables.

========================================================== Update: Not sure if this is the right answer to this problem but I finally got it working by adding this property to hive-site.xml

  <property>
    <name>hive.conf.hidden.list</name>
    <value>javax.jdo.option.ConnectionPassword,hive.server2.keystore.password,fs.s3a.proxy.password,dfs.adls.oauth2.credential,fs.adl.oauth2.credential</value>
</property>

Default all fs.s3a credential are hidden config even you don't set this property. I explicitly add this property and remove all fs.s3a credential related from the value. Now, I can run select count(*) with tez.

1

There are 1 best solutions below

0
Sercan On

I got YARN running but failed with load aws credential

Hive removes the properties in hive.conf.hidden.list from job configuration before submitting any job to YARN. So, in your case, the missing properties fs.s3a.access.key and fs.s3a.secret.key cause No AWS Credentials provided.

Please see S3 filesystem related properties under hive.conf.hidden.list here;

+ ",fs.s3.awsAccessKeyId"
+ ",fs.s3.awsSecretAccessKey"
+ ",fs.s3n.awsAccessKeyId"
+ ",fs.s3n.awsSecretAccessKey"
+ ",fs.s3a.access.key"
+ ",fs.s3a.secret.key"
+ ",fs.s3a.proxy.password"

Your update solves the issue, and I think it is the correct solution. But, just for you to know, now you are exposing the S3 key passwords in various logs files. Some of the files that I know are as below;

Hive -> <HIVE_HOME>/logs/<user>/webhcat/webhcat.log.<date>

Hadoop -> <HADOOP_HOME>/logs/userlogs/application_<app#>/container_<container#>/history.txt.appattempt_<app#>

If you have access to source code, you can modify this method not to yield the mentioned properties in hive logs.