pyspark Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient


I'm new to Spark and have tried other solutions from Stack Overflow, but with no luck.

I installed Spark 3.1.2 and added a few configuration entries (in spark/conf/spark-defaults.conf) to point to an AWS RDS MySQL instance as a remote metastore:

spark.jars.packages com.amazonaws:aws-java-sdk:1.12.63,org.apache.hadoop:hadoop-aws:3.2.0
spark.jars /home/newdependencies/jtds-1.3.1.jar, /home/newdependencies/mysql-connector-java-6.0.6.jar, /home/newdependencies/postgresql-42.2.20.jar
spark.hadoop.javax.jdo.option.ConnectionURL jdbc:mysql://testhivemetastore.asdfasfar.us-west-2.rds.amazonaws.com:3306/metastore
spark.hadoop.javax.jdo.option.ConnectionUserName username
spark.hadoop.javax.jdo.option.ConnectionPassword password
spark.hadoop.javax.jdo.option.ConnectionDriverName com.mysql.jdbc.Driver
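As an aside, the same metastore settings can also be supplied through conf/hive-site.xml, which Hive's client classes read directly; there the spark.hadoop. prefix is dropped. A minimal sketch, reusing the same RDS endpoint and placeholder credentials as above:

```
<!-- conf/hive-site.xml: same metastore settings as in spark-defaults.conf,
     without the spark.hadoop. prefix (sketch; values are placeholders) -->
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://testhivemetastore.asdfasfar.us-west-2.rds.amazonaws.com:3306/metastore</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>username</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>password</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
</configuration>
```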

The error appears when I try to run "show databases":

import os.path, sys
# Add the parent directory to the module search path
# (note: __file__ without quotes; realpath('__file__') resolves a literal string)
sys.path.append(os.path.join(os.path.dirname(os.path.realpath(__file__)), os.pardir))

import findspark
findspark.init()
import pyspark

# Build a Hive-enabled session, picking up spark-defaults.conf
sp = pyspark.sql.SparkSession.builder.enableHiveSupport().appName("Test spark configurations").getOrCreate()
sp.sql('show databases').show()

Error: org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
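In my experience this exception often wraps an underlying JDBC failure, so one quick sanity check is whether the RDS endpoint is even reachable from the Spark machine. A minimal sketch (the helper names `parse_jdbc_mysql_url` and `metastore_reachable` are my own, not part of any library):

```python
import re
import socket

def parse_jdbc_mysql_url(url):
    """Extract (host, port, database) from a jdbc:mysql:// URL."""
    m = re.match(r"jdbc:mysql://([^:/]+):(\d+)/([^?]+)", url)
    if not m:
        raise ValueError(f"Not a recognised jdbc:mysql URL: {url}")
    return m.group(1), int(m.group(2)), m.group(3)

def metastore_reachable(host, port, timeout=5):
    """Return True if a TCP connection to the metastore DB can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Running `metastore_reachable(*parse_jdbc_mysql_url(url)[:2])` against the ConnectionURL from spark-defaults.conf would at least separate a networking/security-group problem from a Hive classpath problem.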

FYI - I didn't install Hadoop or Hive separately (I don't know whether that's mandatory).
