.format("org.apache.phoenix.spark") vs .format("jdbc")

48 Views Asked by Mahadi Siregar At 29 November 2023 at 10:11

What is the difference between using .format("org.apache.phoenix.spark") and .format("jdbc") when loading an HBase table (through Phoenix) into a Spark DataFrame?

// Via the Phoenix Spark connector
val tracesDF = spark.sqlContext.read
  .format("org.apache.phoenix.spark")
  .option("table", hbaseTblName)
  .option("zkUrl", appConf.getString("zookeeper_url"))
  .load()

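// Via the generic JDBC source with the Phoenix driver
val tracesDF = spark.sqlContext.read
  .format("jdbc")
  .option("driver", "org.apache.phoenix.jdbc.PhoenixDriver")
  .option("url", appConf.getString("hbasedb_url"))
  .option("dbtable", hbaseTblName) // the JDBC source needs a table or query
  .load()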

I also ran into two related issues:

I create the HBase table through a JDBC statement: hbaseCon.createStatement().execute("CREATE TABLE ...")
The DataFrame loaded with .format("org.apache.phoenix.spark") is empty, while the one loaded with .format("jdbc") returns the data properly.
With .format("org.apache.phoenix.spark") I need to specify the column family when selecting [tracesDF.select(...,"`B.SAMPLES_BINARY`")], but not with .format("jdbc") [tracesDF.select(...,"SAMPLES_BINARY")].
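
To make the column-name difference concrete, here is a minimal sketch of how the two naming styles could be reconciled. It does not explain why the connector behaves this way. PhoenixColumnNames, stripFamily and quoted are hypothetical helper names, not part of Phoenix or Spark, and the sketch assumes the connector exposes such columns as "FAMILY.COLUMN", as the select above suggests.

```scala
// Hypothetical helpers, not part of Phoenix or Spark.
object PhoenixColumnNames {
  // "B.SAMPLES_BINARY" -> "SAMPLES_BINARY"; a name without a family
  // prefix (no dot) is returned unchanged.
  def stripFamily(name: String): String =
    name.substring(name.lastIndexOf('.') + 1)

  // Wrap a name in backticks so Spark treats an embedded dot as part
  // of the column name rather than as a struct-field accessor.
  def quoted(name: String): String = "`" + name + "`"
}

// Possible use on a DataFrame loaded through the Phoenix connector
// (assumes the stripped names do not collide across column families):
//   val renamed = tracesDF.toDF(tracesDF.columns.map(PhoenixColumnNames.stripFamily): _*)
//   renamed.select("SAMPLES_BINARY")
```

With the names stripped this way, the same select(...) call should work against either DataFrame, as long as no two column families contain a column with the same name.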
