How do I save spark.writeStream results in hive?

933 Views Asked by Divya At 28 July 2025 at 15:09

I am using spark.readStream to read data from Kafka and running an explode on the resulting dataframe. I am trying to save the result of the explode in a Hive table and I am not able to find any solution for that. I tried the following method but it doesn't work (it runs but I don't see any new partitions created)

val query = tradelines.writeStream.outputMode("append")
  .format("memory")
  .option("truncate", "false")
  .option("checkpointLocation", checkpointLocation)
  .queryName("tl")
  .start() 

sc.sql("set hive.exec.dynamic.partition.mode=nonstrict;")

sc.sql("INSERT INTO TABLE default.tradelines PARTITION (dt) SELECT * FROM tl")

Original Q&A

There are 1 best solutions below

OneCricketeer On 19 December 2017 at 19:56

Check HDFS for the dt partitions on the file system

You need to run MSCK REPAIR TABLE on the hive table to see new partitions.

If you aren't doing anything special with Spark, then it's worth pointing out that Kafka Connect HDFS is capable of registering Hive partitions directly from Kafka.

How do I save spark.writeStream results in hive?

There are 1 best solutions below

Related Questions in APACHE-SPARK

Related Questions in APACHE-KAFKA

Related Questions in SPARK-STRUCTURED-STREAMING

Related Questions in SPARK-HIVE

Trending Questions

Popular # Hahtags

Popular Questions