Spark Structured Streaming application not restarting correctly


I have a Spark Structured Streaming application (Spark 3.2.2) running on Kubernetes. When the Spark driver restarts (after a crash or a manual shutdown), it often fails with one of these two errors and cannot recover: org.apache.spark.sql.execution.streaming.state.InvalidUnsafeRowException or org.apache.spark.sql.execution.streaming.state.StateSchemaNotCompatible.

The problem seems related to checkpoints being corrupted when the application stops. As a workaround, I tried to implement a shutdown hook that shuts the application down gracefully. What can cause this issue in my Spark application?
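For context, the graceful-shutdown idea amounts to something like the minimal sketch below. It is not the actual application code. The helper name `stop_when_idle` and its parameters are hypothetical, and it only assumes an object that behaves like PySpark's `StreamingQuery` (an `isActive` flag, a `status` dict with an `isTriggerActive` key, and a `stop()` method). It waits until no micro-batch is running before stopping the query. This is best effort: a new trigger could start between the check and the call to `stop()`.

```python
import time


def stop_when_idle(query, poll_seconds=1.0, timeout_seconds=60.0):
    """Stop a streaming query once no trigger is running, or after a timeout.

    `query` is assumed to behave like pyspark.sql.streaming.StreamingQuery:
    it exposes `isActive`, `status` (a dict with "isTriggerActive") and `stop()`.
    Returns True if the query was seen idle before stopping, False otherwise.
    """
    deadline = time.monotonic() + timeout_seconds
    idle = False
    # Poll until the query reports that no micro-batch is in progress.
    while query.isActive and time.monotonic() < deadline:
        if not query.status["isTriggerActive"]:
            idle = True
            break
        time.sleep(poll_seconds)
    # Stop regardless of the outcome so the shutdown itself cannot hang forever.
    if query.isActive:
        query.stop()
    return idle
```

How this gets triggered (a signal handler, a marker file, an external stop request) is left out on purpose, since that depends on how the driver pod is terminated.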
