I get a runtime error when I try to create a Kudu table from a DataFrame schema in Spark (Scala):
Exception in thread "main" java.lang.IllegalArgumentException: No support for Spark SQL type DateType
    at org.apache.kudu.spark.kudu.SparkUtil$.sparkTypeToKuduType(SparkUtil.scala:81)
    at org.apache.kudu.spark.kudu.SparkUtil$.org$apache$kudu$spark$kudu$SparkUtil$$createColumnSchema(SparkUtil.scala:134)
    at org.apache.kudu.spark.kudu.SparkUtil$$anonfun$kuduSchema$3.apply(SparkUtil.scala:120)
    at org.apache.kudu.spark.kudu.SparkUtil$$anonfun$kuduSchema$3.apply(SparkUtil.scala:119)
    at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
    at org.apache.kudu.spark.kudu.SparkUtil$.kuduSchema(SparkUtil.scala:119)
    at org.apache.kudu.spark.kudu.KuduContext.createSchema(KuduContext.scala:234)
    at org.apache.kudu.spark.kudu.KuduContext.createTable(KuduContext.scala:210)
The code looks like this:
import org.apache.kudu.client.CreateTableOptions
import org.apache.spark.sql.types._
import scala.collection.JavaConverters._

val invoicesSchema = StructType(
  List(
    StructField("id", StringType, false),
    StructField("invoicenumber", StringType, false),
    StructField("invoicedate", DateType, true)
  ))

kuduContext.createTable("invoices", invoicesSchema, Seq("id", "invoicenumber"),
  new CreateTableOptions().setNumReplicas(3).addHashPartitions(List("id").asJava, 6))
How can I use DateType here? StringType and FloatType do not cause this problem in the same code.
Here is a workaround. The example needs tailoring to your case, but I think it gives you the gist of what you need to know:
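One way to sketch such a workaround, assuming it is acceptable to store the date as an ISO-formatted string in Kudu: swap every DateType field in the schema for StringType before calling createTable, and cast the matching DataFrame columns the same way before writing. The helper names `withoutDateType` and `castDatesToString` are hypothetical and not part of kudu-spark. Treat this as an illustration under those assumptions, not the definitive fix.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types._

// Hypothetical helper: replace each DateType field with StringType so the
// schema no longer contains the type that sparkTypeToKuduType rejects.
def withoutDateType(schema: StructType): StructType =
  StructType(schema.fields.map {
    case f if f.dataType == DateType => f.copy(dataType = StringType)
    case f                           => f
  })

// Hypothetical helper: cast every DateType column of a DataFrame to a string
// (Spark renders dates as yyyy-MM-dd) so the data matches the adjusted schema.
def castDatesToString(df: DataFrame): DataFrame =
  df.schema.fields.foldLeft(df) {
    case (acc, f) if f.dataType == DateType =>
      acc.withColumn(f.name, col(f.name).cast(StringType))
    case (acc, _) => acc
  }
```

With these in place, the table from the question could be created by passing `withoutDateType(invoicesSchema)` to `kuduContext.createTable` instead of `invoicesSchema`, and the data written after `castDatesToString(df)`. When reading back, Spark's `to_date` function can turn the string column into a date again.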