Is it possible to use an Option[_] member, e.g. Option[Int], in a case class used with the Dataset API?
I have not been able to find an example. This can probably be done with a custom encoder (mapping?), but I could not find an example of that either.
This might be achievable using the Frameless library: https://github.com/adelbertc/frameless but there should be an easy way to do it with the base Spark libraries.
Update
I am using: "org.apache.spark" %% "spark-core" % "1.6.1"
And getting the following error when trying to use an Option[Int]:
Unable to find encoder for type stored in a Dataset. Primitive types (Int, String, etc) and Product types (case classes) are supported by importing sqlContext.implicits._ Support for serializing other types will be added in future releases
Solution update
Because I was prototyping, I had declared the case class inside the function, just before the conversion to the Dataset (in my case, inside object Main {).
Option types worked just fine once I moved the case class outside of the Main function.
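For reference, here is a minimal sketch of the layout that worked. The names Record, id and score are made up for illustration, and it assumes the spark-sql module for 1.6.1 is on the classpath alongside spark-core, since the Dataset API lives there:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Hypothetical case class, declared at the top level of the file rather than
// inside a method, so that the implicit encoder can be found.
case class Record(id: Int, score: Option[Int])

object Main {
  def main(args: Array[String]): Unit = {
    // Local master chosen only to keep the sketch self-contained.
    val conf = new SparkConf().setAppName("option-in-dataset").setMaster("local[*]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._ // brings the encoders and toDS() into scope

    // A None value should be stored as a null in the score column.
    val ds = Seq(Record(1, Some(10)), Record(2, None)).toDS()
    ds.show()

    sc.stop()
  }
}
```

Moving Record back inside main should reproduce the "Unable to find encoder" error quoted above.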
We only define implicits for a subset of the types we support in SQLImplicits. We should probably consider adding
Option[T] for common T, as the internal infrastructure does understand Option. You can work around this by either creating a case class, using a Tuple, or constructing the required implicit yourself (though this uses an internal API, so it may break in future releases).
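A rough sketch of the first two workarounds, for the case where the Dataset elements themselves would be Option[Int]. The wrapper name MaybeInt is hypothetical, and this again assumes Spark 1.6.x with spark-sql on the classpath; it has not been checked against other versions:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Hypothetical wrapper: a Product type whose only field is the Option.
case class MaybeInt(value: Option[Int])

object OptionWorkaround {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("option-workaround").setMaster("local[*]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._

    // Workaround 1: wrap the Option in a case class.
    val wrapped = Seq(MaybeInt(Some(1)), MaybeInt(None)).toDS()
    wrapped.show()

    // Workaround 2: wrap the Option in a Tuple, which is also a Product type.
    val tupled = Seq(Tuple1(Option(1)), Tuple1(Option.empty[Int])).toDS()
    tupled.show()

    sc.stop()
  }
}
```

The third option, building the implicit encoder by hand, is left out here because it depends on internal classes whose signatures may differ between releases.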