How to convert a nested case class into the UDTValue type


I'm struggling to write custom case classes to Cassandra (2.1.6) using Spark (1.4.0). So far, I've tried the DataStax spark-cassandra-connector 1.4.0-M1 with the following case classes:

case class Event(event_id: String, event_name: String, event_url: String, time: Option[Long])
[...]
case class RsvpResponse(event: Event, group: Group, guests: Long, member: Member, mtime: Long, response: String, rsvp_id: Long, venue: Option[Venue])

In order to make this work, I've also implemented the following converter:

import scala.reflect.runtime.universe._
import com.datastax.spark.connector.UDTValue
import com.datastax.spark.connector.types.TypeConverter

implicit object EventToUDTValueConverter extends TypeConverter[UDTValue] {
  def targetTypeTag = typeTag[UDTValue]
  def convertPF = {
    case e: Event => UDTValue.fromMap(toMap(e)) // toMap just transforms the case class into a Map[String, Any]
  }
}

TypeConverter.registerConverter(EventToUDTValueConverter)

If I look up the converter manually, I can use it to convert an instance of Event into a UDTValue. However, when calling saveToCassandra on an RDD of RsvpResponse instances (with their nested objects), I get the following error:

15/06/23 23:56:29 ERROR Executor: Exception in task 1.0 in stage 0.0 (TID 1)
com.datastax.spark.connector.types.TypeConversionException: Cannot convert object Event(EVENT9136830076436652815,First event,http://www.meetup.com/first-event,Some(1435100185774)) of type class model.Event to com.datastax.spark.connector.UDTValue.
    at com.datastax.spark.connector.types.TypeConverter$$anonfun$convert$1.apply(TypeConverter.scala:42)
    at com.datastax.spark.connector.types.UserDefinedType$$anon$1$$anonfun$convertPF$1.applyOrElse(UserDefinedType.scala:33)
    at com.datastax.spark.connector.types.TypeConverter$class.convert(TypeConverter.scala:40)
    at com.datastax.spark.connector.types.UserDefinedType$$anon$1.convert(UserDefinedType.scala:31)
    at com.datastax.spark.connector.writer.DefaultRowWriter$$anonfun$readColumnValues$2.apply(DefaultRowWriter.scala:46)
    at com.datastax.spark.connector.writer.DefaultRowWriter$$anonfun$readColumnValues$2.apply(DefaultRowWriter.scala:43)

It seems my converter is never even getting called because of the way the connector library handles UDTValue internally. However, the solution described above does work for reading data from Cassandra tables (including user-defined types). Based on the connector docs, I also replaced my nested case classes with com.datastax.spark.connector.UDTValue types directly, which fixes the issue described above but breaks reading the data. I can't imagine I'm meant to define two separate models for reading and writing data. Or am I missing something obvious here?
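
For reference, the UDTValue-based workaround model looked roughly like this (a sketch with an abbreviated field set; RsvpResponseRaw is just an illustrative name):

import com.datastax.spark.connector.UDTValue

// Workaround sketch: nested case classes replaced by the connector's own UDTValue.
// saveToCassandra then succeeds, but reads no longer map the event column back
// into a typed Event instance.
case class RsvpResponseRaw(
  event: UDTValue,
  guests: Long,
  mtime: Long,
  response: String,
  rsvp_id: Long)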

Best Answer

Since version 1.3, there is no need to use custom type converters to load and save nested UDTs. Just model everything with case classes and stick to the field-naming convention (case class field names must match the UDT field and column names), and you should be fine.
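
For example, something along these lines should work end to end; note that the keyspace/table names (meetup / rsvp_responses) and the CQL schema in the comment are placeholders I'm assuming here, not taken from your setup:

import org.apache.spark.{SparkConf, SparkContext}
import com.datastax.spark.connector._

// Assumed schema (placeholder names), with UDT field names matching the case class:
//   CREATE TYPE meetup.event (event_id text, event_name text, event_url text, time bigint);
//   CREATE TABLE meetup.rsvp_responses (rsvp_id bigint PRIMARY KEY,
//                                       event frozen<event>, response text);
case class Event(event_id: String, event_name: String, event_url: String, time: Option[Long])
case class RsvpResponse(rsvp_id: Long, event: Event, response: String)

object SaveNestedUdt {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("rsvp-writer"))

    val rsvps = sc.parallelize(Seq(
      RsvpResponse(
        rsvp_id = 1L,
        event = Event("EVENT1", "First event",
          "http://www.meetup.com/first-event", Some(1435100185774L)),
        response = "yes")))

    // The nested Event is mapped to the event UDT column automatically;
    // no TypeConverter registration is needed. Reading back is symmetric:
    //   sc.cassandraTable[RsvpResponse]("meetup", "rsvp_responses")
    rsvps.saveToCassandra("meetup", "rsvp_responses")

    sc.stop()
  }
}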