Scala Spark: trying to avoid type erasure when using overloading


I'm relatively new to Scala/Spark.

I'm trying to overload a function depending on the type of element contained in a DStream:

def persist(service1DStream: DStream[Service1]): Unit = {...}
def persist(service2DStream: DStream[Service2]): Unit = {...}

I'm getting a compilation error:

persist(_root_.org.apache.spark.streaming.dstream.DStream) is already defined in the scope

It seems this is due to type erasure: after erasure both overloads have the same signature, persist(DStream), so the second definition is rejected. How can I make the compiler recognize that DStream[Service1] is different from DStream[Service2]?

Thank you

1 answer below:

A single method with a runtime pattern match does not help by itself: the type parameter is erased, so the first case matches any DStream (the compiler warns about an unchecked type pattern):

// DStream is invariant, so the parameter is widened to Any to accept any stream.
def persist(serviceDStream: Any): Unit = serviceDStream match {
  case _: DStream[Service1] => println("it is a Service1") // matches any DStream: Service1 is unchecked
  case _: DStream[Service2] => println("it is a Service2") // never reached for a DStream
  case _ => println("who knows")
}
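
If the goal is instead to keep the two overloads and have them resolved at compile time, a common trick (a minimal sketch, assuming the same Service1 and Service2 types from the question) is to add an extra implicit parameter such as scala.Predef.DummyImplicit to one of them, so the two methods no longer erase to the same signature:

import org.apache.spark.streaming.dstream.DStream

// The extra implicit parameter list changes the erased signature of the second
// overload; callers still just write persist(stream) and the compiler fills it in.
def persist(service1DStream: DStream[Service1]): Unit = { /* ... */ }
def persist(service2DStream: DStream[Service2])(implicit d: DummyImplicit): Unit = { /* ... */ }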

An improved version of the runtime match, working around runtime type erasure with shapeless TypeCase (more info: shapeless-guide):

import org.apache.spark.sql.SparkSession
import org.apache.spark.streaming.dstream.DStream
import shapeless.TypeCase

object Test {

  def main(args: Array[String]): Unit = {

    // The session is not needed for the match itself; it is kept from the original example.
    val spark = SparkSession
      .builder
      .getOrCreate()

    case class Service1(a: String)
    case class Service2(a: Int)

    // TypeCase values can be used as patterns; each one needs an implicit
    // shapeless.Typeable for the matched type to be resolvable.
    val Service1Typed = TypeCase[DStream[Service1]]
    val Service2Typed = TypeCase[DStream[Service2]]

    // DStream is invariant, so the parameter is widened to Any here as well.
    def persist(serviceDStream: Any): Unit = serviceDStream match {
      case Service1Typed(_) => println("it is a Service1")
      case Service2Typed(_) => println("it is a Service2")
      case _                => println("who knows")
    }
  }
}

You can also use scala.reflect.ClassTag; more info: ClassTag example.
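
For instance, a minimal sketch of the ClassTag route (assuming the Service1/Service2 case classes from above): the generic method carries a ClassTag for the element type, which survives erasure and can be inspected at runtime.

import scala.reflect.ClassTag
import org.apache.spark.streaming.dstream.DStream

// The ClassTag for T is filled in implicitly at the call site, so the method can
// branch on the element type even though the DStream's type parameter is erased.
def persist[T](serviceDStream: DStream[T])(implicit ct: ClassTag[T]): Unit =
  ct.runtimeClass match {
    case c if c == classOf[Service1] => println("it is a Service1")
    case c if c == classOf[Service2] => println("it is a Service2")
    case _                           => println("who knows")
  }

Calling persist(service1DStream) then resolves the ClassTag at compile time, so no casting or runtime inspection of the stream itself is needed.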