In Scala 2.12, why are none of the TypeTags created at runtime serializable?


I'm looking for a way to create a serializable TypeTag without using compile-time tools (relying entirely on runtime reflection). This is a basic feature for any reflective language.

The answer in this post proposed a few methods:

In Scala, how to create a TypeTag from a type that is serializable?

NONE OF THEM WORKED:

package com.tribbloids.spike.scala_spike

import java.io.{
  ByteArrayInputStream,
  ByteArrayOutputStream,
  ObjectInputStream,
  ObjectOutputStream
}

import org.apache.spark.sql.catalyst.ScalaReflection
import org.apache.spark.sql.catalyst.ScalaReflection.universe
import org.scalatest.FunSpec

class TypeTagFromType extends FunSpec {

  import ScalaReflection.universe._

  it("create TypeTag from reflection") {

    val ttg = typeTag[String]
    val ttg2 = TypeUtils.createTypeTag_fast(ttg.tpe, ttg.mirror)
    val ttg3 = TypeUtils.createTypeTag_slow(ttg.tpe, ttg.mirror)

    Seq(
      ttg -> "from static inference",
      ttg2 -> "from dynamic type - fast",
      ttg3 -> "from dynamic type - slow"
    ).map {
      case (tt, k) =>
        println(k)

        try {

          val bytes = serialise(tt)
          val tt2 = deserialise(bytes)
          assert(tt.tpe =:= tt2.tpe)
        } catch {
          case e: Throwable =>
            e.printStackTrace()
        }

    }
  }

  def serialise(tt: universe.TypeTag[_]): Array[Byte] = {
    val bos = new ByteArrayOutputStream()
    try {
      val out = new ObjectOutputStream(bos)
      out.writeObject(tt)
      out.flush()
      val array = bos.toByteArray
      array
    } finally {
      bos.close()
    }
  }

  def deserialise(tt: Array[Byte]): TypeTag[_] = {

    val bis = new ByteArrayInputStream(tt)

    try {
      val in = new ObjectInputStream(bis)
      in.readObject().asInstanceOf[TypeTag[_]]

    } finally {
      bis.close()
    }

  }
}

object TypeUtils {

  import ScalaReflection.universe._

  def createTypeTag_fast[T](
      tpe: Type,
      mirror: Mirror
  ): TypeTag[T] = {
    TypeTag.apply(
      mirror,
      NaiveTypeCreator(tpe)
    )
  }

  def createTypeTag_slow[T](
      tpe: Type,
      mirror: Mirror
  ): TypeTag[T] = {

    val toolbox = scala.tools.reflect.ToolBox(mirror).mkToolBox()

    val tree = toolbox.parse(s"scala.reflect.runtime.universe.typeTag[$tpe]")
    val result = toolbox.eval(tree).asInstanceOf[TypeTag[T]]

    result
  }

  case class NaiveTypeCreator(tpe: Type) extends reflect.api.TypeCreator {

    def apply[U <: reflect.api.Universe with Singleton](
        m: reflect.api.Mirror[U]): U#Type = {
      //          assert(m eq mirror, s"TypeTag[$tpe] defined in $mirror cannot be migrated to $m.")
      tpe.asInstanceOf[U#Type]
    }
  }
}

For ttg2 and ttg3, which are created at runtime, an error occurs during serialization or deserialization. ttg2 fails with:

java.io.NotSerializableException: scala.reflect.runtime.JavaMirrors$JavaMirror$$anon$2
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1184)
    at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
    at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
    at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
    at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
    at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
    at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
    at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
    at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
    at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
    at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
    at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
    at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
    at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
    at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
    at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
    at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
    at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
    at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
    at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)
...

ttg3 fails with:

java.lang.ClassNotFoundException: __wrapper$1$71de08de01364321af52d1563247025d.__wrapper$1$71de08de01364321af52d1563247025d$$typecreator1$1
    at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:348)
    at java.io.ObjectInputStream.resolveClass(ObjectInputStream.java:686)
    at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1868)
    at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1751)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2042)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2287)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2211)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2069)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:431)
...

If you are familiar with the design of Scala reflection, can you give a 'canonical' implementation that produces a functioning TypeTag?

Best answer

The only way I have found so far to save ttg3/createTypeTag_slow is to persist the classes to disk, rather than keep them in memory, during runtime compilation and to modify the classpath accordingly:

def createTypeTag_slow[T](
    tpe: Type,
    mirror: Mirror
): TypeTag[T] = {

  val toolbox = scala.tools.reflect.ToolBox(mirror).mkToolBox(options = "-d out")

  val addUrlMethodSymbol = mirror.universe.typeOf[java.net.URLClassLoader]
    .decl(TermName("addURL")).asMethod
  mirror.reflect(mirror.classLoader)
    .reflectMethod(addUrlMethodSymbol)(new java.net.URL("file:///full/path/to/projectname/out/"))

  val tree = toolbox.parse(s"scala.reflect.runtime.universe.typeTag[$tpe]")
  val result = toolbox.eval(tree).asInstanceOf[TypeTag[T]]

  result
}

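The directory out must exist.

JDK 8 (the addURL call assumes the mirror's class loader is a URLClassLoader, which holds for the JDK 8 application class loader).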


So TypeTags produced via the ToolBox are in fact serializable and deserializable; you just need to serialize and deserialize their TypeCreator classes manually:

import java.io.{ByteArrayInputStream, ByteArrayOutputStream, File, ObjectInputStream, ObjectOutputStream}
import java.net.{URL, URLClassLoader}
import java.nio.file.{Files, Paths, StandardOpenOption}
import scala.reflect.runtime
import scala.reflect.runtime.universe.{Quasiquote, TermName, Type, TypeName, TypeTag, typeTag, typeOf}
import scala.tools.reflect.ToolBox
import scala.reflect.api.{TypeCreator, TypeTags}

object TypeTagsSerializationDeserialization {
  val root = "/full/path/to/projectname"
  val out = "out"
  val out1 = "out1"
  val rm = runtime.currentMirror
  val tb = rm.mkToolBox(options = s"-d $out")

  def main(args: Array[String]): Unit = {
    val ttag0 = typeTag[String]
    val ttag = evalTypeTag(ttag0.tpe)

    val tpecFullClassName = typeCreatorClassName(ttag)
    val (pckge, tpecClassName) = splitPackage(tpecFullClassName)

    // serialization
    val tcBytes = serializeFromFile(s"$root/$out/$pckge/$tpecClassName.class")
    val ttagBytes = serialize(ttag)

    // deserialization
    deserializeToFile(tcBytes, dirPath = s"$root/$out1/$pckge", s"$root/$out1/$pckge/$tpecClassName.class")
    loadToClasspathFromDir(s"$root/$out1/")
    val ttag1 = deserialize[TypeTag[_]](ttagBytes)

    assert(ttag0.tpe =:= ttag.tpe)
    assert(ttag.tpe =:= ttag1.tpe)
  }

  def evalTypeTag(tpe: Type): TypeTag[_] = 
    tb.eval(q"scala.reflect.runtime.universe.typeTag[$tpe]").asInstanceOf[TypeTag[_]]

  def typeCreatorClassName(ttag: TypeTag[_]): String = {
    val tpecMethodSymbol = typeOf[TypeTags].decl(TypeName("TypeTagImpl")).asClass
      .typeSignature.member(TermName("tpec")).asTerm.alternatives.find(_.isMethod).get.asMethod
    rm.reflect(ttag).reflectMethod(tpecMethodSymbol)().asInstanceOf[TypeCreator].getClass.getName
  }

  def splitPackage(fullClassName: String): (String, String) = {
    val parts = fullClassName.split('.')
    (parts.init.mkString("/"), parts.last)
  }

  def serializeFromFile(filePath: String): Array[Byte] =
    Files.readAllBytes(new File(filePath).toPath)

  def serialize(obj: Any): Array[Byte] = {
    val bos = new ByteArrayOutputStream
    try {
      val out = new ObjectOutputStream(bos)
      out.writeObject(obj)
      out.flush()
      bos.toByteArray
    } finally bos.close()
  }

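  def deserializeToFile(bytes: Array[Byte], dirPath: String, filePath: String): Unit = {
    // createDirectories also creates missing parents and does not fail if the directory already exists
    Files.createDirectories(Paths.get(dirPath))
    Files.write(Paths.get(filePath), bytes, StandardOpenOption.CREATE)
  }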

  def loadToClasspathFromDir(dirPath: String): Unit = {
    val addUrlMethodSymbol = typeOf[URLClassLoader].decl(TermName("addURL")).asMethod
    rm.reflect(rm.classLoader).reflectMethod(addUrlMethodSymbol)(new URL(s"file://$dirPath"))
  }

  def deserialize[A](bytes: Array[Byte]): A = {
    val bis = new ByteArrayInputStream(bytes)
    try new ObjectInputStream(bis).readObject.asInstanceOf[A]
    finally bis.close()
  }
}
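The ClassNotFoundException for ttg3 comes from ObjectInputStream.resolveClass, which only consults the default class loader. As an alternative to calling addURL on the existing loader, a minimal sketch (not what the code above does) is to subclass ObjectInputStream and resolve classes through an explicitly supplied loader. The names LoaderAwareObjectInputStream and deserializeWith are hypothetical, and the example round-trips a plain List rather than a TypeTag, so whether this is sufficient for ToolBox-generated TypeCreators is an untested assumption.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, InputStream}
import java.io.{ObjectInputStream, ObjectOutputStream, ObjectStreamClass}

object LoaderAwareDeserialization {

  // Hypothetical helper: tries `loader` first, then falls back to the default lookup
  // (the fallback also covers primitive class descriptors such as "int").
  final class LoaderAwareObjectInputStream(in: InputStream, loader: ClassLoader)
      extends ObjectInputStream(in) {
    override protected def resolveClass(desc: ObjectStreamClass): Class[_] =
      try Class.forName(desc.getName, false, loader)
      catch { case _: ClassNotFoundException => super.resolveClass(desc) }
  }

  def serialize(obj: AnyRef): Array[Byte] = {
    val bos = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bos)
    try {
      out.writeObject(obj)
      out.flush()
      bos.toByteArray
    } finally out.close()
  }

  def deserializeWith[A](bytes: Array[Byte], loader: ClassLoader): A = {
    val in = new LoaderAwareObjectInputStream(new ByteArrayInputStream(bytes), loader)
    try in.readObject().asInstanceOf[A]
    finally in.close()
  }

  def main(args: Array[String]): Unit = {
    // Here the loader is simply the one that loaded this object.
    val loader = getClass.getClassLoader
    val restored = deserializeWith[List[String]](serialize(List("a", "b")), loader)
    println(restored)
  }
}
```

To point at a directory of class files such as out1, one could pass a new java.net.URLClassLoader built from that directory's URL, with the current loader as its parent, instead of getClass.getClassLoader.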

Regarding ttg2/createTypeTag_fast, it complains that tpe: Type is not serializable. We could avoid this error with NaiveTypeCreator(@transient tpe: Type), but then tt2.tpe in tt.tpe =:= tt2.tpe would be null.
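Both effects can be reproduced without scala-reflect. The following is a minimal sketch in which Opaque is a hypothetical stand-in for a runtime Type (a class that is not Serializable), and EagerHolder and TransientHolder play the role of NaiveTypeCreator without and with @transient.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, NotSerializableException}
import java.io.{ObjectInputStream, ObjectOutputStream}

object TransientDemo {

  // Hypothetical stand-in for a runtime reflection Type: deliberately NOT Serializable.
  final class Opaque(val name: String)

  // Case classes are Serializable, but Java serialization still walks every non-transient field.
  final case class EagerHolder(payload: Opaque)

  // @transient marks the underlying field as transient, so it is skipped on write.
  final case class TransientHolder(@transient payload: Opaque)

  def roundTrip[A <: AnyRef](value: A): A = {
    val bos = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bos)
    try out.writeObject(value)
    finally out.close()
    val in = new ObjectInputStream(new ByteArrayInputStream(bos.toByteArray))
    try in.readObject().asInstanceOf[A]
    finally in.close()
  }

  def main(args: Array[String]): Unit = {
    // Without @transient, writing fails because Opaque is reachable and not Serializable.
    val failed =
      try { roundTrip(EagerHolder(new Opaque("String"))); false }
      catch { case _: NotSerializableException => true }
    println(s"eager holder failed to serialize: $failed")

    // With @transient, the round trip succeeds but the skipped field comes back as null.
    val restored = roundTrip(TransientHolder(new Opaque("String")))
    println(s"transient payload is null after round trip: ${restored.payload == null}")
  }
}
```

This is why @transient alone does not help here: the TypeCreator would deserialize, but it would have no Type left to return.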

For a statically known type T (rather than tpe: Type), we can use a macro to generate manually what the compiler generates for typeTag[T] (which can be inspected with scalacOptions += "-Xprint:typer"):

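import scala.language.experimental.macros
import scala.reflect.api.{Mirror, Universe}
import scala.reflect.macros.blackbox

def createTypeTag_fast[T](mirror: Mirror[_ <: Universe with Singleton])/*(tpe: mirror.universe.Type)*/: mirror.universe.TypeTag[T] =
  macro createTypeTagFastImpl[T]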

def createTypeTagFastImpl[T: c.WeakTypeTag](c: blackbox.Context)(mirror: c.Tree)/*(tpe: c.Tree)*/: c.Tree = {
  import c.universe._

  def symbolToTree(sym: Symbol): Tree = {
    val symName = sym.fullName    

    if (sym.isPackage)     q"m.staticPackage($symName).asModule.moduleClass"
    else if (sym.isClass)  q"m.staticClass($symName)"
    else if (sym.isModule) q"m.staticModule($symName).asModule.moduleClass"
    else ???
  }

  def typeToTree(typ: Type): Tree = {
    typ.dealias match {
      case TypeRef(pre, sym, args) =>
        q"""
          u.internal.reificationSupport.TypeRef(
            ${typeToTree(pre)},
            ${symbolToTree(sym)},
            List.apply[u.Type](..${args.map(typeToTree(_))})
          )
        """
      case ThisType(sym) =>
        q"""
          u.internal.reificationSupport.ThisType(
            ${symbolToTree(sym)}
          )
        """
      // ...
      case typ => q"${symbolToTree(typ.typeSymbol)}.asType.toTypeConstructor"
    }
  }

  val typT = weakTypeOf[T]
  val api = q"_root_.scala.reflect.api"
  q"""
     $mirror.universe.TypeTag.apply[$typT]($mirror,
        new $api.TypeCreator {
          override def apply[U <: $api.Universe with _root_.scala.Singleton](mUntyped: $api.Mirror[U]): U#Type = {
            val u: U = mUntyped.universe
            val m: u.Mirror = mUntyped.asInstanceOf[u.Mirror]
            ${typeToTree(typT)}
          }
        }
      )
  """
}

For a dynamic tpe: Type we can do:

import scala.language.experimental.macros
import scala.reflect.api
import scala.reflect.macros.blackbox

def createTypeTag_fast(mirror: api.Mirror[_ <: api.Universe with Singleton])(tpe: mirror.universe.Type): mirror.universe.TypeTag[_] =
  macro createTypeTagFastImpl
// def createTypeTag_fast[U <: api.Universe with Singleton](mirror: api.Mirror[U])(tpe: U#Type): U#TypeTag[_] =
//   macro createTypeTagFastImpl

def createTypeTagFastImpl(c: blackbox.Context)(mirror: c.Tree)(tpe: c.Tree): c.Tree = {
  import c.universe._
  val api = q"_root_.scala.reflect.api"
  q"""
    $mirror.universe.TypeTag.apply[$tpe]($mirror,
      new $api.TypeCreator {
        override def apply[U <: $api.Universe with _root_.scala.Singleton](mUntyped: $api.Mirror[U]): U#Type = {
          $tpe.asInstanceOf[U#Type]
        }
      }
    )
  """
}

How to create a TypeTag manually? (answer)

I guess $mirror.universe.TypeTag.apply[$tpe] is now no better than just $mirror.universe.TypeTag.apply[$mirror.universe.Type].


In Scala 3, an inline method is enough; a macro is not needed:

Scala Spark Encoders.product[X] (where X is a case class) keeps giving me "No TypeTag available for X" error