I'm trying to run a very simple Scio app (Scala 2.13, Gradle 7.2, OpenJDK 1.8) and deploy it to Google Dataflow:
    package testing

    import com.spotify.scio.ContextAndArgs

    object HelloWorld {
      def main(args: Array[String]): Unit = {
        val (sc, _) = ContextAndArgs(args)
        sc.parallelize(Seq.range(1, 1024)).map(println)
        sc.run()
      }
    }
The app works with Scio 0.9.2 but not with the latest version (0.11.4). I can still submit the Dataflow job from IntelliJ IDEA, but it never starts running and fails with the following error after 1 hour:
The Dataflow job appears to be stuck because no worker activity has been seen in the last 1h. Please check the worker logs in Stackdriver Logging. You can also get help with Cloud Dataflow at https://cloud.google.com/dataflow/support.
Has anyone encountered a similar problem?
It seems the Java classpath is missing something. The job works if I move GRADLE_USER_HOME outside of USER_HOME, but I don't think that's a good solution.
See the relevant Scio code: https://github.com/spotify/scio/blob/bcc86a9756a5eb54370ee45f6587dc59652dd805/scio-core/src/main/scala/com/spotify/scio/ScioContext.scala#L142
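
One way to test the classpath theory might be to print the classpath that the launching JVM actually sees and mark the entries located under the user home directory (where the Gradle cache normally lives). The sketch below is only a diagnostic under that assumption: `ClasspathCheck`, `entries`, and `isUnder` are hypothetical names, not part of Scio, and the code does not reproduce Scio's own staging logic.

```scala
import java.io.File

// Hypothetical diagnostic helper, not part of Scio.
object ClasspathCheck {
  // Splits a classpath string into its entries, dropping empty ones.
  def entries(classpath: String): Seq[String] =
    classpath.split(File.pathSeparator).toSeq.filter(_.nonEmpty)

  // True when `entry` is located under the directory `root`.
  def isUnder(entry: String, root: String): Boolean = {
    val entryPath = new File(entry).getAbsoluteFile.toPath.normalize()
    val rootPath = new File(root).getAbsoluteFile.toPath.normalize()
    entryPath.startsWith(rootPath)
  }

  def main(args: Array[String]): Unit = {
    val userHome = sys.props("user.home")
    // Tag each classpath entry by whether it sits under the user home directory.
    entries(sys.props("java.class.path")).foreach { e =>
      val tag = if (isUnder(e, userHome)) "[under user.home]" else "[elsewhere]"
      println(s"$tag $e")
    }
  }
}
```

Running it with the same run configuration as `HelloWorld`, once with the default GRADLE_USER_HOME and once with it moved, would show whether the dependency jars change category between the working and the stuck setup.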