I want to convert SCollection[String] to Seq[String] or List[String].
For example, I have a variable called ids.
val ids: SCollection[String] = ~
ids.saveAsTextFile(pathToGCS)
When I save it to Cloud Storage, the text file contains a list of IDs, one per line:
id1
id2
id2
I want to hold the contents of that file in a Seq or List.
val seqOfIds: Seq[String] = ~
This is not possible within the same job: unlike Spark, Dataflow has no notion of a driver node that collects data from the worker nodes. See https://spotify.github.io/scio/Scio%2C-Scalding-and-Spark.html#scio-and-spark
You can use the tap API to read the file contents after the job completes. See https://spotify.github.io/scio/examples/TapOutputExample.scala.html
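As a rough illustration of the tap approach, here is a minimal sketch. It assumes a recent Scio version (0.8 or later), where saveAsTextFile returns a ClosedTap and the job is started with sc.run(); older versions return a Future[Tap[String]] instead, so the calls would differ there. The object name IdsToSeq, the "output" argument, and the sample IDs are placeholders for illustration, not part of the original question.

```scala
import com.spotify.scio._
import com.spotify.scio.io.{ClosedTap, Tap}
import com.spotify.scio.values.SCollection

// Hypothetical job: writes IDs to text files, then reads them back after the job ends.
object IdsToSeq {
  def main(cmdlineArgs: Array[String]): Unit = {
    val (sc, args) = ContextAndArgs(cmdlineArgs)
    // Assumed argument name; pass e.g. --output=gs://<bucket>/<path>
    val pathToGCS = args("output")

    // Placeholder data standing in for the real `ids` collection.
    val ids: SCollection[String] = sc.parallelize(Seq("id1", "id2", "id3"))

    // Saving returns a ClosedTap, which can only be opened once the job has finished.
    val closedTap: ClosedTap[String] = ids.saveAsTextFile(pathToGCS)

    // Run the pipeline and block until it completes.
    val result: ScioResult = sc.run().waitUntilDone()

    // Open the tap and read the saved lines into local memory.
    val tap: Tap[String] = result.tap(closedTap)
    val seqOfIds: Seq[String] = tap.value.toList

    println(s"Read ${seqOfIds.size} ids")
  }
}
```

Note that tap.value reads the output files on the machine that launched the job, so this only makes sense when the result fits in memory there, and the order of the elements is not guaranteed to match the order in the original collection.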