How to convert SCollection[String] to Seq[String] or List[String]?

I want to convert SCollection[String] to Seq[String] or List[String].

For example, I have a variable called ids.

val ids: SCollection[String] = ~
ids.saveAsTextFile(pathToGCS) 

When I save it to Cloud Storage, the text file contains a list of IDs, one per line:

id1
id2
id2

Instead of writing them to a file, I want to hold the same contents in memory as a Seq or List:

val seqOfIds: Seq[String] = ~

There is 1 answer below.

This is not possible within the same job, because Dataflow, unlike Spark, has no notion of a driver node that collects data from the worker nodes. See https://spotify.github.io/scio/Scio%2C-Scalding-and-Spark.html#scio-and-spark

You can, however, use the tap API to read the file contents back after the job has completed. See https://spotify.github.io/scio/examples/TapOutputExample.scala.html
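
A minimal sketch of the tap approach, assuming Scio 0.8 or later, where saveAsTextFile returns a ClosedTap and sc.run() returns an execution context. Older versions use a different shape (sc.close() and a Future of a Tap), so check this against the version you run. The object name IdsToSeq, the --output argument, and the sample IDs are placeholders for illustration and are not from the question.

```scala
import com.spotify.scio._
import com.spotify.scio.io.{ClosedTap, Tap}
import com.spotify.scio.values.SCollection

// Hypothetical job object; run with e.g. --output=gs://<bucket>/<path>
object IdsToSeq {
  def main(cmdlineArgs: Array[String]): Unit = {
    val (sc, args) = ContextAndArgs(cmdlineArgs)

    // Stand-in for however `ids` is really produced in your pipeline.
    val ids: SCollection[String] = sc.parallelize(Seq("id1", "id2", "id3"))

    // Writing the collection yields a ClosedTap, a handle to the output
    // that can only be opened once the job has finished.
    val closedTap: ClosedTap[String] = ids.saveAsTextFile(args("output"))

    // Run the pipeline and block until it is done.
    val result: ScioResult = sc.run().waitUntilDone()

    // Open the tap and read the written files back into local memory.
    val tap: Tap[String] = result.tap(closedTap)
    val seqOfIds: Seq[String] = tap.value.toList

    seqOfIds.foreach(println)
  }
}
```

Note that tap.value reads the whole output into the JVM that launched the job, so this only suits results small enough to fit in local memory. The order of the elements is not guaranteed to match the order in which they were produced.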