How to create a tar.gz file using Apache Beam

I used the code below to create a tar.gz file. A .gz file was created, but no tar archive was produced. How can I get a tar.gz file as the result?

PCollection<String> lines = pipeline.apply("To read from file",
    TextIO.read().from(<file path>));

lines.apply(TextIO.write()
    .to("C:\\Ddrive\\saveAllRequest1.txt")
    .withCompression(Compression.GZIP));

Also, how can I tar multiple files?

Thanks in advance

There is 1 best solution below

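Creating a tar file is an inherently non-parallel operation: all entries are appended to a single archive one after another. Compression.GZIP in TextIO only gzip-compresses each output file; it does not bundle files into a tar archive. One option here is to use the Wait transform after your write, followed by a DoFn that manually creates the tarball from the files that were written.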
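
As a rough illustration of the "manually creates the tarball" step, here is a minimal sketch that packs a list of files into one .tar.gz using only the JDK. The class and method names (TarGzSketch, writeTarGz) are hypothetical, and the header writer covers only plain files with names up to 100 bytes in a basic ustar layout. It assumes the files are readable from wherever the code runs. In a pipeline, a single DoFn invocation that runs once the write has finished (for example gated by Wait.on) could call such a helper; that wiring is not shown here. In practice, a library such as Apache Commons Compress (TarArchiveOutputStream) would likely be a better choice than hand-written headers.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.zip.GZIPOutputStream;

/** Hypothetical helper (not part of Beam): packs files into one .tar.gz with the JDK only. */
final class TarGzSketch {
    private static final int BLOCK = 512; // tar works in 512-byte blocks

    /** Writes each input file as one tar entry, gzip-compressing the whole stream. */
    static void writeTarGz(Path archive, List<Path> inputs) throws IOException {
        try (OutputStream out = new GZIPOutputStream(Files.newOutputStream(archive))) {
            for (Path input : inputs) {
                byte[] data = Files.readAllBytes(input);
                out.write(header(input.getFileName().toString(), data.length));
                out.write(data);
                // Entry data is zero-padded up to the next 512-byte boundary.
                out.write(new byte[(BLOCK - data.length % BLOCK) % BLOCK]);
            }
            // Two all-zero blocks mark the end of the archive.
            out.write(new byte[2 * BLOCK]);
        }
    }

    /** Builds a minimal ustar header for a regular file. */
    private static byte[] header(String name, long size) {
        byte[] h = new byte[BLOCK];
        byte[] nameBytes = name.getBytes(StandardCharsets.UTF_8);
        if (nameBytes.length > 100) {
            throw new IllegalArgumentException("Name too long for this sketch: " + name);
        }
        System.arraycopy(nameBytes, 0, h, 0, nameBytes.length);
        putOctal(h, 100, 8, 0644);                               // file mode
        putOctal(h, 108, 8, 0);                                  // owner id
        putOctal(h, 116, 8, 0);                                  // group id
        putOctal(h, 124, 12, size);                              // size in bytes
        putOctal(h, 136, 12, System.currentTimeMillis() / 1000); // modification time
        Arrays.fill(h, 148, 156, (byte) ' ');  // checksum field counts as spaces
        h[156] = '0';                          // type flag: regular file
        putAscii(h, 257, "ustar");             // magic; byte 262 stays NUL
        putAscii(h, 263, "00");                // version
        int sum = 0;
        for (byte b : h) {
            sum += b & 0xFF;
        }
        // Checksum is stored as six octal digits, a NUL, then a space.
        putAscii(h, 148, String.format("%06o", sum));
        h[154] = 0;
        h[155] = ' ';
        return h;
    }

    /** Writes value as zero-padded octal digits, leaving the field's last byte NUL. */
    private static void putOctal(byte[] h, int offset, int length, long value) {
        putAscii(h, offset, String.format("%0" + (length - 1) + "o", value));
    }

    private static void putAscii(byte[] h, int offset, String s) {
        byte[] b = s.getBytes(StandardCharsets.US_ASCII);
        System.arraycopy(b, 0, h, offset, b.length);
    }

    /** Usage: java TarGzSketch out.tar.gz file1 [file2 ...] */
    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: TarGzSketch <archive.tar.gz> <file>...");
            return;
        }
        List<Path> inputs = new ArrayList<>();
        for (int i = 1; i < args.length; i++) {
            inputs.add(Paths.get(args[i]));
        }
        writeTarGz(Paths.get(args[0]), inputs);
    }
}
```

Because every entry goes into the same output stream in order, this step has to run in one place, which is why it sits after the parallel write rather than inside it. The same loop also answers the "multiple files" part of the question: each additional file is just another header plus its padded data.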