PySpark 2.2 CSV writer produces different part file names than 1.6


I want to migrate my PySpark code from 1.6 to 2.x. In 1.6 I was using this syntax:

input_df.repartition(number_of_files) \
    .write.mode(file_saveMode) \
    .format(file_format) \
    .option("header", "true") \
    .save(nfs_path)

and I was getting output files named like this:

part-00000

part-00001

...

When I ran the same code in PySpark 2.2, it gave me different part file names:

part-00000-2feefae7-47d7-4f1a-ade6-7dbd07f42f54-c000.csv

part-00001-2feefae7-47d7-4f1a-ade6-7dbd07f42f54-c000.csv

Then I changed the code to use the 2.x CSV writer:

input_df.repartition(number_of_files) \
    .write.mode(file_saveMode) \
    .option("header", "true") \
    .csv(nfs_path)

But I still get the same result:

part-00000-2feefae7-47d7-4f1a-ade6-7dbd07f42f54-c000.csv

Can anyone explain why this is happening, and how to get the old part-00000 style names back?
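As a stopgap, the files could be renamed after the write finishes. The sketch below is only a workaround, not an explanation of the naming change. It assumes that nfs_path is a locally mounted directory reachable with plain os calls, that each partition produces exactly one file, and that the names always follow the pattern shown above. The helper name strip_uuid_from_part_files and its keep_extension parameter are hypothetical, not part of PySpark.

```python
import os
import re

# Matches Spark 2.x style part files such as
# part-00000-2feefae7-47d7-4f1a-ade6-7dbd07f42f54-c000.csv
# (pattern inferred from the file names in this question; an assumption).
PART_FILE = re.compile(r"^(part-\d+)-[0-9a-f-]{36}(?:-c\d+)?(\.[\w.]+)?$")


def strip_uuid_from_part_files(output_dir, keep_extension=False):
    """Rename part-NNNNN-<uuid>-cNNN.ext files to part-NNNNN.

    Hypothetical helper: returns a dict mapping old names to new names.
    """
    renamed = {}
    for name in sorted(os.listdir(output_dir)):
        match = PART_FILE.match(name)
        if match is None:
            continue  # leave _SUCCESS, hidden .crc files, etc. untouched
        extension = match.group(2) or ""
        new_name = match.group(1) + (extension if keep_extension else "")
        target = os.path.join(output_dir, new_name)
        if os.path.exists(target):
            # Two files mapping to one name (e.g. several files per
            # partition) would silently overwrite each other, so stop.
            raise FileExistsError(target)
        os.rename(os.path.join(output_dir, name), target)
        renamed[name] = new_name
    return renamed


# Intended usage, after the .csv(nfs_path) write has completed:
# strip_uuid_from_part_files(nfs_path)
```

The rename runs on the driver after the job has finished, so it only makes sense when the output directory is visible there as an ordinary path. It would not work as written for HDFS or object-store paths.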
