Write a PySpark DataFrame to CSV without outer quotes


I have a dataframe with a single column, shown below. I am using PySpark 2.3 to write it to CSV.

18391860-bb33-11e6-a12d-0050569d8a5c,48,24,44,31,47,162,227,0,37,30,28
18391310-bc74-11e5-9049-005056b996a7,37,0,48,25,72,28,24,44,31,52,27,30,4

By default, the code

df.select('RESULT').write.csv(path)

produces output with each row wrapped in double quotes:

"18391860-bb33-11e6-a12d-0050569d8a5c,48,24,44,31,47,162,227,0,37,30,28"
"18391310-bc74-11e5-9049-005056b996a7,37,0,48,25,72,28,24,44,31,52,27,30,4"

How can I remove the outer quotes? I have tried option('quoteAll', 'false') and option('quote', None), but neither worked.


2 Answers

BEST ANSWER

You can write with a | separator instead. The default separator is ,, which also appears inside your values, so the CSV writer quotes each field to keep the output parseable.

df.select('RESULT').write.csv(path, sep="|")
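The quoting is standard CSV behavior rather than anything Spark-specific: a minimal-quoting CSV writer wraps any field that contains the delimiter. A small sketch with Python's csv module (plain Python here, not PySpark; the single-element rows stand in for the RESULT column) shows both the problem and the separator workaround:

```python
import csv
import io

# A single-column row whose value contains commas, like the RESULT column.
rows = [["18391860-bb33-11e6-a12d-0050569d8a5c,48,24,44,31"]]

# With the default ',' delimiter, the value contains the delimiter,
# so QUOTE_MINIMAL wraps the whole field in quotes.
buf = io.StringIO()
csv.writer(buf, delimiter=",", quoting=csv.QUOTE_MINIMAL).writerows(rows)
print(buf.getvalue().strip())
# → "18391860-bb33-11e6-a12d-0050569d8a5c,48,24,44,31"

# With '|' as the delimiter, the value no longer conflicts and is written raw.
buf = io.StringIO()
csv.writer(buf, delimiter="|", quoting=csv.QUOTE_MINIMAL).writerows(rows)
print(buf.getvalue().strip())
# → 18391860-bb33-11e6-a12d-0050569d8a5c,48,24,44,31
```

The same logic is why sep="|" in Spark's CSV writer makes the quotes disappear: once the field no longer contains the separator, there is nothing to escape.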

You can also write the column as plain text with df.write.text (note this requires the DataFrame to have exactly one string-typed column):

df.select('RESULT').write.text(path)
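The text writer emits each row's single string column verbatim, one value per line, so CSV quoting rules never apply. A minimal plain-Python sketch of that behavior (the values are the sample rows from the question):

```python
import io

values = [
    "18391860-bb33-11e6-a12d-0050569d8a5c,48,24,44,31,47,162,227,0,37,30,28",
    "18391310-bc74-11e5-9049-005056b996a7,37,0,48,25,72,28,24,44,31,52,27,30,4",
]

# Equivalent of write.text: each string is written as-is, one per line,
# with no delimiter handling and therefore no quoting of embedded commas.
buf = io.StringIO()
for v in values:
    buf.write(v + "\n")
print(buf.getvalue())
```

If RESULT were not already a string column, you would need to cast it (e.g. with df.select(df.RESULT.cast("string"))) before write.text would accept it.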