My input data is in ISO-8859-1 encoding, in a cedilla-delimited file (Ç is the delimiter). Some field values contain a double quote. When I convert the file to UTF-8 with Spark, it inserts an escape character and extra quotes around those values. What can I do to make sure the extra quotes and the escape character are not added to the output?
Sample Input
XYZÇVIB BROS CRANE AND BIG "TONYÇ1961-02-23Ç00:00:00
Sample Output
XYZÇ"VIB BROS CRANE AND BIG \"TONY"Ç1961-02-23Ç00:00:00
Code
// Read the cedilla-delimited file in its source encoding
val InputFormatDataFrame = sparkSession.sqlContext.read
  .format("com.databricks.spark.csv")
  .option("delimiter", delimiter)
  .option("charset", input_format)
  .option("header", "false")
  .option("treatEmptyValuesAsNulls", "true")
  .option("nullValue", " ")
  .option("quote", "")
  .option("quoteMode", "NONE")
  //.option("escape", "\"")
  .option("ignoreLeadingWhiteSpace", "true")
  .option("ignoreTrailingWhiteSpace", "true")
  .option("mode", "FAILFAST")
  .load(input_location)

// Write the same data back out as UTF-8 CSV
InputFormatDataFrame.write
  .mode("overwrite")
  .option("delimiter", delimiter)
  .option("charset", "UTF-8")
  .csv(output_location)
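For completeness, here is a sketch of the write side with quoting disabled entirely. Setting the quote (and escape) character to the NUL character `"\u0000"` is a workaround I have seen suggested for Spark's CSV writer; I have not verified that it behaves identically across all Spark versions, so treat it as an assumption rather than a confirmed fix:

```scala
// Sketch (unverified assumption): pass "\u0000" (NUL) as the quote and escape
// characters so the CSV writer effectively has no quote/escape character and
// writes field values verbatim.
InputFormatDataFrame.write
  .mode("overwrite")
  .option("delimiter", delimiter)
  .option("quote", "\u0000")   // no effective quote character
  .option("escape", "\u0000")  // no effective escape character
  .option("charset", "UTF-8")
  .csv(output_location)
```

With no quote character in play, the writer should emit the embedded double quote as-is instead of wrapping the field and escaping it.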