I have a CSV file with the data below:
id,name,comp_name
1,raj,"rajeswari,motors"
2,shiva,amber kings
My requirement is to read this file into a Spark RDD and then split each line on the comma delimiter with map. But the code below splits on every comma: val splitdata = data.map(_.split(","))
I do not want to split on commas within double quotes, but I do NOT want to use a regex. Is there a simple, efficient way to achieve this?
My second requirement is to read the above CSV file into a Spark DataFrame and show it, but I need to see the double quotes in the result. The output should look like:
id name comp_name
1 raj "rajeswari,motors"
2 shiva amber kings
Double quotes are not normally shown, but is there any way to do it?
I am using Spark 2.4 / Scala 2.11 / Eclipse IDE.
I would suggest using a DataFrame instead of an RDD.
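As a rough sketch of the DataFrame route: Spark's built-in CSV reader already treats a comma inside double quotes as part of the field, but it strips the quotes while parsing. The snippet below re-adds them for display on any comp_name value that contains a comma. That rule is an assumption on my part (the reader does not record which fields were quoted in the file), and the object name CsvWithQuotes and the path data.csv are placeholders.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, concat, lit, when}

object CsvWithQuotes {
  // Reads the CSV with the built-in reader (quoted commas stay inside the
  // field), then wraps comp_name in double quotes when it contains a comma.
  // Assumption: only values containing a comma were quoted in the file.
  def read(spark: SparkSession, path: String): DataFrame = {
    val df = spark.read
      .option("header", "true") // first line holds the column names
      .csv(path)

    df.withColumn(
      "comp_name",
      when(col("comp_name").contains(","),
        concat(lit("\""), col("comp_name"), lit("\"")))
        .otherwise(col("comp_name"))
    )
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("csv-with-quotes")
      .master("local[*]")
      .getOrCreate()

    // "data.csv" is a placeholder path for the file from the question.
    read(spark, "data.csv").show(false)
    spark.stop()
  }
}
```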
If you do stay with an RDD, there is no direct way to do this with a plain split; you have to use a regex that ignores commas enclosed in double quotes.
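The original snippet is not shown here, so this is only a minimal sketch of that kind of split, using the common "comma followed by an even number of quotes" lookahead. It assumes quotes are balanced and that fields contain no escaped quotes; the names QuoteAwareSplit and splitLine are made up for illustration. It runs on plain strings, and on an RDD the same function could be passed as data.map(QuoteAwareSplit.splitLine).

```scala
object QuoteAwareSplit {
  // Matches a comma only when an even number of double quotes follows it,
  // i.e. when the comma sits outside any quoted field.
  private val CommaOutsideQuotes = """,(?=(?:[^"]*"[^"]*")*[^"]*$)"""

  // limit = -1 keeps trailing empty fields instead of dropping them.
  def splitLine(line: String): Array[String] =
    line.split(CommaOutsideQuotes, -1)

  def main(args: Array[String]): Unit = {
    val lines = Seq(
      "1,raj,\"rajeswari,motors\"",
      "2,shiva,amber kings"
    )
    // Print the third field (comp_name) of each line.
    lines.map(splitLine).foreach(fields => println(fields(2)))
  }
}
```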
You'd get output like this
"rajeswari,motors"
amber kings
Refer to this post for an explanation of the expression: Splitting on comma outside quotes