Is this valid CSV escaping? LumenWorks throws an exception, but other parsers seem to work


I'm updating a legacy system that parses CSV files. I decided to use the LumenWorks CSV library, but it fails on records like this:

"Cats","123","A","B","Mittens","1","2","3","1950",""PROBLEM IS HERE"","Some street","Fishtown","","AB13DF","United Kingdom","","","","","","","United Kingdom","Fiddles"

As far as I can tell this should be escaped as """PROBLEM IS HERE""". Can anyone confirm? If it's valid then I need to find a fix, but if it's not I can inform the client that the CSV they provided isn't formed correctly.

Also if there's a way around this using LumenWorks (a non-hack way ideally) that will prevent it throwing an exception, that'd be good to know. Thanks!

I should add that LumenWorks gives me this: LumenWorks.Framework.IO.Csv.MalformedCsvException: The CSV appears to be corrupt near record ...

Best answer

The LumenWorks site doesn't state that the library is compliant with RFC 4180, but that RFC is the commonly used reference for CSV, and its section 2, rule 7 specifies the expected escaping:

7. If double-quotes are used to enclose fields, then a double-quote appearing inside a field must be escaped by preceding it with another double quote. For example:

"aaa","b""bb","ccc"

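Therefore, to include the string "PROBLEM IS HERE", you need to double the quotes, producing ""PROBLEM IS HERE"", then enclose it in double quotes, producing """PROBLEM IS HERE""". The field as supplied, ""PROBLEM IS HERE"", does not follow that rule, so the record is not valid under RFC 4180.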
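
To check the rule independently of LumenWorks, here is a minimal sketch using Python's standard csv module (not LumenWorks itself, whose behaviour may differ). The helper names write_row, read_row and is_strictly_valid are made up for this illustration, and the three-field row is a shortened version of the record from the question.

```python
import csv
import io


def write_row(fields):
    """Serialise one record with every field quoted and embedded quotes doubled."""
    buf = io.StringIO()
    writer = csv.writer(buf, quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerow(fields)
    return buf.getvalue().rstrip("\n")


def read_row(line, strict=True):
    """Parse one CSV line; strict=True makes the parser raise on bad quoting."""
    return next(csv.reader(io.StringIO(line), strict=strict))


def is_strictly_valid(line):
    """Return True if the line parses in strict mode, False if csv.Error is raised."""
    try:
        read_row(line, strict=True)
        return True
    except csv.Error:
        return False


# A field whose value really contains the surrounding double quotes.
escaped = write_row(["1950", '"PROBLEM IS HERE"', "Some street"])
print(escaped)  # "1950","""PROBLEM IS HERE""","Some street"

# The correctly escaped line round-trips to the original values.
print(read_row(escaped))

# The form from the question is rejected by a strict parser.
print(is_strictly_valid('"1950",""PROBLEM IS HERE"","Some street"'))  # False
```

Lenient parsers (including Python's csv reader when strict is left at its default of False) tend to accept the malformed form and guess at the field's content, which would explain why other parsers seem to work while a stricter one raises an exception.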