I have a CSV file like this:
12012;My Name is Mike. What is your's?;3;0
1522;In my opinion: It's cool; or at least not bad;4;0
21427;Hello. I like this feature!;5;1
I want to get this data into a pandas.DataFrame. But read_csv(sep=";") throws exceptions due to the semicolon in the user-generated message column in line 2 (In my opinion: It's cool; or at least not bad). All remaining columns always have numeric dtypes.
What is the most convenient method to manage this?
Dealing with unquoted delimiters is always a nuisance. In this case, since it looks like the broken text is known to be surrounded by three correctly-encoded columns, we can recover. TBH, I'd just use the standard Python reader and build a DataFrame once from that:
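A minimal sketch of that idea, assuming exactly one column precedes the message and exactly two follow it. The sample data is inlined through io.StringIO so the snippet runs on its own, and the helper name parse_rows is made up for illustration:

```python
import csv
import io

import pandas as pd

# Sample data from the question. For a real file, pass
# open("semi.dat", newline="") instead (the file name is hypothetical).
raw = """12012;My Name is Mike. What is your's?;3;0
1522;In my opinion: It's cool; or at least not bad;4;0
21427;Hello. I like this feature!;5;1
"""


def parse_rows(fp):
    """Keep the first field and the last two, and rejoin everything between."""
    # QUOTE_NONE: treat any quote characters in the messages as literal text.
    reader = csv.reader(fp, delimiter=";", quoting=csv.QUOTE_NONE)
    return [row[:1] + [";".join(row[1:-2])] + row[-2:] for row in reader]


rows = parse_rows(io.StringIO(raw))
# Note: csv yields strings, so every column is object dtype at this point.
df = pd.DataFrame(rows)
print(df)
```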
which produces
Then we can immediately save it and get something quoted correctly:
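A rough sketch of that last step, assuming the repaired rows are already in a DataFrame. With its default quoting (csv.QUOTE_MINIMAL), to_csv wraps any field containing the separator in double quotes, so a plain read_csv(sep=";") can parse the result afterwards. An in-memory buffer stands in for a real output path here:

```python
import io

import pandas as pd

# The repaired rows, as produced by rejoining the split message fields.
df = pd.DataFrame([
    ["12012", "My Name is Mike. What is your's?", "3", "0"],
    ["1522", "In my opinion: It's cool; or at least not bad", "4", "0"],
    ["21427", "Hello. I like this feature!", "5", "1"],
])

# Write without header or index; replace buf with a path to write a file.
buf = io.StringIO()
df.to_csv(buf, sep=";", header=False, index=False)
fixed = buf.getvalue()
print(fixed)

# The quoted file now loads directly, and the numeric columns are inferred.
roundtrip = pd.read_csv(io.StringIO(fixed), sep=";", header=None)
print(roundtrip.dtypes)
```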