How to import multiple CSVs with different encodings into one data frame?

I'm trying to put together two answers on the site to figure out my situation, but no luck so far.

Essentially, I have several CSVs with the same columns but different encodings. That means that when I try the approach here, I also have to iterate through my list of encodings, one per file, which I generated this way:

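# IPython/Jupyter only: "!" runs the shell command and captures its output lines
encodings_raw = !chardetect data/*.csv
# Keep only the encoding name, i.e. the text between "csv: " and " with"
encodings = [x.split('csv: ')[1].split(' with')[0] for x in encodings_raw]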

The value of encodings is:

['Windows-1252', 'UTF-8-SIG', 'ISO-8859-1', 'Windows-1252', 'UTF-8-SIG', 'UTF-8-SIG', 'Windows-1252', 'Windows-1252', 'Windows-1252', 'Windows-1252', 'Windows-1252']
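
As a side note, the two `split` calls above drop the file name, so the encodings can only be matched back to files by position. A minimal sketch of a parser that keeps each path next to its encoding is shown here. It assumes every output line has the shape `data/a.csv: Windows-1252 with confidence 0.73`, which is what the `split` calls imply; `parse_chardetect` and `LINE_RE` are hypothetical names, not part of chardet.

```python
import re

# Assumed line shape: "<path>: <encoding> with confidence <number>"
LINE_RE = re.compile(r"^(?P<path>.+): (?P<encoding>\S+) with confidence")


def parse_chardetect(lines):
    """Return a list of (path, encoding) tuples, one per output line."""
    pairs = []
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            # Fail loudly on lines that do not fit the assumed shape
            raise ValueError(f"unexpected chardetect line: {line!r}")
        pairs.append((match["path"], match["encoding"]))
    return pairs
```

Calling `parse_chardetect(encodings_raw)` would then give pairs that can be passed straight to `pd.read_csv`, without relying on two separate lists staying in the same order.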

I tried a bunch of things, but as I typed out the question I figured out the answer, so I'll just post it below.

Khashir On

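Pair each file with its encoding using zip and pass that encoding to pd.read_csv, where data_files is the list of CSV paths in the same order as encodings: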

df = pd.concat((pd.read_csv(f, encoding=e) for f,e in zip(data_files,encodings)))
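
For completeness, here is a slightly fuller sketch of the same idea with the import and a length check added. The function name `read_csvs_with_encodings` is hypothetical, and `ignore_index=True` is an addition that renumbers the rows of the combined frame; drop it to keep each file's original index as the one-liner above does.

```python
import pandas as pd


def read_csvs_with_encodings(paths, encodings):
    """Read each CSV with its own encoding and stack the rows into one frame."""
    if len(paths) != len(encodings):
        # zip() would silently drop the extra items, so check up front
        raise ValueError("paths and encodings must have the same length")
    frames = [pd.read_csv(path, encoding=enc) for path, enc in zip(paths, encodings)]
    # ignore_index=True gives the result a fresh 0..n-1 index (an assumption,
    # not part of the original one-liner)
    return pd.concat(frames, ignore_index=True)
```

One way to build `data_files` is `sorted(glob.glob('data/*.csv'))`. The shell expands `data/*.csv` in sorted order, so this should usually line up with the chardetect output, but the shell's sort can depend on locale, so it is worth checking that the two lists match before trusting the pairing.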