What are the encoding options for csvcut?


I'm getting errors from csvcut stating that the input is not UTF-8 encoded. According to the documentation (https://csvkit.readthedocs.io/en/latest/scripts/csvcut.html) there's a command-line option [-e ENCODING] that doesn't appear to have any further documentation. What is the list of valid encoding values?

In my particular case, the input files are strictly one byte per character; csvcut is complaining about anything above 0x7f. The files have a smattering of such bytes, e.g. 0x92 (right single quotation mark).

If I scrub the characters above 0x7f, it all works, but it seems to me there should be a more elegant, straightforward solution.

Adding (guessing!) the command-line option -e ASCII didn't seem to do anything.

Ed Beighe On

Thanks @skomisa: yes, latin, latin1, latin-1, and iso-8859-1 are all accepted by the -e command-line option. They all seem to do the same thing, and all of them allow csvcut to deal with the file.
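For anyone who wants to check an encoding name before passing it to -e, here is a minimal Python sketch. It assumes that csvcut hands the -e value to Python's standard codec lookup (csvkit is written in Python, but I have not verified this against its source), and the sample bytes are made up for illustration. It shows that the four names above resolve to one codec, and what happens to byte 0x92 under a few candidate encodings. Note that 0x92 is a right single quotation mark in Windows-1252 (cp1252); in Latin-1 it decodes without error, but to a C1 control character, so cp1252 may be the better fit for files like these.

```python
import codecs

# Hypothetical sample row containing byte 0x92, as described in the question.
sample = b"name,note\nEd,it\x92s fine\n"


def canonical_name(label):
    """Return the canonical codec name Python resolves this label to."""
    return codecs.lookup(label).name


def try_decode(data, encoding):
    """Decode data with the given encoding, or return None if it fails."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


# The four spellings from the answer should all map to the same codec.
aliases = ["latin", "latin1", "latin-1", "iso-8859-1"]
canonical = {canonical_name(a) for a in aliases}
print("canonical codec(s):", canonical)

# Compare how each candidate encoding treats the sample bytes.
for enc in ["utf-8", "ascii", "latin-1", "cp1252"]:
    text = try_decode(sample, enc)
    if text is None:
        print(enc, "-> cannot decode byte 0x92")
    else:
        # Show the code point that byte 0x92 was mapped to.
        code_point = ord(text[text.index("it") + 2])
        print(enc, "-> decoded, 0x92 became U+%04X" % code_point)
```

This would also explain why -e ASCII made no difference: ASCII only covers 0x00 to 0x7f, so the same bytes are still rejected.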

Now I have another problem, but that's for another question!