(Ruby 2.5) I have a method that reads and parses a CSV file that is uploaded via Alchemy CMS:
def process_csv(csv_file, current_user_id, original_filename)
  lock_importer
  errors = []
  index = 0
  string_converter = lambda { |field| field.strip }
  total = CSV.foreach(csv_file, headers: true).count
  csv_string = csv_file.read.encode!("UTF-8", "iso-8859-1", invalid: :replace)
  CSV.parse(csv_string,
            headers: true,
            header_converters: :symbol,
            skip_blanks: true,
            converters: [string_converter]) do |row|
    # do other stuff
  end
end
However, when I upload a CSV file whose name column contains a string with special characters, I get an "Invalid byte sequence in UTF-8" error. The value I'm testing with is N'öt Réal Stô'rë.
I've tried a few solutions that I found on the web but no luck - any suggestions?
It's unclear what your csv_file is. I guess it is a File object. Sometimes I get CSV files from Excel encoded as UTF-16, so let's try an example:
I have a csv-file stored in UTF-16BE with the following content:
If I execute the following code:
then I also get an "Invalid byte sequence in UTF-8" error.
If I use
then everything works.
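A minimal sketch of this kind of experiment is shown here. The sample content and the temp-file handling are assumptions made for illustration. Only the test value and the UTF-16BE encoding come from the discussion. It writes a small CSV as UTF-16BE, shows that the bytes are not valid when read as UTF-8, and then reads the same file with the real external encoding declared so that Ruby transcodes it to UTF-8.

```ruby
require 'csv'
require 'tempfile'

# Hypothetical sample data; the name column holds the value from the question.
content = "id,name\n1,N'öt Réal Stô'rë\n"

# Store the sample as UTF-16BE bytes, as an Excel export might be.
file = Tempfile.new(['example_utf16BE', '.txt'])
file.close
File.binwrite(file.path, content.encode('UTF-16BE'))

# Reading the bytes while claiming they are UTF-8 gives an invalid string;
# this is the situation in which parsing fails with an invalid byte sequence.
raw = File.open(file.path, 'r:UTF-8') { |f| f.read }
utf8_valid = raw.valid_encoding?

# Declaring the real external encoding (UTF-16BE) and asking for UTF-8 as the
# internal encoding lets Ruby transcode while reading.
text = File.open(file.path, 'r:UTF-16BE:UTF-8') { |f| f.read }
rows = CSV.parse(text, headers: true, header_converters: :symbol)
name = rows.first[:name]
```

The mode string 'r:UTF-16BE:UTF-8' means "external encoding UTF-16BE, internal encoding UTF-8". If the uploaded file is in a different encoding, the first part has to be adapted accordingly.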
So I guess you get a File object in the wrong encoding, and the code csv_file.read.encode!("UTF-8", "iso-8859-1", invalid: :replace) is an attempt to repair this problem.
What you can do:
Add to your code:
You should get
Now check whether the file (in my example: example_utf16BE.txt) really has the encoding shown in the 2nd line. If not, try to adapt the creation of the File object. If this is not possible, you can try csv_file.set_encoding 'utf-8' to change the encoding before you read the content.
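As a rough sketch of the last suggestion: the file below really contains UTF-8 bytes but is opened with a wrong declared encoding (ISO-8859-1 is an assumption chosen for illustration, as are the file name and content). external_encoding shows what Ruby believes the file contains, and set_encoding corrects it before the content is read.

```ruby
require 'tempfile'

# Hypothetical upload: the bytes on disk are UTF-8.
file = Tempfile.new(['example_utf8', '.csv'])
file.close
File.binwrite(file.path, "id,name\n1,N'öt Réal Stô'rë\n")

# Simulate a File object that was created with the wrong external encoding.
csv_file = File.open(file.path, 'r:ISO-8859-1')
declared = csv_file.external_encoding

# Correct the encoding before reading any content.
csv_file.set_encoding('UTF-8')
corrected = csv_file.external_encoding
text = csv_file.read
csv_file.close
```

This only helps when the bytes really are UTF-8. If the file is stored in another encoding such as UTF-16BE, that encoding has to be passed to set_encoding instead.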