Excel .csv lines don't contain quotations in Excel, but other editors add them. MySQL doesn't like this

43 Views Asked by At

I am using DbVisualizer 23.1 to edit MySQL queries. I am learning through the various sources that exist, and I've run into a frustrating problem. I created two .csv files in Excel 365 with the tables you can see here: https://sqlbolt.com/lesson/select_queries_with_expressions

Movies and Boxoffice only have the A column populated with ',' as the expected delimiter. I don't have quotations around the lines, but I do have quotations around "Monsters, Inc." [that's not the compiling error].

When I open the .csv in Notepad or MySQL there are quotations around each line. It's frustrating but that's fine because we use ENCLOSED BY '"' in our INFILE block, right? When I try to run the following:

LOAD DATA INFILE 'C:/ProgramData/MySQL/MySQL Server 8.0/Uploads/Boxoffice.csv'
INTO TABLE Boxoffice
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
(Movie_id, Rating, Domestic_sales, International_sales);

I receive the error:

[Code: 1366, SQL State: HY000] Incorrect integer value: '"5,8.2,380843261,555900000" "14,7.4,268492764,475066843" "8,8,206445654,417277164" "12,6.4,191452396,368400000" "3,7.9,24585' for column 'Movie_id' at row 1

It doesn't appear to like the quotations that appear in the Excel-created csv. I can get the table populated if I include SET Id = CAST(REPLACE(@Movie_id, '"', '') AS UNSIGNED) but am I going to have to do this every time? All of the literature I read tells me that ENCLOSED BY '"' or OPTIONALLY ENCLOSED BY '"' should make the quotations a moot point. What am I missing?

Edit: Adding the Notepad version of the Boxoffice.csv (remember the Excel version doesn't have the quotations that I discussed above):

"5,8.2,380843261,555900000"
"14,7.4,268492764,475066843"
"8,8,206445654,417277164"
"12,6.4,191452396,368400000"
"3,7.9,245852179,239163000"
"6,8,261441092,370001000"
"9,8.5,223808164,297503696"
"11,8.4,415004880,648167031"
"1,8.3,191796233,170162503"
"7,7.2,244082982,217900167"
"10,8.3,293004164,438338580"
"4,8.1,289916256,272900000"
"2,7.2,162798565,200600000"
"13,7.2,237283207,301700000"
1

There are 1 best solutions below

4
Lelio Faieta On

I'd add the following line to your instructions:

LINES TERMINATED BY '"\n' STARTING BY '"'

that should do the trick saying to mysql that the double quotes are just wrapping an entire new line.

[Source][1]

So your code should become:

LOAD DATA INFILE 'C:/ProgramData/MySQL/MySQL Server 8.0/Uploads/Boxoffice.csv'
INTO TABLE Boxoffice
FIELDS TERMINATED BY ','
LINES TERMINATED BY '"\n' STARTING BY '"'
(Movie_id, Rating, Domestic_sales, International_sales);

See that this is untested and may require some more tweaks to adjust to your real situation [1]: https://dev.mysql.com/doc/refman/8.0/en/load-data.html