I'm trying to update a table that was working fine a minute ago, but now I've hit an error that I've got in the past that's extremely inconvenient.
09:32:57 Copying rows caused a MySQL error 1300:
Level: Warning
Code: 1300
Message: Invalid utf8mb4 character string: '94C494'
Is there an option or something so that I can ignore these warnings? After all it is just a warning, so MySQL still has the ability to move forward and copy the rows anyway.
Other solutions would be fine, too, like a way to find the offending value in the table, or to remove the invalid pieces, or just something so I can alter my table would be amazing.
My Googling on how to find/replace broken utf8 sequences in MySQL bring to me to these links (that aren't so helpful)
- https://dba.stackexchange.com/questions/77101/how-to-find-non-utf8-data-in-mysql/77154
- Detecting utf8 broken characters in MySQL
I've even tried searching for all columns where the hex contains the invalid sequence and still somehow with no luck
select * from `notifications`
where hex(`Description`) like '%94C494%'
or hex(`Title`) like '%94C494%'
or hex(`NotificationID`) like '%94C494%'
or hex(`ToUserID`) like '%94C494%'
or hex(`FromUserID`) like '%94C494%'
or hex(`Link`) like '%94C494%'
or hex(`Icon`) like '%94C494%';
MySQL is version 5.7.18-15-57-log and pt-online-schema-change 3.0.8
Even stranger, I decided to humor myself and search ALL columns (not just the utf8mb4 ones) and I got rows! But the only rows I got were from my Binary columns? Why would having invalid utf8 sequences matter in a binary column? Now I think it might be a bug with the tool
select * ,hex(`notificationid`), hex(`notificationbid`), hex(`fromuserid`), hex(`touserid`), hex(`title`), hex(`description`), hex(`read`), hex(`datetimeadded`), hex(`link`), hex(`icon`), hex(`shown`), hex(`_linkdescriptionsha256`), hex(`_touseridlinkdescription+sha3-224`)
from `notifications`
where hex(`notificationid`) like '%94C494%'
or hex(`notificationbid`) like '%94C494%'
or hex(`fromuserid`) like '%94C494%'
or hex(`touserid`) like '%94C494%'
or hex(`title`) like '%94C494%'
or hex(`description`) like '%94C494%'
or hex(`read`) like '%94C494%'
or hex(`datetimeadded`) like '%94C494%'
or hex(`link`) like '%94C494%'
or hex(`icon`) like '%94C494%'
or hex(`shown`) like '%94C494%'
or hex(`_linkdescriptionsha256`) like '%94C494%'
or hex(`_touseridlinkdescription+sha3-224`) like '%94C494%';
After working with Percona support on this issue, we eventually led to the creation of this ticket: https://jira.percona.com/browse/PT-1528
The trick for ignoring collation issues (or working around this bug) is by adding the
--charset binary
flag to thept-online-schema-change
command.Seems to happen when the primary key is a binary column and the charset is either set to utf8(mb4) or inferred as one of those from the MySQL settings.