My SQL statement is not comparing unicode emojis properly?

117 Views Asked by At

fullEmoji is a property stored in discord.js ParsedEmoji object which should just be a literal emoji ''.

The ParsedEmoji is just an interface but I get that from calling parseEmoji() which is a private it's mainly for the discord custom emojis since if the ParsedEmoji.id is not valid I know it's either a unicode or not an emoji and then I just do a regex test with emoji-regex to see if it's a unicode emoji

const fullEmoji: string = parsedEmoji.id ? 
`<:${parsedEmoji.name}:${parsedEmoji.id}>` : parsedEmoji.name;

connPool.query<ResultSetHeader>(`
    DELETE FROM emoji_role_links WHERE messages_reactable_id = ${reactableMsg.id}
    AND emoji = '${fullEmoji}';
`)

When I run this query it ends up deleting all emoji-link rows and I'm assuming it's because the AND emoji = '${fullEmoji}' is not properly working.

The charset I am using is utf8mb4 and collation utf8mb4_0900_ai_ci

Listing the Rows below:


id, emoji, role_id, messages_reactable_id

'3', '', '1083419092192600086', '2'

'4', '', '1099145715982278696', '2'


SHOW CREATE TABLE:

Table, Create Table

CREATE TABLE emoji_role_links (
 id int unsigned NOT NULL AUTO_INCREMENT,
 emoji varchar(64) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL,
 role_id varchar(22) COLLATE utf8mb4_unicode_ci NOT NULL,
 messages_reactable_id int unsigned DEFAULT NULL,
 PRIMARY KEY (id),
 UNIQUE KEY role_id_UNIQUE (role_id),
 KEY fk_messages_reactable_id_idx (messages_reactable_id),
 CONSTRAINT fk_messages_reactable_id FOREIGN KEY (messages_reactable_id) REFERENCES messages_reactable (id)
) ENGINE=InnoDB AUTO_INCREMENT=5 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci

So this is showing the collation to be utf8mb4_unicode_ci

I don't know what more to try I have tried a little researching and couldn't find much directly related to this I don't know what exact problem I have here since this is all in the query itself

one thing I would try if there is no way to get this to work is to escape the emoji's and unescaped them again later but I would have to rewrite a lot of code so I'm saving that as a last resort

1

There are 1 best solutions below

1
Rick James On BEST ANSWER

SHOW VARIABLES LIKE 'coll%'; to see if the connection the test will be performed with. Note that the column is using utf8mb4_unicode_ci.

Note that 'ai_ci' treats those emogi as different, but the [vary old] 'unicode' collation treats them as equal.

mysql> SELECT '' = '' COLLATE utf8mb4_0900_ai_ci;
+--------------------------------------+
| '?' = '?' COLLATE utf8mb4_0900_ai_ci |
+--------------------------------------+
|                                    0 |
+--------------------------------------+
1 row in set (0.00 sec)

mysql> SELECT '' = '' COLLATE utf8mb4_unicode_ci;
+--------------------------------------+
| '?' = '?' COLLATE utf8mb4_unicode_ci |
+--------------------------------------+
|                                    1 |
+--------------------------------------+
1 row in set (0.00 sec)

unicode refers to Unicode standard 4.0; 0900 refers to 9.0.