I've run into a problem using MySQL in Docker. When I insert non-ASCII characters into the database directly from the initialization SQL script, the characters display correctly in the MySQL console, but the bytes actually stored are wrong.
I built a MySQL container with a minimal SQL script to reproduce the problem.
Here's the structure of my directory:
.
├── docker-compose.yml
└── my-sql
    ├── Dockerfile
    └── ddl
        └── mySQL.sql
docker-compose.yml
version: '3.8'
services:
  mysql:
    build:
      context: ./my-sql/
      dockerfile: Dockerfile
    container_name: mysql
    expose:
      - 3306
    ports:
      - 3306:3306
    environment:
      MYSQL_ROOT_PASSWORD: test
      MYSQL_USER: test
      MYSQL_PASSWORD: test
      MYSQL_DATABASE: test
    volumes:
      - "mysql:/var/lib/mysql"
volumes:
  mysql:
Dockerfile
FROM mysql:8.2.0
USER 999:999
COPY ./ddl /docker-entrypoint-initdb.d/
mySQL.sql
CREATE TABLE IF NOT EXISTS `test`(
    `test` VARCHAR(100)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
INSERT INTO `test` (`test`) VALUES ("🤮");
INSERT INTO `test` (`test`) VALUES ("平");
ALTER DATABASE test CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci; # Does not work
When I select the test column of the test table from the MySQL console, I get this:
mysql> SELECT `test` FROM `test`;
+------+
| test |
+------+
| 🤮 |
| 平 |
+------+
mysql> SELECT HEX(`test`) FROM `test`;
+------------------+
| HEX(`test`)      |
+------------------+
| C3B0C5B8C2A4C2AE |
| C3A5C2B9C2B3     |
+------------------+
I looked up how these characters are represented in various encodings, and none of them matched these byte sequences. As far as I know, a UTF-8 character is at most 4 bytes long, yet the emoji takes 8 bytes here. I also noticed that the "|" alignment in MySQL's output is off, and the misalignment grows with the length of the character's hexadecimal encoding.
I inspected the bytes of the .sql script in VS Code and, at least, the emoji is correctly encoded there (F0 9F A4 AE).
I also tried another MySQL version (8.0.36), but the result is the same.
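To compare the bytes in the file with what HEX() returns, a small sketch like the following can help. It takes a character's UTF-8 bytes, misreads them as a single-byte charset, and re-encodes the result as UTF-8. The choice of cp1252 is an assumption on my part (nothing in the setup above confirms which charset, if any, is involved), and `double_encode` is a made-up helper name used only for illustration.

```python
def double_encode(text: str, wrong_charset: str = "cp1252") -> str:
    """Hex of `text` after its UTF-8 bytes are misread as `wrong_charset`
    (assumed here, not confirmed) and then re-encoded as UTF-8."""
    raw = text.encode("utf-8")           # bytes as they appear in the .sql file
    misread = raw.decode(wrong_charset)  # each byte treated as one character
    return misread.encode("utf-8").hex().upper()


# The two characters inserted by the script: U+1F92E (F0 9F A4 AE) and 平.
for ch in ("\U0001F92E", "平"):
    print(ch.encode("utf-8").hex().upper(), "->", double_encode(ch))
```

If the printed values match the HEX() output from the table, that would suggest the script's bytes are being interpreted in a single-byte charset somewhere between the file and the table, although this check alone does not show where.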
Thanks in advance