Long story short: we have a PHP-based self-developed CMS, originally on PHP5.x and MySQL, using a healthy combination of utf8 and iso-8859-1 char-sets (don't judge, I know it's weird but it's working). On our production environment our server provider upgraded to PHP7.2 and (after a few weeks of refactoring) everything works just fine.
Parallel to this production environment I've set up (or at least tried to) a test environment for our development: VirtualBox Ubuntu 20.04, apache2.4, PHP7.2 and MySQL5.7.
In /etc/php/7.2/apache2/php.ini I have:
default_charset = "iso-8859-1"
In /etc/mysql/my.cnf I have:
[client]
default-character-set = utf8
[mysqld_safe]
default-character-set = utf8
[mysql]
default-character-set = utf8
[mysqld]
init_connect = 'SET NAMES utf8'
character-set-client-handshake = false #force encoding to utf8
character-set-server = utf8
collation-server = utf8_unicode_ci
Now, on our development server character_set_client=utf8mb4 and character_set_results=utf8mb4, and I can't find a way to change it.
The problem is that when I try to import dumps from our production server on our development server (through our CMS), or when I try to save text with special characters like ü or ä, it always cuts the word at that character and saves only the part before it, e.g. instead of chüd it saves only ch, and instead of einträge it saves only eintr.
However, I can save ü manually in the DB without a problem (I don't have to use ü).
(We have a second development server with Ubuntu 14.04, apache2.4, PHP5.6, MySQL5.7 and basically the same settings as on the PHP7.2 test server, and everything works fine there.)
Maybe PHP7.2 is causing the mess here; I am really out of ideas.
Any help will be appreciated. Thank you
See "truncation" in Trouble with UTF-8 characters; what I see is not what I stored
I wonder if having Apache not set to UTF-8 messes up <form>s.
init_connect = 'SET NAMES utf8' sets 3 CHARACTER_SET_% values if you are not connecting as "root". So, change it to utf8mb4 and do not connect as "root".
Are you sure about the encoding in the imported data? (I suspect this causes the truncation problem.) Can you get a hex dump of a small portion of the data?
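
One way to produce such a hex dump, and to see why ISO-8859-1 bytes sent over a UTF-8 connection would be cut off exactly at ü or ä, is sketched below. It is a standalone Python illustration, not part of the CMS; the helper names hex_dump and valid_utf8_prefix are made up for this example, and the "keep only the valid UTF-8 prefix" step merely mimics the truncation suspected above rather than calling MySQL.

```python
def hex_dump(data: bytes) -> str:
    """Return the bytes as space-separated two-digit hex values."""
    return data.hex(" ")


def valid_utf8_prefix(data: bytes) -> bytes:
    """Return the longest leading run of bytes that decodes as UTF-8."""
    try:
        data.decode("utf-8")
        return data
    except UnicodeDecodeError as err:
        # err.start is the index of the first byte that is not valid UTF-8.
        return data[:err.start]


# The two words from the question, encoded both ways.
for word in ("chüd", "einträge"):
    latin1 = word.encode("iso-8859-1")
    utf8 = word.encode("utf-8")
    print("iso-8859-1 bytes:", hex_dump(latin1))
    print("utf-8 bytes:     ", hex_dump(utf8))
    # Treating the iso-8859-1 bytes as UTF-8 stops at the accented letter.
    print("valid prefix:    ", valid_utf8_prefix(latin1).decode("ascii"))
```

If the dump of the imported data shows a single byte such as fc or e4 where the accented letter should be, the data is ISO-8859-1 rather than UTF-8; a two-byte sequence such as c3 bc or c3 a4 indicates UTF-8.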
For Western European languages, MySQL's utf8 and utf8mb4 work the same. That is, the init_connect that you have should be adequate if the incoming data is really UTF-8, not iso...
For reference here are hex values: