I've written a WordPress plugin that sends out new-post notifications. It has a setting that converts subject lines from HTML entities to quoted-printable (MIME encoded-words) so that they display correctly as UTF-8 in any email client. A few weeks ago I started getting reports that the quoted-printable subject line was being shown as-is instead of being decoded.
Sample Subject header:
Subject: =?UTF-8?Q?[Pranamanasyoga]=20Foro=20Pranamanasyoga=20:=20estr?= =?UTF-8?Q?=C3=A9s=20y=20resilencia?=
I cannot replicate it locally and have not been able to find any common denominators between reporters.
The code that generates the quoted-printable line is this:
<?php
$enc = iconv_get_encoding( 'internal_encoding' ); // this is UTF-8
$preferences = ['input-charset' => $enc, 'output-charset' => "UTF-8", 'scheme' => 'Q' ];
$filtered_subject = '[Pranamanasyoga] Foro Pranamanasyoga : estrés y resilencia';
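// Decode HTML entities, then build a complete "Subject: ..." header line.
$encoded = iconv_mime_encode( 'Subject', html_entity_decode( $filtered_subject ), $preferences );
// Strip the leading "Subject: " so that only the encoded value remains.
$encoded = substr( $encoded, strlen( 'Subject: ' ) );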
If I try decoding it, it works fine:
$decoded = iconv_mime_decode($encoded, 0, "UTF-8");
var_dump( [ 'encoded' => $encoded, 'decoded' => $decoded ] );
Result:
array(2) {
["encoded"]=>
string(102) "=?UTF-8?Q?[Pranamanasyoga]=20Foro=20Pranamanasyoga=20:=20estr?=
=?UTF-8?Q?=C3=A9s=20y=20resilencia?="
["decoded"]=>
string(59) "[Pranamanasyoga] Foro Pranamanasyoga : estrés y resilencia"
}
One thing I noticed, though I don't think it is related: my code adds a newline before the second =?UTF-8?Q? piece, and the Subject header in the delivered email does not have it. Decoding the string with and without the newline gives the same result.
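For reference, a minimal sketch of that with/without-newline comparison. It assumes the fold is a CRLF followed by a single space, which would account for the 102-byte length that var_dump reports above (63 + 36 bytes for the two encoded-words, plus 3). The variable names are only illustrative, and the unfolding regex is my approximation of what a mail server does to the header, not something I have confirmed:

```php
<?php
// Folded form: the two encoded-words separated by CRLF + space (assumed fold).
$folded = "=?UTF-8?Q?[Pranamanasyoga]=20Foro=20Pranamanasyoga=20:=20estr?="
        . "\r\n =?UTF-8?Q?=C3=A9s=20y=20resilencia?=";

// Unfold: drop every CRLF that is immediately followed by a space or tab.
// This leaves a single space between the encoded-words, as in the sample header.
$unfolded = preg_replace( '/\r\n(?=[ \t])/', '', $folded );

$decoded_folded   = iconv_mime_decode( $folded, 0, 'UTF-8' );
$decoded_unfolded = iconv_mime_decode( $unfolded, 0, 'UTF-8' );

var_dump( [
    'unfolded'         => $unfolded,
    'decoded_folded'   => $decoded_folded,
    'decoded_unfolded' => $decoded_unfolded,
    'same'             => $decoded_folded === $decoded_unfolded,
] );
```

If 'same' is true, the missing newline on its own should not change how a compliant decoder reads the header.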
Does anyone have ideas/suggestions on what may be causing the email clients (Gmail included) to display the string as-is, instead of decoding it to UTF-8?
P.S. While writing this I saw a suggestion in a different thread to use mb_encode_mimeheader(). It seems to work well with iconv_mime_decode() in my test code, but its output differs from the original string:
[Pranamanasyoga] Foro Pranamanasyoga : =?UTF-8?Q?estr=C3=A9s=20y=20resile?=
=?UTF-8?Q?ncia?=
Could it be that email clients would prefer this format over the original one?
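In case it helps anyone reproduce the comparison, here is a rough sketch of the mb_encode_mimeheader() test. It assumes the mbstring and iconv extensions are both loaded and that the internal encoding is set to UTF-8. The exact folding of the output may differ between PHP versions, so the sketch only checks the round trip rather than a specific encoded string:

```php
<?php
// mb_encode_mimeheader() interprets its input using the internal encoding.
mb_internal_encoding( 'UTF-8' );

$subject = '[Pranamanasyoga] Foro Pranamanasyoga : estrés y resilencia';

// 'Q' selects quoted-printable style encoded-words; "\r\n" is the fold sequence.
$mb_encoded = mb_encode_mimeheader( $subject, 'UTF-8', 'Q', "\r\n" );

// Round-trip through iconv_mime_decode() to see whether anything was lost.
$mb_decoded = iconv_mime_decode( $mb_encoded, 0, 'UTF-8' );

var_dump( [
    'encoded'       => $mb_encoded,
    'decoded'       => $mb_decoded,
    'round_trip_ok' => $mb_decoded === $subject,
] );
```

The visible difference from the iconv_mime_encode() output is that the ASCII-only part of the subject is left unencoded and only the part containing the non-ASCII character is wrapped in encoded-words.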