PHP emoji to unicode not converting more than one emoji appropriately


This function converts an emoji to its Unicode code point:

function emoji_to_unicode($emoji) {
   $emoji = mb_convert_encoding($emoji, 'UTF-32', 'UTF-8');
   $unicode = strtoupper(preg_replace("/^[0]+/","U+",bin2hex($emoji)));
   return $unicode;
}

Usage:

$var = ("😀");
echo emoji_to_unicode($var);

This returns U+1F600. The problem is that if I add more emoji to $var, only the first one is converted correctly. For example:

$var = ("😀😀");
echo emoji_to_unicode($var);

This returns U+1F6000001F600 when it should return U+1F600 U+1F600.

It works fine when converting a single emoji, but not when converting multiple emojis.


There are 2 best solutions below

6
On BEST ANSWER

One way to do this is to iterate over each character in $var, converting it as you go. To make the function more robust, you should replace only the first 3 leading zeros (so as not to mess up values that, e.g., start with 4). That way the function will work with all characters. I've also added a check (using mb_ord) for whether the character needs conversion, so that it works with plain text too:

function emoji_to_unicode($emoji) {
    // Leave characters below U+0100 (ASCII and Latin-1) unchanged.
    if (mb_ord($emoji) < 256) return $emoji;
    // UTF-32 encodes the character as 4 bytes, i.e. 8 hex digits.
    $emoji = mb_convert_encoding($emoji, 'UTF-32', 'UTF-8');
    $unicode = strtoupper(preg_replace("/^[0]{3}/","U+",bin2hex($emoji)));
    return $unicode;
}

$var = ("😀x😀hello");
$out = '';
// Convert one character at a time.
for ($i = 0; $i < mb_strlen($var); $i++) {
    $out .= emoji_to_unicode(mb_substr($var, $i, 1));
}
echo "$out\n";

Output:

U+1F600xU+1F600hello

Demo on 3v4l.org
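The per-character loop above can also be sketched without the UTF-32 round trip, by formatting the value that mb_ord() already returns. This is only an illustration, not the answer's own code: the function name text_to_codepoints is hypothetical, mb_str_split() is assumed to be available (PHP 7.4+), and the 256 threshold is carried over from the answer.

```php
<?php
// Hypothetical helper (not from the original answer): replaces every
// character at or above U+0100 with its U+XXXX notation.
function text_to_codepoints(string $text): string
{
    $out = '';
    // mb_str_split() splits by code point rather than by byte (PHP 7.4+).
    foreach (mb_str_split($text, 1, 'UTF-8') as $char) {
        $code = mb_ord($char, 'UTF-8');
        // Same threshold as the answer: characters below 256 pass through.
        $out .= $code < 256 ? $char : sprintf('U+%04X', $code);
    }
    return $out;
}

echo text_to_codepoints("\u{1F600}x\u{1F600}hello"), "\n"; // U+1F600xU+1F600hello
```

Because sprintf('U+%04X', ...) pads to at least four hex digits, a BMP character such as U+2764 would come out as U+2764 rather than with a stray leading zero.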

1
On
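function emoji_to_unicode($emoji) {
    $emoji = mb_convert_encoding($emoji, 'UTF-32', 'UTF-8');
    // Each "0001" prefix becomes " U+1", so this is meant for code points in the U+1xxxx range.
    $unicode = strtoupper(preg_replace("/0{3}1/"," U+1",bin2hex($emoji)));
    // Drop the space that precedes the first code point.
    return ltrim($unicode);
}

$var = ("😀😀");
echo emoji_to_unicode($var); // U+1F600 U+1F600

$var = ("😀😀😀");
echo emoji_to_unicode($var); // U+1F600 U+1F600 U+1F600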
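The regex above keys on the "0001" prefix, which fits emoji in the U+1xxxx range. A possible variation, sketched here under the assumption that every code point should be listed, splits the hex string into fixed 8-digit chunks instead, since UTF-32 stores each code point in exactly 4 bytes. The function name codepoints_from_utf32 is hypothetical, and 'UTF-32BE' is used to make the byte order explicit.

```php
<?php
// Hypothetical helper (not from the original answer): returns the
// space-separated U+XXXX notation of every code point in $text.
function codepoints_from_utf32(string $text): string
{
    if ($text === '') {
        return '';
    }
    // UTF-32BE uses exactly 4 bytes per code point, i.e. 8 hex digits.
    $hex = bin2hex(mb_convert_encoding($text, 'UTF-32BE', 'UTF-8'));
    $parts = [];
    foreach (str_split($hex, 8) as $chunk) {
        // hexdec() drops the leading zeros; %04X pads back to four digits.
        $parts[] = sprintf('U+%04X', hexdec($chunk));
    }
    return implode(' ', $parts);
}

echo codepoints_from_utf32("\u{1F600}\u{1F600}"), "\n"; // U+1F600 U+1F600
echo codepoints_from_utf32("\u{2764}"), "\n";           // U+2764
```

Splitting on a fixed width avoids depending on what the digits look like, so characters outside the U+1xxxx range are handled the same way as the emoji in the question.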