I was looking for a proper way in PHP to detect whether a string contains only characters from the BMP (Basic Multilingual Plane), but I found nothing. Even mb_check_encoding and mb_detect_encoding offer no help in this particular case.
So I wrote my own code:
<?php
function is_bmp($string) {
    $str_ar = mb_str_split($string);
    foreach ($str_ar as $char) {
        /*Check if there's any character's code point outside the BMP range*/
        if (mb_ord($char) > 0xFFFF)
            return false;
    }
    return true;
}
/*String containing a non-BMP Unicode character (U+1F600, written as an escape)*/
$string = "blah \u{1F600} blah";
var_dump(is_bmp($string));
?>
Outputs:
bool(false)
Now my question is:
Is there a better approach, and are there any flaws in this one?
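One alternative I have considered is a single regular expression instead of the per-character loop. This is only a sketch under assumptions: is_bmp_regex is a name I made up for illustration, and the input is assumed to be UTF-8. If the string is not valid UTF-8, preg_match returns false, and this version reports that as "not BMP" as well.

```php
<?php
// Hypothetical helper (not part of the code above): reports whether a
// UTF-8 string contains only code points up to U+FFFF.
function is_bmp_regex(string $string): bool {
    // The /u modifier makes PCRE treat the pattern and the subject as UTF-8.
    // The character class matches any supplementary-plane code point.
    // preg_match returns 1 on a match, 0 on no match, false on error.
    $result = preg_match('/[\x{10000}-\x{10FFFF}]/u', $string);

    // Only "no match" means every character is inside the BMP.
    return $result === 0;
}

var_dump(is_bmp_regex("blah \u{1F600} blah")); // bool(false)
var_dump(is_bmp_regex("blah blah"));           // bool(true)
```

This avoids building an array of characters, but I am not sure it is actually better, which is part of what I am asking.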