Bad word check in PHP using stripos


I implemented this "bad word" check function in PHP:

# bad word detector
function check_badwords($string) {
    $badwords = array(/* a number of words some may find inappropriate for SE */);
    foreach($badwords as $item) {
        if(stripos($string, $item) !== false) return true;
    }
    return false;
}

It works alright, except I'm having a little problem. If the $string is:

Who is the best guitarist ever?

...it returns true, because "ho" (in the $badwords array) matches inside "Who" (in $string). How could the function be modified so that it only checks for complete words, and not parts of words?

  • check_badwords('She is a ho'); //should return true
  • check_badwords('Who is she?'); //should return false

Thanks!

There are 3 best solutions below

1
On BEST ANSWER

In order to check for complete words you should use regular expressions:

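function check_badwords($string)
{
    $badwords = array(/* the big list of words here */);
    // Escape regex metacharacters in each word, then create the regex;
    // the "i" flag keeps the match case-insensitive, like stripos()
    $quoted = array_map(function ($word) { return preg_quote($word, '/'); }, $badwords);
    $re = '/\b('.implode('|', $quoted).')\b/i';
    // Check if it matches the sentence; preg_match() returns 1 on a match
    return preg_match($re, $string) === 1;
}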

How the regex works

The regular expression starts and ends with the special sequence \b, which matches a word boundary (i.e. a position where a word character is followed by a non-word character or vice versa, including the start or end of the string when it borders a word character; the word characters are the letters, the digits and the underscore).

Between the two word boundaries there is a subpattern that contains all the bad words separated by |. The subpattern matches any of the bad words.
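
To see the word boundaries in action, here is a minimal sketch. The helper name has_whole_word and the one-word list are placeholders made up for illustration; they are not part of the answer's function.

```php
<?php
// Hypothetical helper: true if any listed word occurs as a whole word.
function has_whole_word(string $text, array $words): bool
{
    // Escape each word so regex metacharacters in the list are taken literally.
    $quoted = array_map(function ($w) { return preg_quote($w, '/'); }, $words);
    // \b anchors both ends to a word boundary; the i flag ignores case.
    $re = '/\b(?:' . implode('|', $quoted) . ')\b/i';
    return preg_match($re, $text) === 1;
}

$words = array('ho'); // placeholder list, taken from the question's example

var_dump(has_whole_word('She is a ho', $words)); // bool(true)
var_dump(has_whole_word('Who is she?', $words)); // bool(false)
var_dump(has_whole_word('Ho, ho, ho!', $words)); // bool(true)
```

Because the underscore counts as a word character, a string such as "ho_ho" would not match under this pattern.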

If you want to know which bad word was found, you can change the function:

function check_badwords($string)
{
    $badwords = array(/* the big list of words here */);
    $quoted = array_map(function ($word) { return preg_quote($word, '/'); }, $badwords);
    $re = '/\b('.implode('|', $quoted).')\b/i';
    // Check for matches, save the first match in $match
    $result = preg_match($re, $string, $match);
    // If $result is 1 then $match[1] contains the first bad word found in $string
    return $result;
}
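
As written, the matched word stays inside the function. One possible way to hand it back to the caller is sketched below. The name find_badword, the second parameter, and the null return value are assumptions for illustration, not something the answer specifies.

```php
<?php
// Hypothetical variant: returns the first listed word found, or null if none.
function find_badword(string $string, array $badwords): ?string
{
    $quoted = array_map(function ($w) { return preg_quote($w, '/'); }, $badwords);
    $re = '/\b(' . implode('|', $quoted) . ')\b/i';
    // preg_match() returns 1 on a match; $match[1] then holds the first group.
    if (preg_match($re, $string, $match) === 1) {
        return $match[1];
    }
    return null;
}

var_dump(find_badword('She is a Ho', array('ho'))); // string(2) "Ho"
var_dump(find_badword('Who is she?', array('ho'))); // NULL
```

The returned text is the word as it appears in the input, so its case may differ from the entry in the list.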
3
On

You would probably like to replace stripos with preg_match.

If you can make it a better regex, more power to you:

preg_match("/\s($string){1}\s/", $input_line, $output_array);
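
One thing to watch with \s on both sides: it needs an actual whitespace character before and after the word, so a word at the very start or end of the input, or next to punctuation, is not matched. A quick check, using a placeholder word and made-up input strings:

```php
<?php
$word = 'ho'; // placeholder word from the question

// \s on both sides requires whitespace characters around the word.
var_dump(preg_match('/\s(' . $word . ')\s/', 'She is a ho'));      // int(0): nothing follows the word
var_dump(preg_match('/\s(' . $word . ')\s/', 'a ho in the text')); // int(1)

// \b also matches at the string edges and next to punctuation.
var_dump(preg_match('/\b(' . $word . ')\b/', 'She is a ho'));      // int(1)
```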
0
On

You could also lowercase $string, split it into words, and then, instead of using stripos or a regular expression, just use in_array() on each word. That matches only whole words.
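
A rough sketch of that idea follows. The function name, the use of str_word_count() for splitting, and the assumption that the list holds lowercase entries are illustrative choices, not from the answer.

```php
<?php
// Hypothetical sketch of the in_array() approach.
// Assumes every entry in $badwords is already lowercase.
function check_badwords_in_array(string $string, array $badwords): bool
{
    // Format 1 makes str_word_count() return the words as an array.
    $words = str_word_count(strtolower($string), 1);
    foreach ($words as $word) {
        // Strict comparison: the whole word must equal a list entry.
        if (in_array($word, $badwords, true)) {
            return true;
        }
    }
    return false;
}

var_dump(check_badwords_in_array('She is a HO!', array('ho'))); // bool(true)
var_dump(check_badwords_in_array('Who is she?', array('ho')));  // bool(false)
```

Note that str_word_count() treats apostrophes and hyphens as part of a word, so the splitting differs slightly from the \b rule used in the regex answers.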