Profanity filter to replace each letter of bad words with asterisks

I had a working function that took an array of bad words and replaced each occurrence with asterisks.

When I upgraded to PHP 7, I had to switch to preg_replace_callback, because the e modifier for preg_replace was deprecated in PHP 5.5 and removed in PHP 7.0.

This is how I was using it:

function filterwords($text){
    $filterWords = array("dummy");
    $filterCount = sizeof($filterWords);

    for ($i = 0; $i < $filterCount; $i++) {
        $text = preg_replace('/\b' . $filterWords[$i] . '\b/ie', "str_repeat('*', strlen('$0'))", $text);
    }

    return $text;
}

Here is my new code:

echo filterwords("I am a dummy");

function filterwords($text) {
    $filterWords = array("dummy");
    $filterCount = sizeof($filterWords);

    for ($i = 0; $i < $filterCount; $i++) {
        $text = preg_replace_callback('/\b' . $filterWords[$i] . '\b/i',
            function ($matches) {
                return str_repeat('*', strlen('$0'));
            },
            $text
        );
    }

    return $text;
}

This outputs I am a **, but my desired output is I am a ***** (with 5 asterisks instead of 2).

Best answer, by AbraCadaver:

Replacement references such as $0, which preg_replace understands in its replacement string, have no meaning inside a preg_replace_callback callback. The matches are passed to the function as $matches, but you are calling strlen('$0'), which measures the literal two-character string $0, so you always get two asterisks.

Use $matches indexed by the group number instead. As with the backreferences you are used to, index 0 is the full match:

return str_repeat('*', strlen($matches[0]));
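
For context, here is a minimal sketch of the question's function with only that one line changed. The foreach loop is a stylistic substitution for the original for loop, and the hard-coded word list is carried over from the question.

```php
<?php
function filterwords($text)
{
    $filterWords = array("dummy");

    foreach ($filterWords as $word) {
        $text = preg_replace_callback(
            '/\b' . $word . '\b/i',
            function ($matches) {
                // $matches[0] holds the full matched text, e.g. "dummy".
                return str_repeat('*', strlen($matches[0]));
            },
            $text
        );
    }

    return $text;
}

// Inside single quotes, '$0' is only a two-character literal.
var_dump(strlen('$0'));                    // int(2)

echo filterwords("I am a dummy"), PHP_EOL; // I am a *****
```

Note that strlen counts bytes, so this version is only accurate for single-byte (ASCII) words.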

Answer by mickmackusa:

It is also possible to avoid both the loop over the blacklisted words and preg_replace_callback() if you use the \G ("continue") metacharacter. Use a lookahead to find the start of any whole word from your array and match its first word character (a letter, digit, or underscore). Then use \G to match each following contiguous word character, one at a time. Every single-character match is replaced with one asterisk.

Code:

function filterwords(string $text, array $bannedWords)
{
    return preg_replace(
        '/(?=\b(?:' . implode('|', $bannedWords) . ')\b)\w|\G(?!^)\w/iu',
        '*',
        $text
    );
}

echo filterwords("I am a dummy, but I'm not dumb", ['dummy']);
// I am a *****, but I'm not dumb
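
To see why this pattern yields one asterisk per character, it may help to list the individual matches. The following is a small illustrative sketch with the single word dummy hard-coded into the pattern; the sample sentence is made up for the demonstration.

```php
<?php
// Same pattern as above, with "dummy" as the only banned word.
$pattern = '/(?=\b(?:dummy)\b)\w|\G(?!^)\w/iu';

// Each match is exactly one word character. The first branch matches the
// leading "d" of "dummy"; the \G branch then continues from where the
// previous match ended, one character at a time.
preg_match_all($pattern, "a dummy dumb", $m);
print_r($m[0]); // d, u, m, m, y

// "dumb" is untouched: the lookahead fails there, and \G cannot apply
// because no match ended immediately before it.
echo preg_replace($pattern, '*', "a dummy dumb"), PHP_EOL; // a ***** dumb
```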

If this level of regex complexity is too much, you can still implode the list of banned words and write the callback using arrow function syntax (available since PHP 7.4) to reach the same result.

function filterwords(string $text, array $bannedWords)
{
    return preg_replace_callback(
        '/\b(?:' . implode('|', $bannedWords) . ')\b/iu',
        fn($m) => str_repeat('*', mb_strlen($m[0])),
        $text
    );
}
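
One caveat with imploding the words straight into the pattern is that any regex metacharacters they contain will be interpreted as regex syntax. The sketch below is one possible hardening, not part of the original answers: it escapes each word with preg_quote first. The name filterWordsSafe is hypothetical, and the code assumes PHP 7.4+ with the mbstring extension available.

```php
<?php
// Hypothetical helper name; a sketch, not the original author's code.
function filterWordsSafe(string $text, array $bannedWords): string
{
    if ($bannedWords === []) {
        return $text; // Nothing to filter.
    }

    // Escape regex metacharacters and the "/" delimiter in every word.
    $alternation = implode('|', array_map(
        fn(string $word): string => preg_quote($word, '/'),
        $bannedWords
    ));

    // preg_replace_callback returns null on a PCRE error (for example,
    // invalid UTF-8 input with the u flag), so fall back to the input.
    return preg_replace_callback(
        '/\b(?:' . $alternation . ')\b/iu',
        fn(array $m): string => str_repeat('*', mb_strlen($m[0])),
        $text
    ) ?? $text;
}

echo filterWordsSafe("I am a dummy, but I'm not dumb", ['dummy']), PHP_EOL;
// I am a *****, but I'm not dumb

echo filterWordsSafe('That is a.b, not aXb', ['a.b']), PHP_EOL;
// That is ***, not aXb
```

Without the escaping, the dot in a.b would match any character, so aXb would be masked as well.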