PHP Regex exclude comments finding error suppression

I'm trying to write a regex to search a pre-existing code base that seems to abuse the hell out of the PHP error suppression operator (@) on both variable references and function calls. I want to search the entire code base and build a list of all the usages. The problem is that much of the code also includes PHPDoc comments, and I'm not sure how to exclude obvious comments.

Most of the PHPDoc lines seem to be preceded by at least whitespace-asterisk-whitespace, e.g.:

  /**
   * @param int $somvar
   */

So those lines could be matched reasonably consistently with something like /^\s*\*\s+/.

The regex I'm using to find usages of the error suppression character (which also grabs the PHPDoc tags) is:

/(@[\$\w][\w\d]*)/

Its results are satisfactory, save for picking up all the PHPDoc tags.

I tried looking at some examples of negative look-ahead, but nothing I've tried so far avoids those PHPDoc comments. One attempt that doesn't work is:

(?!\s*[\*\/])(@[\$\w][\w\d]*)

Any help is appreciated.

There is 1 best solution below

Instead of a regex, you can use PHP's token_get_all() to find every @ operator. This lets PHP's own tokenizer split the file for you, so an @ inside a comment, docblock, or string literal is never reported:

$source_file = 'source_file_to_open.php';
$source = file_get_contents($source_file);
$tokens = token_get_all($source);

// Single-character tokens such as @ are plain strings with no line number,
// so keep track of the current line using the surrounding array tokens.
$line = 1;
foreach ($tokens as $token) {
    if (is_array($token)) {
        // Array tokens are [id, text, starting line]; step past any
        // newlines contained in the token text.
        $line = $token[2] + substr_count($token[1], "\n");
    } elseif ($token === '@') {
        echo "@ found in $source_file on line: $line<br />\n";
    }
}
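
Since the question asks about the entire code base, here is a minimal sketch of how the same token scan might be wrapped in a function and run over a directory tree. The names find_error_suppressions() and scan_directory() are hypothetical, and the sketch assumes every file of interest has a .php extension:

```php
<?php
// Hypothetical helper (not part of the original answer): returns the line
// numbers on which the @ operator appears in a string of PHP source.
function find_error_suppressions(string $source): array
{
    $lines = [];
    $line = 1;
    foreach (token_get_all($source) as $token) {
        if (is_array($token)) {
            // Array tokens are [id, text, starting line]; step past any
            // newlines contained in the token text.
            $line = $token[2] + substr_count($token[1], "\n");
        } elseif ($token === '@') {
            // The operator arrives as a bare string token. An @ inside a
            // comment or string literal stays inside an array token.
            $lines[] = $line;
        }
    }
    return $lines;
}

// Hypothetical driver: walk a directory tree and collect the usages per file.
// Assumption: only files ending in .php are scanned.
function scan_directory(string $root): array
{
    $report = [];
    $files = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($files as $file) {
        if ($file->isFile() && $file->getExtension() === 'php') {
            $path = $file->getPathname();
            $found = find_error_suppressions(file_get_contents($path));
            if ($found) {
                $report[$path] = $found;
            }
        }
    }
    return $report;
}
```

Calling scan_directory() with the root of the code base should return an array keyed by file path, each entry holding the line numbers where @ is used as an operator; looping over that array gives the list the question asks for.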