How to use nikic/PHP-Parser to extract HTML tags from PHP files


I need a visitor that extracts the HTML tags from a PHP file. At first, I thought I could just take the value of every node of type InlineHTML, like this:

if ($node instanceof Node\Stmt\InlineHTML) {
    self::$result[] = json_encode(utf8_encode($node->value), JSON_UNESCAPED_SLASHES);
}

But I found that only HTML outside of the PHP tags is parsed as an InlineHTML node, like this:

<?php
   echo "aaa";
?>
<input name='test' value=''>

Then I realized that HTML tags can also be output with echo inside PHP code, where they are parsed as String_ nodes:

<?php
    echo "<input name='test' value='aaa'>";
/*
1: Stmt_Echo(
        exprs: array(
            0: Scalar_String(
                value: <input name='test' value='aaa'>
            )
        )
    )
*/

or

<?php
    return '<tr class=tr1><td class=td1 width='.$l.'% align=right>';

/*
0: Stmt_Return(
                expr: Expr_BinaryOp_Concat(
                    left: Expr_BinaryOp_Concat(
                        left: Scalar_String(
                            value: <tr class=tr1><td class=td1 width=
                        )
                        right: Expr_Variable(
                            name: l
                        )
                    )
                    right: Scalar_String(
                        value: % align=right>
                    )
                )
            )
*/

In these cases, checking only for InlineHTML is not enough. For complete tags inside a single String_ node, I can match them with a regular expression. But if a tag is split across a Concat expression, like '<input'.'>', it becomes difficult.

How can I do the HTML tag extraction with a single visitor?
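
To make the Concat case concrete, here is a rough sketch of one possible direction, not a tested solution. It assumes nikic/php-parser 4.18+ or 5.x installed through Composer. HtmlTagVisitor, extractHtmlTags, the 'htmlDone' attribute and the '{?}' placeholder are names made up for this illustration, and the regex is a deliberately naive tag matcher. The idea is to flatten each Concat chain into one string, substitute a placeholder for every non-literal operand, run the regex over the flattened text, and mark the consumed nodes so they are not matched a second time.

```php
<?php
// Sketch only: assumes nikic/php-parser (4.18+ or 5.x) is installed via Composer.
require 'vendor/autoload.php';

use PhpParser\Node;
use PhpParser\Node\Expr\BinaryOp\Concat;
use PhpParser\Node\Scalar\String_;
use PhpParser\Node\Stmt\InlineHTML;
use PhpParser\NodeTraverser;
use PhpParser\NodeVisitorAbstract;
use PhpParser\ParserFactory;

// Hypothetical visitor name; not part of PHP-Parser.
class HtmlTagVisitor extends NodeVisitorAbstract
{
    /** @var string[] tags found so far, in source order */
    public $tags = [];

    public function enterNode(Node $node)
    {
        // Already consumed as part of an enclosing concat chain.
        if ($node->getAttribute('htmlDone')) {
            return null;
        }
        if ($node instanceof InlineHTML || $node instanceof String_) {
            $this->collect($node->value);
        } elseif ($node instanceof Concat) {
            $this->collect($this->flatten($node));
        }
        return null;
    }

    // Join the literal parts of a concat chain into one string.
    private function flatten(Node $node): string
    {
        if ($node instanceof Concat) {
            $node->setAttribute('htmlDone', true);
            return $this->flatten($node->left) . $this->flatten($node->right);
        }
        if ($node instanceof String_) {
            $node->setAttribute('htmlDone', true);
            return $node->value;
        }
        // Variable, call, etc.: the value is unknown, so keep a placeholder.
        return '{?}';
    }

    // Naive matcher: "<" or "</", a letter, then anything up to the next ">".
    private function collect(string $text): void
    {
        if (preg_match_all('/<\/?[a-zA-Z][^<>]*>/', $text, $m)) {
            foreach ($m[0] as $tag) {
                $this->tags[] = $tag;
            }
        }
    }
}

// Hypothetical helper: parse a source string and return the tags found.
function extractHtmlTags(string $code): array
{
    $parser = (new ParserFactory())->createForNewestSupportedVersion();
    $visitor = new HtmlTagVisitor();
    $traverser = new NodeTraverser();
    $traverser->addVisitor($visitor);
    $traverser->traverse($parser->parse($code) ?? []);
    return $visitor->tags;
}
```

With this sketch, '<input'.'>' would be flattened to <input> before matching, and the return statement above would give <tr class=tr1> and <td class=td1 width={?}% align=right>. Double-quoted strings with interpolated variables are separate node types (Encapsed in 4.x, InterpolatedString in 5.x) and are not handled here, and a regex will still be fooled by a ">" inside an attribute value.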
