PHP \DOMDocument converts > to &gt; and & to &amp;


I send HTML to \DOMDocument, and \DOMDocument converts all special characters.

How can I tell \DOMDocument not to convert the special characters between {% ..... %}?

{% if &a > 10 %} is converted to {% if &amp;a &gt; 10 %}

Input

<!DOCTYPE html>
<body>
    {% if &a > 10 %}
        {% print &a %}
    {% end if %}
<img src="{%# image %}" >
<script>
    if a > 10
</script>
</body>

output

<!DOCTYPE html>
<html><body>
    {% if &amp;a &gt; 10 %}
        {% print &amp;a %}
    {% end if %}
<img src="%7B%# image %%7D" >
<script>
    if a > 10
</script></body></html>

code

$dom = new \DOMDocument('1.0', 'UTF-8');
$content = '<!DOCTYPE html><body>
                    {% if &a > 10 %}
                        {% print &a %}
                    {% end if %}
                <img src="{%# image %}" >
                <script>
                    if a > 10
                </script>
            </body>';
@$dom->loadHTML($content);
echo $dom->saveHTML();

Best answer

Before sending the HTML to DOMDocument, encode the template tags; after DOMDocument has finished its work, decode them again.

Encoding code

<?php
$dom = new DomDocument();
$content = '<!DOCTYPE html>
<html><body>
                    {% if &a > 10 %}
                        {% print &a %}
                    {% end if %}
                <img src="{%# image %}"><script>
                    if a > 10
                </script></body></html>';

$tag_start = '(base64';
$tag_end   = ')';
// Encode each {% ... %} tag as a base64 placeholder so DOMDocument leaves it untouched
$pattern = '/({%[^}]+})/ium';
preg_match_all($pattern, $content, $matches);
foreach($matches[0] as $key => $val){
    $base64 = $tag_start.base64_encode($val).$tag_end;
    $content = str_replace($val, $base64, $content);
}

// echo $content;

$dom->loadHTML($content);
$domContent = $dom->saveHTML();

output

<!DOCTYPE html>
<html><body>
                (base64eyUgaWYgJmEgPiAxMCAlfQ==)
                    (base64eyUgcHJpbnQgJmEgJX0=)
                (base64eyUgZW5kIGlmICV9)
            <img src="(base64eyUjIGltYWdlICV9)"><script>
                if a > 10
            </script></body></html>
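
The answer describes a decode step after DOMDocument has finished but only shows the encoding half. Below is a minimal sketch of the full round trip, assuming the same '(base64' and ')' markers as above. The function names encodeTemplateTags and decodeTemplateTags are made up for illustration, and the pattern is a slightly different one that matches up to the closing %}.

```php
<?php
// Markers taken from the answer above; the helper names are hypothetical.
const TAG_START = '(base64';
const TAG_END   = ')';

// Replace every {% ... %} block with a base64 placeholder.
function encodeTemplateTags(string $html): string
{
    return preg_replace_callback(
        '/\{%.*?%\}/s',
        function (array $m): string {
            return TAG_START . base64_encode($m[0]) . TAG_END;
        },
        $html
    );
}

// Turn every placeholder back into the original {% ... %} block.
function decodeTemplateTags(string $html): string
{
    $pattern = '/' . preg_quote(TAG_START, '/')
             . '([A-Za-z0-9+\/=]+)'
             . preg_quote(TAG_END, '/') . '/';

    return preg_replace_callback(
        $pattern,
        function (array $m): string {
            return base64_decode($m[1]);
        },
        $html
    );
}

$content = '<!DOCTYPE html><html><body>'
         . '{% if &a > 10 %}<img src="{%# image %}">{% end if %}'
         . '</body></html>';

libxml_use_internal_errors(true); // keep parser warnings out of the output

$dom = new DOMDocument();
$dom->loadHTML(encodeTemplateTags($content));

// ... work with $dom here ...

$restored = decodeTemplateTags($dom->saveHTML());
echo $restored;
```

One caveat with this sketch: any ordinary page text that happens to look like '(base64...)' would also be decoded, so a less common marker may be safer on real content.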
Another answer

Try using htmlspecialchars:

$dom = new DOMDocument('1.0', 'UTF-8');
$content =  htmlspecialchars('<!DOCTYPE html><body>
                    {% if &a > 10 %}
                        {% print &a %}
                    {% end if %}
                <img src="{%# image %}" >
                <script>
                    if a > 10
                </script>
            </body>');
$dom->loadHTML($content);
echo $dom->saveHTML();

OUTPUT:

<!DOCTYPE html><body> {% if &a > 10 %} {% print &a %} {% end if %} <img > src="{%# image %}" > <script> if a > 10 </script> </body>
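
One thing worth checking before relying on this: htmlspecialchars escapes every < and > in the input, not only the ones inside {% ... %}, so DOMDocument receives the markup as plain text. A small sketch, assuming the goal is still to reach elements such as <img> through the DOM (the variable names are only for illustration):

```php
<?php
$content = '<body><img src="{%# image %}"></body>';

libxml_use_internal_errors(true); // keep parser warnings out of the output

// Parse the markup as-is.
$raw = new DOMDocument();
$raw->loadHTML($content);

// Parse the same markup after htmlspecialchars().
$escaped = new DOMDocument();
$escaped->loadHTML(htmlspecialchars($content));

$rawImgCount     = $raw->getElementsByTagName('img')->length;
$escapedImgCount = $escaped->getElementsByTagName('img')->length;

echo "img elements without htmlspecialchars: $rawImgCount\n";
echo "img elements with htmlspecialchars: $escapedImgCount\n";
```

If the second count is zero, the tags were loaded as text, and the DOM methods have no elements to work on.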