Python: Convert quotes in HTML content, not in HTML tags


I have a piece of HTML like this:

<pre class="script">template("main/GlobalShared");</pre>
<pre class="script">
var link = '/Draft/Tracker_1.1';
if (wiki.pageexists(link)) {
    &lt;div class="version"&gt; web.link(wiki.uri(link), 'Version 1.1') &lt;/div&gt;
}
</pre>

I need to convert it to this, escaping the double quotes in the text content but leaving the ones inside the tags alone:

<pre class="script">template(&quot;main/GlobalShared&quot;);</pre>
<pre class="script">
var link = '/Draft/Tracker_1.1';
if (wiki.pageexists(link)) {
    &lt;div class=&quot;version&quot;&gt; web.link(wiki.uri(link), 'Version 1.1') &lt;/div&gt; 
}
</pre>

I have been fiddling with regular expressions but can't get even close. I suspect they are the wrong tool for this.

Can anyone point me in the right direction, if this is even possible?

Best answer

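Use an HTML parser instead of regular expressions, so that only the text content is touched and the tags are left alone. Note that replacing the quotes in the parsed strings with .replace('"', '&quot;') does not work as is: BeautifulSoup escapes the & again on output, which produces &amp;quot;. Escape the quotes when the document is written back out instead.

BeautifulSoup makes this task easy with a custom output formatter:

from bs4 import BeautifulSoup

def escape_text(text):
    # Escape &, < and > as usual, plus double quotes.
    text = text.replace('&', '&amp;').replace('<', '&lt;').replace('>', '&gt;')
    return text.replace('"', '&quot;')

soup = BeautifulSoup(htmlsource, 'html.parser')

# The formatter is called for every string and attribute value on output.
htmlsource = soup.decode(formatter=escape_text)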
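
For comparison, here is a rough sketch of the same idea using only the standard library, with html.parser.HTMLParser swapped in for BeautifulSoup. It copies tags and entities through as written and escapes double quotes only in text data. The names QuoteEscaper and escape_text_quotes are made up for this example, and the sample is the snippet from the question.

```python
from html.parser import HTMLParser


class QuoteEscaper(HTMLParser):
    """Copy HTML through unchanged, escaping double quotes in text only."""

    def __init__(self):
        # Keep entities such as &lt; as separate events instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Emit the tag exactly as written, attribute quotes included.
        self.parts.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        # Text between tags and entities: the only place quotes are escaped.
        self.parts.append(data.replace('"', "&quot;"))

    def handle_entityref(self, name):
        self.parts.append(f"&{name};")

    def handle_charref(self, name):
        self.parts.append(f"&#{name};")

    def handle_comment(self, data):
        self.parts.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.parts.append(f"<!{decl}>")


def escape_text_quotes(htmlsource):
    """Return htmlsource with double quotes in text content escaped."""
    parser = QuoteEscaper()
    parser.feed(htmlsource)
    parser.close()
    return "".join(parser.parts)


sample = (
    '<pre class="script">template("main/GlobalShared");</pre>\n'
    '<pre class="script">\n'
    "var link = '/Draft/Tracker_1.1';\n"
    "if (wiki.pageexists(link)) {\n"
    "    &lt;div class=\"version\"&gt; web.link(wiki.uri(link), 'Version 1.1') &lt;/div&gt;\n"
    "}\n"
    "</pre>"
)

result = escape_text_quotes(sample)
print(result)
```

This is only a sketch: the contents of script and style elements are treated as text too, so quotes inside them would also be escaped, and an entity written without its trailing semicolon gets one added.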