Python: Replacing Text Split Across HTML Tags

60 Views Asked by At

I'm working on a Python function to search for and replace a string within an HTML document, where the string might be broken up by HTML tags. I need a solution that accurately handles these cases without disrupting the HTML structure.

For instance, I want to replace "hello world" with "hi there" in HTML snippets like:

<p>This is some HTML with hello <strong>world</strong> in it.</p>

Expected output:

<p>This is some HTML with hi <strong>there</strong> in it.</p>

And more complex cases like:

<p>This is some HTML with hello <span class="someClass">w</span>orld in it.</p>

Expected output (flexible with the placement of span):

<p>This is some HTML with hi <span class="someClass">there</span> in it.</p>

I've explored using BeautifulSoup and regular expressions, but so far I haven't successfully handled text split across different tags. I came across a JavaScript library findAndReplaceDOMText (https://github.com/padolsey/findAndReplaceDOMText) that achieves something similar to what I need. Is there a Python equivalent, or a way to combine existing Python tools to accomplish this?

0

There are 0 best solutions below