I'm working on a Python function to search for and replace a string within an HTML document, where the string might be broken up by HTML tags. I need a solution that accurately handles these cases without disrupting the HTML structure.
For instance, I want to replace "hello world" with "hi there" in HTML snippets like:
<p>This is some HTML with hello <strong>world</strong> in it.</p>
Expected output:
<p>This is some HTML with hi <strong>there</strong> in it.</p>
And more complex cases like:
<p>This is some HTML with hello <span class="someClass">w</span>orld in it.</p>
Expected output (flexible with the placement of span):
<p>This is some HTML with hi <span class="someClass">there</span> in it.</p>
I've explored using BeautifulSoup and regular expressions, but so far I haven't successfully handled text split across different tags. I came across a JavaScript library findAndReplaceDOMText (https://github.com/padolsey/findAndReplaceDOMText) that achieves something similar to what I need. Is there a Python equivalent, or a way to combine existing Python tools to accomplish this?