I have a string containing HTML text; let's call it S.
S = "<b>this is a sentence. and this is one more sentence</b>"
I want to convert S into the following text:
S = <b>This is a sentence. And this is one more sentence</b>
The problem is that my function can convert plain text to sentence case, but when the text contains HTML there is no way to tell the function which parts are text and which parts are markup that should be left alone. So when I pass S to my function, it yields the following incorrect result:
S = <b>this is a sentence. And this is one more sentence</b>
This happens because the function treats '<' as the first character of the sentence and tries to convert it to uppercase, which leaves it unchanged as '<'.
My question is: how can I convert text to sentence case in Python when the text already contains HTML markup? I don't want to lose the HTML formatting.
An overly simplistic approach would be
Obviously, this will not cover every situation, but it will work if the task is constrained well enough. This approach should be adaptable to the function you already have. Essentially, I believe you need to parse the HTML with a parser and then manipulate only the text value of each HTML node.
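As a minimal sketch of the parser idea, assuming the standard-library `html.parser` module is acceptable: the class below copies tags through unchanged and capitalizes only the text between them. The names `SentenceCaseHTML` and `sentence_case_html` are made up for illustration, and the sentence rule (capitalize the first letter after `.`, `!` or `?`) is a stand-in for whatever your own function does. Comments and doctype declarations are not handled here and would be dropped.

```python
from html.parser import HTMLParser


class SentenceCaseHTML(HTMLParser):
    """Rebuilds the input HTML, sentence-casing only the text nodes."""

    def __init__(self):
        # Keep entities such as &amp; as written instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.parts = []
        self.at_sentence_start = True  # state carries across tags

    def handle_starttag(self, tag, attrs):
        # Emit the tag exactly as it appeared in the source.
        self.parts.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are also copied verbatim.
        self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        self.parts.append(f"</{tag}>")

    def handle_entityref(self, name):
        self.parts.append(f"&{name};")

    def handle_charref(self, name):
        self.parts.append(f"&#{name};")

    def handle_data(self, data):
        # Simplified rule (an assumption): the first letter after
        # '.', '!' or '?' starts a new sentence.
        out = []
        for ch in data:
            if self.at_sentence_start and ch.isalpha():
                ch = ch.upper()
                self.at_sentence_start = False
            elif ch in ".!?":
                self.at_sentence_start = True
            out.append(ch)
        self.parts.append("".join(out))


def sentence_case_html(html):
    parser = SentenceCaseHTML()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


S = "<b>this is a sentence. and this is one more sentence</b>"
print(sentence_case_html(S))
```

Because the sentence-start flag lives on the parser rather than on each text node, a sentence that begins inside one tag and continues in another is still treated as a single sentence.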
If you are reluctant to use a parser, use a regex. This is likely much more fragile, so you must constrain your inputs much more. Something like this as a start:
You can then just check whether each item in the split string starts with < and ends with >.
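A rough sketch of the split-and-check idea, assuming no tag contains a literal `>` inside an attribute value and that the text has no comments or scripts. The pattern `TAG_SPLIT` and the function `sentence_case_html_regex` are hypothetical names, and the capitalization rule is again a simplified placeholder for your own function.

```python
import re

# The capturing group makes re.split keep the tags in the result list.
TAG_SPLIT = re.compile(r"(<[^>]+>)")


def sentence_case_html_regex(html):
    at_sentence_start = True
    out = []
    for piece in TAG_SPLIT.split(html):
        # Pieces that look like tags are passed through untouched.
        if piece.startswith("<") and piece.endswith(">"):
            out.append(piece)
            continue
        chars = []
        for ch in piece:
            if at_sentence_start and ch.isalpha():
                ch = ch.upper()
                at_sentence_start = False
            elif ch in ".!?":
                at_sentence_start = True
            chars.append(ch)
        out.append("".join(chars))
    return "".join(out)


S = "<b>this is a sentence. and this is one more sentence</b>"
print(sentence_case_html_regex(S))
```

This works for simple inline markup like the example, but it will misfire on input such as `<a title="a > b">`, which is why the parser route is the safer default.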