After web scraping, I get the following:
[<p>xxx<p>, <p>1.apple</p>, <p>aaa</p>, <p>xxxxx</p>, <p>xxxxx</p>, <p>2.orange</p>, <p>aaa</p>, <p>xxxxx</p>,<p>3.banana</p>, <p>aaa</p>, <p>xxxxx</p>]
In the list, the "xxxx" entries are values I don't need. The pattern I can see is that each result I want sits between two substrings: Substring1 = "<p>1" / "<p>2" / "<p>3"; Substring2 = "</p>, <p>aaa".
Assume this pattern repeats hundreds of times. How do I get the result in Python? Many thanks!
My target result is:
apple
orange
banana
I have tried using split and slicing with [sub1:sub2], but neither works.
From what I infer from your question (assuming the words you're looking for always follow a marker of the form
<p>number.), a regex would do the job.
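Here is a minimal sketch of that regex approach. It assumes the scraped result has been converted to a single string (for example with str(result)) and that every wanted item looks like <p>N.word</p>, where N is one or more digits. The names extract_items and ITEM_PATTERN are made up for illustration.

```python
import re

# Assumption: each wanted item has the form <p>N.word</p>, N being digits.
# \d+\.   matches the number and the dot after it
# \s*     tolerates optional whitespace after the dot
# (.*?)   lazily captures the word up to the nearest closing </p>
ITEM_PATTERN = re.compile(r"<p>\d+\.\s*(.*?)</p>")


def extract_items(text):
    """Return the words that follow a '<p>number.' marker, in order."""
    return ITEM_PATTERN.findall(text)


# The sample from the question, as one string (hypothetical str(result)).
scraped = (
    "[<p>xxx<p>, <p>1.apple</p>, <p>aaa</p>, <p>xxxxx</p>, <p>xxxxx</p>, "
    "<p>2.orange</p>, <p>aaa</p>, <p>xxxxx</p>,<p>3.banana</p>, "
    "<p>aaa</p>, <p>xxxxx</p>]"
)

items = extract_items(scraped)
for item in items:
    print(item)
# apple
# orange
# banana
```

Because the pattern keys on the number rather than on the "aaa" element that follows, it keeps working if the text after each item changes. If the useless values can themselves start with a digit and a dot, the pattern would need to be tightened.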