Python regex to match any valid English sentence

612 Views Asked by swap310 At 27 July 2025 at 23:47

I was wondering whether it is possible to write a Python regex that matches any valid English sentence, including alphanumeric and special characters.
Basically, I want to extract some specific elements from an XML file. These elements have the following form:

<p o=<Any Number>> <Any English sentence> </p>

For example:

<p o ="1"> The quick brown fox jumps over the lazy dog </p>

<p o ="2">  And This is a number 12.90! </p>

We can easily write regex for

<p o=<Any Number>>

and </p> tags. But I am interested in extracting the sentences between these tags with a regex group.

Can anyone please suggest a regex for the problem above?

Also, if you can suggest a workaround approach, then it will be really helpful to me as well.


There are 2 best solutions below

Kien Truong On 25 May 2012 at 11:06 BEST ANSWER

Use an XML parser such as lxml; regex is not suitable for this task. Example:

import lxml.etree

# First, parse the XML string into an element
doc = lxml.etree.fromstring('<p o ="2">  And This is a number 12.90! </p>')
# Then use XPath to extract the text inside <p> (xpath() returns a list of strings)
sentences = doc.xpath('/p/text()')
print(sentences)

You can read more about XPath at: XPath tutorial.
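If lxml is not available, the same idea can be sketched with the standard library's xml.etree.ElementTree. This is a minimal sketch, not the answerer's code: the <doc> wrapper element, the SAMPLE string, and the extract_sentences helper are assumptions made for illustration, since the question does not show the surrounding file structure.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document. The <doc> wrapper is an assumption:
# XML needs a single root element around the <p> elements.
SAMPLE = """<doc>
<p o ="1"> The quick brown fox jumps over the lazy dog </p>
<p o ="2">  And This is a number 12.90! </p>
<q>not a paragraph</q>
</doc>"""


def extract_sentences(xml_text):
    """Return (o, sentence) pairs for every <p> whose o attribute is a number."""
    root = ET.fromstring(xml_text)
    pairs = []
    # iter("p") walks the whole tree, so nested <p> elements are found too.
    for p in root.iter("p"):
        o = p.get("o")
        if o is not None and o.strip().isdigit():
            # p.text is None for an empty element, hence the fallback to "".
            pairs.append((int(o), (p.text or "").strip()))
    return pairs


if __name__ == "__main__":
    for number, sentence in extract_sentences(SAMPLE):
        print(number, sentence)
```

The parser takes care of the tags and attributes, so no pattern for "any English sentence" is needed: whatever text sits inside the matching element is returned as-is.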

Pete On 25 May 2012 at 11:08

You really should use an XML parser. There is an example here: http://www.travisglines.com/web-coding/python-xml-parser-tutorial.
