re.findall giving different results to re.compile.regex

70 Views Asked by Will1v At 20 February 2024 at 06:51

Why does re.compile(...).findall() not find "um" when "um" is at the beginning of the string? (It works fine if "um" isn't at the beginning of the string, as per the last 2 lines below.)

>>> import re
>>> s = "um"
>>> re.findall(r"\bum\b", s, re.IGNORECASE)
['um']
>>> re.compile(r"\bum\b").findall(s, re.IGNORECASE)
[]
>>> re.compile(r"\bum\b").findall(s + " foobar", re.IGNORECASE)
[]
>>> re.compile(r"\bum\b").findall("foobar " + s, re.IGNORECASE)
['um']

I would have expected the two options to be identical. What am I missing?

Tim Peters On 20 February 2024 at 07:11 BEST ANSWER

You intended to pass re.IGNORECASE to the compile() function, but in the failing cases you're actually passing it to the findall() method. There it's interpreted as the optional pos argument, an integer giving the position at which the search starts. Its value as an integer isn't defined, but happens to be 2 today:

>>> int(re.IGNORECASE)
2
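
To make the mix-up concrete, here is a minimal sketch (the variable names are only illustrative) comparing the flag passed positionally with an explicit integer start position. It assumes the flag's integer value is whatever int() reports on your Python version, rather than hard-coding 2:

```python
import re

pattern = re.compile(r"\bum\b")
text = "um foobar"

# The second positional argument of a compiled pattern's findall() is pos,
# not flags, so the flag is used as an integer start index.
flag_as_pos = pattern.findall(text, re.IGNORECASE)
explicit_pos = pattern.findall(text, int(re.IGNORECASE))

print(flag_as_pos == explicit_pos)  # both searches start at the same index
print(pattern.findall(text))        # no pos given: search starts at index 0
```

Both calls skip the first two characters, which is why the leading "um" is never seen.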

Rewrite the code to work as intended, and it's fine; for example:

>>> re.compile(r"\bum\b", re.IGNORECASE).findall(s + " foobar") # pass to compile()
['um']

As originally written, it can't work unless "um" starts at or after position 2:

>>> re.compile(r"\bum\b").findall(" " + s, re.IGNORECASE)
[]
>>> re.compile(r"\bum\b").findall("  " + s, re.IGNORECASE) # starts at 2
['um']
