Why does re.compile.findall not find "um" if "um" is at the beginning of the string? (It works fine if "um" isn't at the beginning of the string, as per the last 2 lines below.)
>>> import re
>>> s = "um"
>>> re.findall(r"\bum\b", s, re.IGNORECASE)
['um']
>>> re.compile(r"\bum\b").findall(s, re.IGNORECASE)
[]
>>> re.compile(r"\bum\b").findall(s + " foobar", re.IGNORECASE)
[]
>>> re.compile(r"\bum\b").findall("foobar " + s, re.IGNORECASE)
['um']
I would have expected the two options to be identical. What am I missing?
You intended to pass re.IGNORECASE to the compile() function, but in the failing cases you're actually passing it to the findall() method. There it's interpreted as an integer giving the starting position (the pos argument) for the search to begin. Its value as an integer isn't defined, but happens to be 2 today.

Rewrite the code to work as intended, and it's fine: pass the flag to compile() rather than to findall().
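As a minimal sketch of both points (the variable names are only illustrative, and the value 2 is what current CPython happens to use, not a documented guarantee), the following shows the flag's integer value and the flag passed where it belongs:

```python
import re

s = "um"

# The flag converts to an int; the specific value is an implementation
# detail (2 in current CPython), so don't rely on it.
flag_as_int = int(re.IGNORECASE)

# Flag passed to compile(): matching is case-insensitive and the search
# starts at the beginning of the string.
pattern = re.compile(r"\bum\b", re.IGNORECASE)
fixed_start = pattern.findall(s + " foobar")
fixed_upper = pattern.findall("UM foobar")

# An inline flag in the pattern itself is another option; it cannot end up
# in the wrong argument slot.
inline = re.compile(r"(?i)\bum\b").findall("Um, foobar")

print(flag_as_int, fixed_start, fixed_upper, inline)
```

The module-level re.findall(pattern, string, flags) takes flags as its third argument, which is why the first call in the question behaves as expected; the compiled pattern's findall(string, pos, endpos) has no flags parameter at all.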
As originally written, it can't work unless "um" starts at or after position 2, because that is where findall() begins scanning.
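The effect of the misplaced argument can be sketched as follows, assuming the flag's integer value is 2 as it is in current CPython (the test strings are made up for illustration):

```python
import re

pattern = re.compile(r"\bum\b")   # no flags: case-sensitive
pos = int(re.IGNORECASE)          # 2 today; an implementation detail

# findall(string, pos) only reports matches that begin at index pos or later.
at_0 = pattern.findall("um", pos)            # "um" begins at 0: skipped
at_0_more = pattern.findall("um foobar", pos)
at_1 = pattern.findall(" um", pos)           # "um" begins at 1: skipped
at_2 = pattern.findall("  um", pos)          # "um" begins at 2: found
at_7 = pattern.findall("foobar um", pos)     # "um" begins at 7: found

# Even when the position is fine, the flag was never applied as a flag,
# so the match is still case-sensitive.
upper = pattern.findall("foobar UM", pos)

print(at_0, at_0_more, at_1, at_2, at_7, upper)
```

So passing re.IGNORECASE to the compiled pattern's findall() both shifts the start of the search and leaves case sensitivity unchanged.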