How to get this regex to include overlapping occurrences, but not too many?


I want to get all the occurrences of the pattern '[number]', including their context, but I can't.

Here is my code:

import re
text = 'some crap [00][0] some more'
regex = r'\[[0-9]*\]'
regex = '.{0,10}' + regex + '.{0,10}'
occurrences = re.findall(regex, text)
for occ in occurrences:
    print(occ)

What is actually wrong!?

My code works just as I wish in every case except when there are two [number] blocks with fewer than 10 characters between them; then it gives me one result while I'm looking for two. If I set the regex to include overlapping occurrences, it gives me results for every possible context length. I can't fix the context length at exactly 10 because I want to include the occurrences at the beginning and end of the string.

What I actually want:

I'd prefer a pure regex solution that gets me all the occurrences of the pattern, including their context.

If that's really impossible, I'd be fine with a solution that uses the match positions and selects a range from the string.

Best answer:

Read about non-capturing groups and negative lookahead.
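
As a rough illustration of those two constructs (the sample strings and the names `block` and `not_block_char` are made up for this demo, not taken from the question): `(?:...)` groups without capturing, and `(?!...)` succeeds only where the enclosed pattern does not match at the current position. Combined as `(?:(?!X).)`, they consume one character at a time and stop just before the next `X`.

```python
import re

block = r'\[[0-9]*\]'                      # the [number] pattern from the question
not_block_char = '(?:(?!' + block + ').)'  # one character that does not start a [number] block

# Hypothetical sample: context, a block, more context.
sample = 'ab[1]cd'

# Repeating the guarded character stops right before the block.
leading = re.match(not_block_char + '*', sample).group()
print(leading)

# Because the groups are non-capturing, re.findall returns whole matches
# instead of the contents of a group.
matches = re.findall(not_block_char + '{0,10}' + block, 'ab[1]cd[2]')
print(matches)
```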

To fix your issue, just change the fourth line to:

regex = '(?:(?!' + regex + ').){0,10}' + regex + '(?:(?!' + regex + ').){0,10}'
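
Put together, the change can be tried with the sketch below. It uses the text from the question; the names `block`, `guarded`, and `context_by_position` are introduced here for illustration only. The second half is a position-based fallback of the kind the question mentions. It is not part of the answer, and it assumes you don't mind neighbouring blocks showing up inside each other's context.

```python
import re

text = 'some crap [00][0] some more'
block = r'\[[0-9]*\]'

# The answer's pattern: context characters may not start another [number] block.
guarded = '(?:(?!' + block + ').)'
regex = guarded + '{0,10}' + block + guarded + '{0,10}'
occurrences = re.findall(regex, text)
for occ in occurrences:
    print(occ)


# Fallback (an assumption, not part of the answer): locate each block with
# re.finditer and slice up to `width` characters on either side.
def context_by_position(text, pattern, width=10):
    results = []
    for m in re.finditer(pattern, text):
        start = max(m.start() - width, 0)      # clamp at the start of the string
        end = min(m.end() + width, len(text))  # clamp at the end of the string
        results.append(text[start:end])
    return results


sliced = context_by_position(text, block)
for occ in sliced:
    print(occ)
```

With the guarded pattern, each character belongs to at most one match, so when two blocks are separated by only a few characters, those characters go to the trailing context of the first block and the second block gets no leading context. The position-based version returns a full window for every block instead.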