Using a backreference to a group captured in a positive lookbehind inside a positive lookahead in regex?

29 Views Asked by Polishko At 23 March 2023 at 07:44

I would like to match text that is surrounded by one of two allowed symbols (let's say & and #). Whichever of the two symbols appears before the text must also follow it; mixing the two is not allowed (e.g. &Time& and #Time# are valid but &Time# is not). I would like to do this with a lookbehind and a lookahead, capturing the first symbol in a group. But when I try this, the lookbehind and lookahead parts are also included in the match. Is it possible to extract just the text using a lookbehind and a lookahead with a backreference?

r"(?<=(&|#))([A-Za-z]+)(?=(\1))" matches the whole string &Hawai&#Rome# instead of extracting just Hawai and Rome.

There is 1 best solution below

JvdV On 23 March 2023 at 08:00

In your current pattern you are using a 3rd, unnecessary capture group: the backreference \1 can sit directly inside the lookahead. You could use (?<=([&#]))([A-Za-z]+)(?=\1).
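To make the lookaround variant concrete, here is a minimal sketch. It assumes the first symbol is captured inside the lookbehind as ([&#]) so that \1 refers to it; the variable names are only illustrative. It shows why findall() appears to include the symbols, while the actual match (group 0) does not:

```python
import re

s = '&Hawai&#Rome#'

# Group 1 is captured inside the lookbehind and reused by \1 in the lookahead.
# Group 2 is the text between the two symbols.
pattern = re.compile(r'(?<=([&#]))([A-Za-z]+)(?=\1)')

# With more than one capture group, findall() returns tuples of the groups,
# so the symbol shows up even though the lookbehind consumed no characters.
pairs = pattern.findall(s)

# finditer() gives match objects; group(0) is the matched text only.
words = [m.group(0) for m in pattern.finditer(s)]

print(pairs)   # [('&', 'Hawai'), ('#', 'Rome')]
print(words)   # ['Hawai', 'Rome']
```

So the lookarounds themselves are not part of the match; it is findall() reporting the capture groups that makes it look that way.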

However, since re.findall() in Python returns all capture groups rather than the whole match, I think you might as well drop the lookarounds and pick out the 2nd capture group in a list comprehension, using:

([&#])([A-Za-z]+)\1

In code:

import re
s = '&Hawai&#Rome#'
l = [x[1] for x in re.findall(r'([&#])([A-Za-z]+)\1', s)]
print(l)

Prints:

['Hawai', 'Rome']
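As a quick check against the requirement in the question (mixed symbols such as &Time# must be rejected), the same pattern can be wrapped in a small helper. This is only a sketch; the names PAIRED and extract are hypothetical and not part of the original answer:

```python
import re

# Same pattern as above: \1 forces the closing symbol to equal the opening one.
PAIRED = re.compile(r'([&#])([A-Za-z]+)\1')

def extract(text):
    """Return the words wrapped in a matching pair of & or # symbols."""
    return [m.group(2) for m in PAIRED.finditer(text)]

print(extract('&Time&'))   # ['Time']
print(extract('#Time#'))   # ['Time']
print(extract('&Time#'))   # []
```

The mixed case returns an empty list because \1 must repeat whichever symbol group 1 actually matched.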
