
Expression steals from previously possessively matched characters


When using PCRE2 on regex101, group 1 in \G(.*?(?:<[^>]*>|&[^;]*;)*+)*?(micro), applied to the string "<test><micro>", first matches the entire string (as expected), but (micro) then goes back and steals from the previously possessively matched content. Is this expected behavior? If so, why does it happen, and how do I avoid it?

What I tried:

  • Putting the possessive modifier in different places (the .* at the start needs to be able to surrender characters).
  • Changing the regex to \G(.*?(?:<[^>]*>|&[^;]*;)*+)*?(micro)?, so that (micro) does not need to be matched. This succeeded in the test case above, but when run against the string "micro", it failed to capture the string in group 2.

What I expect:

The regex should match everything up to a "micro" that is neither inside <> nor inside &; and capture it in group 1, then capture the "micro" in group 2.
