How to limit the search scope without regex lookbehinds?


Given a regular expression, I can easily choose where in a string to start looking for a match by using lastIndex.
Now, I want to make sure that the match I get doesn't go past a certain point in the string.
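For reference, a minimal sketch of the starting-point half of the problem. The pattern and input string here are made up for illustration; the only thing it relies on is that lastIndex is honoured when the regex has the g (or y) flag.

```javascript
// Hypothetical pattern and input, for illustration only.
const re = /a+/g; // lastIndex is ignored unless the g or y flag is set
re.lastIndex = 3; // start searching at index 3

const m = re.exec("aabaaXaa");
console.log(m.index, m[0]); // 3 "aa" (the "aa" at index 0 is skipped)
```

There is no built-in counterpart of lastIndex for the end of the search, which is what the rest of the question is about.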

I would happily enclose the regular expression in a non-capturing group and append, for instance, (?<=^.{0,8}).

But how can I achieve the same goal without lookbehinds, which still aren't supported everywhere?

Note:

  • While it might be the only reasonable fallback, slicing the string is not a good option as it results in a loss of context for the search.

Example

https://regex101.com/r/7bWtSW/1

with the base regular expression that:

  • matches the letter 'a', at least once and as many times as possible
  • as long as an 'X' comes later

We can see that the lookbehind achieves our goal: we still get a match, only a shorter one.
However, if we sliced the string, we would lose the match (because the lookahead in the base regular expression would fail).
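The difference can be sketched in a few lines. The input string below is an assumption (the exact string from the linked demo is not reproduced here), and the limit of 4 is just an example value. Note that the first half needs an engine with lookbehind support.

```javascript
// Hypothetical input, chosen so that the X lies beyond the limit.
const input = "aaaaaaaaX";

// Base pattern in a non-capturing group, followed by the lookbehind limit:
// the match must end within the first 4 characters of the string.
const limited = /(?:a+(?=.*X))(?<=^.{0,4})/;
const withLookbehind = input.match(limited);
console.log(withLookbehind && withLookbehind[0]); // "aaaa"

// Slicing first removes the X, so the lookahead can no longer succeed.
const sliced = input.slice(0, 4).match(/a+(?=.*X)/);
console.log(sliced); // null
```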


There are 2 best solutions below

1
The fourth bird On

Your pattern in the regex demo, (?:a+(?=.*X))(?<=^.{0,4}), uses a lookbehind assertion, which can yield multiple separate matches.

See a regex demo for the same pattern with multiple matches in the same string

Without using a lookbehind, you cannot get those separate matches with a single pattern.

What you might do instead is use an extra step: first match the part of the string that fulfills the length restriction (in this case the group 1 value), then get all the runs of consecutive a chars from that matched part.

^([^\nX]{0,3}a)[^\nX]*X

The pattern matches

  • ^ Start of string
  • ( Capture group 1
    • [^\nX]{0,3}a Match 0-3 times a char other than a newline or X and then match a
  • ) Close group 1
  • [^\nX]*X Match optional chars other than a newline or X and then match X

Regex demo

const regex = /^([^\nX]{0,3}a)[^\nX]*X/;
[
  "aaaaaaaaX",
  "baaaaaaaaX",
  "bbaaaaaaaaX",
  "bbbaaaaaaaaX",
  "bbbbaaaaaaaaX",
  "babaaaaaaaaX",
  "aX",
  "abaaX"
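].forEach(s => {
  // Step 1: group 1 holds the part that satisfies the length restriction
  const m = s.match(regex);
  if (m) {
    // Step 2: collect every run of consecutive a chars inside group 1
    console.log(m[1].match(/a+/g));
  }
});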

5
Brother58697 On

Slice the match instead of slicing the string.

In your example, you want the match to account for the positive lookahead for X, but X is outside the limited scope. So we don't want to limit the search scope, which would essentially mean slicing the string. Instead, we want to limit the match length relative to its position in the string.

To do that we'll use the index property of the returned match array.

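const string = 'aaaaaaaX'
const regex = /a+(?=X)/

// Returns the part of the first match that lies before index `endIndex`
// ('' if the match starts at or after it), or null if there is no match.
// The regex must not have the g flag, otherwise match() has no index property.
function limitedMatch(string, regex, endIndex) {
  const match = string.match(regex)
  if (!match) return null
  const {index} = match
  const matchLength = Math.max(endIndex - index, 0)
  return match[0].slice(0, matchLength)
}

console.log(limitedMatch(string, regex, 4)) // "aaaa"
console.log(limitedMatch(string, regex, 2)) // "aa"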
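The same idea can be combined with the lastIndex starting point from the question. This is only a sketch: matchWithin, start and end are names made up here, and unlike limitedMatch it returns null when the match begins at or past the limit. One caveat applies to any match-slicing approach: it only agrees with the lookbehind version when every prefix of the match is itself a valid match, which holds for a+ but not for patterns in general.

```javascript
// Hypothetical helper: search from `start`, then cut the match off at `end`.
function matchWithin(regex, string, start, end) {
  // Work on a copy with the g flag so that lastIndex is honoured
  // and the caller's regex is left untouched.
  const flags = regex.flags.includes("g") ? regex.flags : regex.flags + "g";
  const re = new RegExp(regex.source, flags);
  re.lastIndex = start;

  const match = re.exec(string);
  // No match at all, or the match begins at or past the limit.
  if (!match || match.index >= end) return null;

  return {
    index: match.index,
    text: match[0].slice(0, end - match.index),
  };
}

// The lookahead still sees the X at index 8, but the result stops at index 4.
const result = matchWithin(/a+(?=.*X)/, "bbaaaaaaX", 1, 4);
console.log(result); // { index: 2, text: "aa" }
```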