Why is no PatternSyntaxException thrown for my invalid pattern?


I want to test whether a regexp is valid in Java 1.8.0_241:

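import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Returns true if the regexp compiles, false if Pattern rejects it.
public static boolean isRegExpValid(String regExp) {
    try {
        Pattern.compile(regExp);
        return true;
    } catch (PatternSyntaxException e) {
        return false;
    }
}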

Here I am testing a correct regexp for a three-digit number, and an incorrect regexp.

@Test
public void testValidRegexp() {
    assertTrue(isRegExpValid("\\d{3}"));
}

@Test
public void testInvalidRegexp() {
    assertFalse(isRegExpValid("{3}"));
}

Why is my second test, testInvalidRegexp, failing? isRegExpValid("{3}") should return false, but it returns true.

In JavaScript, {3} correctly fails with a "Nothing to repeat" error.


There are 2 best solutions below

Best answer

It appears to successfully match the empty string:

jshell> Pattern.matches("{3}x", "x")
$1 ==> true

The documentation doesn't appear to define this as valid usage, but the pattern is accepted, whereas other minor syntax problems are detected and reported through Pattern#error as a PatternSyntaxException.

To analyse why the implementation accepts this, you'd want to figure out where Pattern#closure can be called from while parsing a pattern.
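As a rough way to see where the parser draws the line, the sketch below compiles a handful of patterns and prints whether each is accepted. The class name DanglingQuantifierDemo and the helper compileError are made up for this illustration; only Pattern.compile and PatternSyntaxException#getDescription come from the JDK. It assumes a JDK whose java.util.regex behaves like the 1.8.0_241 build in the question, where a leading {3} compiles even though a leading *, + or ? does not.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class DanglingQuantifierDemo {

    // Hypothetical helper: returns the parser's error description,
    // or null if the pattern compiles.
    static String compileError(String regex) {
        try {
            Pattern.compile(regex);
            return null;
        } catch (PatternSyntaxException e) {
            return e.getDescription();
        }
    }

    public static void main(String[] args) {
        // "\\d{3}" is the valid pattern from the question; the rest are
        // quantifiers (or an unfinished one) with nothing in front of them.
        String[] candidates = { "\\d{3}", "{3}", "*", "+", "?", "{" };
        for (String regex : candidates) {
            String error = compileError(regex);
            System.out.println(regex + " -> "
                    + (error == null ? "compiles" : "rejected: " + error));
        }
    }
}
```

On such a JDK, only the last four candidates should be reported as rejected, which suggests the check for a dangling quantifier covers *, + and ? but not a well-formed {n}.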

Second answer

{3} is a valid regex in Java, but it only ever matches the empty string.

Here is a code demo:

jshell> Matcher m = Pattern.compile("{3}").matcher("A {3} here");
m ==> java.util.regex.Matcher[pattern={3} region=0,10 lastmatch=]

jshell> while (m.find()) { System.out.println( "<" + m.group() + "> | " + m.start() + " | " + m.end()); }

<> | 0 | 0
<> | 1 | 1
<> | 2 | 2
<> | 3 | 3
<> | 4 | 4
<> | 5 | 5
<> | 6 | 6
<> | 7 | 7
<> | 8 | 8
<> | 9 | 9
<> | 10 | 10

The Pattern Javadoc defines the greedy quantifier as:

X{n} X, exactly n times

Here X is just the empty string, so the pattern means "the empty string, exactly 3 times", which is still the empty string.
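To make that reading concrete, here is a small sketch comparing {3} with an explicitly empty group repeated three times, and with the escaped form that matches the literal text. The class EmptyRepeatDemo and the helper countMatches are invented for this example, and it assumes the same Java behaviour shown in the jshell session above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EmptyRepeatDemo {

    // Hypothetical helper: counts how many times the regex is found in the input.
    static int countMatches(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String input = "A {3} here"; // 10 characters, so 11 possible empty matches

        // Bare quantifier: an empty match at every position, as in the demo above.
        System.out.println(countMatches("{3}", input));      // 11

        // An explicitly empty non-capturing group repeated 3 times behaves the same.
        System.out.println(countMatches("(?:){3}", input));  // 11

        // Escaping the braces matches the literal text "{3}" instead.
        System.out.println(countMatches("\\{3\\}", input));  // 1
    }
}
```

If the goal is to match the characters {3} themselves, escaping the braces (or using Pattern.quote) is the usual way to say so.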