Java regex: [^(\\s?-\\s?)]. PatternSyntaxException on 's'?


I just started learning Java regex today, so excuse me for taking a rather lazy approach to finding out what's wrong with my regex. Basically, I'm trying to split a string on a 'white space', -, 'white space' pattern (where each white space is either one or none), and I'm not sure why the pattern isn't compiling. I get an error on the second s in: [^(\s?-\s?)] (index 8). If someone could help me out, I'd really appreciate it!


There are 2 best solutions below

BEST ANSWER

You have placed the pattern you want to split on inside a negated character class, which does the exact opposite of what you expect.

[^(\s?-\s?)]  # matches any single character except:
              # '('
              # whitespace (" ", \t, \n, \x0B, \f, \r)
              # '?'
              # '-'
              # whitespace again (redundant)
              # '?' again (redundant)
              # ')'
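As a rough illustration of the breakdown above, the sketch below checks single characters against the class. It assumes the hyphen is escaped (\-) so that the class compiles at all; the class name NegatedClassDemo and the helper accepts are made up for this example.

```java
import java.util.regex.Pattern;

class NegatedClassDemo {
    // The question's class, with the hyphen escaped so that it compiles.
    static final Pattern NEGATED = Pattern.compile("[^(\\s?\\-\\s?)]");

    // True when the whole input is exactly one character that the class accepts.
    static boolean accepts(String oneChar) {
        return NEGATED.matcher(oneChar).matches();
    }

    public static void main(String[] args) {
        // Ordinary characters are accepted; every character listed in the class is rejected.
        for (String c : new String[] {"a", "7", "(", " ", "?", "-", ")"}) {
            System.out.println("'" + c + "' -> " + accepts(c));
        }
    }
}
```

Only "a" and "7" should come back as accepted, which is why splitting on this class would cut the string at nearly every letter instead of at the hyphens.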

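Your syntax is indeed incorrect, but why will it not compile? Inside a character class the hyphen has special meaning: it denotes a range. Here ?-\s is read as a range from '?' to \s, and because \s is a class shorthand rather than a single character, Java rejects it as an illegal character range (hence the error at the second s, index 8). You can place a literal hyphen as the first or last character of the class. In some regular expression implementations, you can also place it directly after a range. If you place the hyphen anywhere else, you need to escape it in order to add it to your class.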

To fix the compile error, you would simply escape the hyphen ([^(\s?\-\s?)]); however, this regex still does not do what you want.
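The compile behaviour can be checked with a small sketch like the one below. The class name HyphenEscapeDemo and the helper compileError are hypothetical; the exact wording of the exception description may vary between Java versions, so the code only prints it rather than relying on it.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class HyphenEscapeDemo {
    // Returns null when the regex compiles, otherwise the exception's description.
    static String compileError(String regex) {
        try {
            Pattern.compile(regex);
            return null;
        } catch (PatternSyntaxException e) {
            return e.getDescription();
        }
    }

    public static void main(String[] args) {
        // Unescaped hyphen between '?' and \s: rejected at compile time.
        System.out.println("unescaped: " + compileError("[^(\\s?-\\s?)]"));
        // Escaped hyphen: compiles (prints null), although it is still the wrong pattern for splitting.
        System.out.println("escaped:   " + compileError("[^(\\s?\\-\\s?)]"));
    }
}
```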

I'm trying to split a string on white space, -, white space pattern...

Remove the character class and the capturing group from the pattern:

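import java.util.Arrays;

String s = "foo - bar - baz-quz";
// \\s? allows at most one whitespace character on each side of the hyphen
String[] parts = s.split("\\s?-\\s?");
System.out.println(Arrays.toString(parts)); //=> [foo, bar, baz, quz]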


1
On

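\s is already a predefined character class (shorthand for whitespace). Characters inside [] are treated as individual members of a set (or, with [^], as excluded characters), so (, ? and ) lose their special meaning there. \s itself is legal inside [], but wrapping a whole sub-pattern such as \s?-\s? in [] does not make sense.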

Perhaps you meant to use parentheses instead of square brackets?
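For comparison, here is a minimal sketch of the parenthesized version, assuming the goal is the split described in the question. The class name GroupSplitDemo and the helper splitOnHyphen are invented for this example. The group is not strictly needed for String.split, but it shows that parentheses keep \s?-\s? as a sequence rather than a set of characters.

```java
import java.util.Arrays;

class GroupSplitDemo {
    // Splits on an optional whitespace, a hyphen, and an optional whitespace.
    // The parentheses form a group, not a character class.
    static String[] splitOnHyphen(String input) {
        return input.split("(\\s?-\\s?)");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitOnHyphen("foo - bar - baz-quz")));
    }
}
```

With the sample input this should print [foo, bar, baz, quz].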