How do I match a pattern with optional surrounding quotes?


How would one write a regex that matches a pattern that can contain quotes, but if it does, must have matching quotes at the beginning and end?

"?(pattern)"?

This will not work, because it also accepts input that begins with a quote but doesn't end with one (or the other way around).

"(pattern)"|(pattern)

This works, but is repetitive. Is there a better way to do it without repeating the pattern?
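
To make the difference between the two attempts concrete, here is a minimal Python sketch. The sub-pattern \w+ and the helper name accepts are placeholders chosen for illustration; they are not part of the question.

```python
import re

# Hypothetical stand-in for the real sub-pattern.
PATTERN = r"\w+"

# First attempt: each quote is optional on its own.
loose = re.compile(rf'"?({PATTERN})"?')
# Second attempt: the sub-pattern is spelled out twice.
strict = re.compile(rf'"({PATTERN})"|({PATTERN})')

def accepts(regex, text):
    """Return True if the regex matches the whole text."""
    return regex.fullmatch(text) is not None

for sample in ['pattern', '"pattern"', '"pattern', 'pattern"']:
    print(f"{sample!r}: loose={accepts(loose, sample)} strict={accepts(strict, sample)}")
```

The loose version accepts all four samples, while the alternation rejects the two with a quote on only one side.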

7
On BEST ANSWER

You can get a solution without repeating by making use of backreferences and conditionals:

/^(")?(pattern)(?(1)\1|)$/

Matches:

  • pattern
  • "pattern"

Doesn't match:

  • "pattern
  • pattern"

This pattern is somewhat complex, however. It first looks for an optional quote and captures it in group 1 if one is found. Then it matches your pattern. Then it uses conditional syntax to say "if group 1 captured something, match the same text again (\1); otherwise match nothing". The whole pattern is anchored with ^ and $, so it has to match the entire string (or an entire line in multiline mode). That is what keeps input with an unmatched quote from matching; without the anchors, the pattern part of pattern" would still match.

Note that support for conditionals varies by engine and the more verbose but repetitive expressions will be more widely supported (and likely easier to understand).
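
As a quick check, the conditional version can be tried in Python, whose re module supports the (?(1)...) syntax. This is only a sketch: the literal word pattern stands in for the real sub-pattern, and the helper name is_valid is made up for the example.

```python
import re

# Group 1 is the optional opening quote; the conditional requires the
# same quote again only if group 1 took part in the match.
conditional = re.compile(r'^(")?(pattern)(?(1)\1|)$')

def is_valid(text):
    """Return True if text is the bare or the properly quoted pattern."""
    return conditional.match(text) is not None

for sample in ['pattern', '"pattern"', '"pattern', 'pattern"']:
    print(f"{sample!r}: {is_valid(sample)}")
```

Other engines may spell conditionals differently or not support them at all, so treat this as a test for Python only.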


Update: A much simpler version of this regex would be /^(")?(pattern)\1$/, which does not need a conditional. When I was testing this initially, the tester I was using gave me a false negative, which led me to discount it (oops!).

I'll leave the solution with the conditional up for posterity and interest, but this is a simpler version that is more likely to work in a wider variety of engines (backreferences are the only feature being used here which might be unsupported).

0
On

Generally, @Daniel Vandersluis's answer would work. However, in some regex engines a backreference to a group that did not take part in the match always fails. In those engines, when the optional group (")? matches nothing, \1 fails as well and the unquoted pattern is rejected.

In order to avoid this problem a more robust solution would be:

/^("|)(pattern)\1$/

Here the first group always takes part in the match (capturing either a quote or the empty string), so \1 is always defined. This expression can also be adapted if there is some prefix in the expression that you want to capture first:

/^(key)=("|)(value)\2$/
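
The difference between the two forms can be observed in Python's re module, which is one of the engines where a backreference to a group that did not take part in the match fails. A small sketch, with pattern, key and value as literal placeholders and the variable names invented for illustration:

```python
import re

# Optional group: group 1 is unset when there is no opening quote.
optional_group = re.compile(r'^(")?(pattern)\1$')
# Empty alternative: group 1 is always set, possibly to the empty string.
empty_alternative = re.compile(r'^("|)(pattern)\1$')
# Same idea with a captured prefix; the quote is now group 2.
with_prefix = re.compile(r'^(key)=("|)(value)\2$')

print(optional_group.match('pattern'))            # None in Python: \1 refers to an unset group
print(empty_alternative.match('pattern'))         # a match object
print(with_prefix.match('key="value"').group(3))  # value
```

In engines such as JavaScript's, where a backreference to an unset group matches the empty string, both forms should behave the same.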
0
On

Depending on the language you're using, you should be able to use backreferences. Something like this, say:

(["'])(pattern)\1|^(pattern)$

That way, you're requiring that either there are no quotes, or that the SAME quote is used on both ends.
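
A rough Python sketch of how this expression behaves, using the literal word pattern as the sub-pattern; the helper name find is made up for the example:

```python
import re

# Either quote character may open, and \1 requires the same one to close.
either_quote = re.compile(r'''(["'])(pattern)\1|^(pattern)$''')

def find(text):
    """Return the matched text, or None if nothing matches."""
    m = either_quote.search(text)
    return m.group(0) if m else None

for sample in ['pattern', '"pattern"', "'pattern'", '"pattern\'', 'pattern"']:
    print(f"{sample!r}: {find(sample)!r}")
```

Note that only the unquoted alternative is anchored, so a search also finds a quoted pattern inside longer text. If the whole input has to match, wrapping both alternatives in one anchored group is probably what you want.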

0
On

This should be possible with a recursive regex (which takes longer to get right). In the meantime: in Perl, you can build a self-modifying regex using a postponed subexpression, (??{ ... }). I'll leave that as an academic example ;-)

my @stuff = ( '"pattern"', 'pattern', 'pattern"', '"pattern'  );

foreach (@stuff) {
   print "$_ OK\n" if /^
                        (")?
                        \w+
                        (??{defined $1 ? '"' : ''})
                       $
                      /x
}

Result:

"pattern" OK
pattern OK
1
On

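This is quite simple as well: (".+"|.+). Put the quoted alternative first so that it is tried before the unquoted one. Note that .+ also matches text containing a stray quote; use [^"]+ in both alternatives if that has to be rejected.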
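
To see how this alternation treats mismatched quotes, here is a small Python sketch. The second expression, which swaps .+ for [^"]+, is a variation added for comparison and is not part of the answer; both variable names are made up.

```python
import re

# The expression from the answer: quoted alternative first, then anything.
greedy = re.compile(r'(".+"|.+)')
# Variation (an assumption, not from the answer): forbid quotes in the body.
no_inner_quotes = re.compile(r'("[^"]+"|[^"]+)')

for sample in ['pattern', '"pattern"', '"pattern', 'pattern"']:
    g = greedy.fullmatch(sample) is not None
    n = no_inner_quotes.fullmatch(sample) is not None
    print(f"{sample!r}: greedy={g} no_inner_quotes={n}")
```

With whole-string matching, the greedy form accepts all four samples because .+ happily consumes a lone quote, while the [^"]+ form rejects the two unbalanced ones.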