How do I match an expression where I need an "or" of two other sets?
That is, how do I match something of the format
[
[
[ a | b ] |
[ x | y ]
]
]
where a, b, x and y are strings.
I want to match phrases like
a
b
x
y
a x
a y
b x
b y
x a
x b
y a
y b
but not ones like:
a b
x y
z z
I'm trying to use it in Boost Xpressive, so I have the option to use either ECMAScript or Perl-style regular expressions.
You can do it like this:
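A minimal sketch of such a pattern, assuming single-character tokens separated by exactly one space. It uses Python's re module as a stand-in for Boost Xpressive; the pattern relies only on syntax shared by the ECMAScript and Perl flavours. The names PAIR_OR_SINGLE and is_valid are made up for illustration.

```python
import re

# Assumed pattern: the two paired-up alternatives first, the single characters last.
PAIR_OR_SINGLE = re.compile(r"[ab] [xy]|[xy] [ab]|[abxy]")

def is_valid(phrase: str) -> bool:
    """Return True when the whole phrase (not just a part of it) matches."""
    return PAIR_OR_SINGLE.fullmatch(phrase) is not None

# The phrases listed in the question.
accepted = ["a", "b", "x", "y", "a x", "a y", "b x", "b y", "x a", "x b", "y a", "y b"]
rejected = ["a b", "x y", "z z"]

for phrase in accepted + rejected:
    print(f"{phrase!r}: {is_valid(phrase)}")
```

Every phrase in accepted prints True and every phrase in rejected prints False.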
There are 3 choices here:
a, b, x, y (single character)
a or b comes before x or y, with a space in between.
x or y comes before a or b, with a space in between.

I put [abxy] last so that, when you search, the regex tries the paired-up alternatives first before falling back to the single characters. The order is important if you use the regex to search, but it doesn't matter much when you do validation.

Another way to write it:
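One compact form that accepts the same phrases, sketched in Python under the same assumptions (single-character tokens, one space as separator). Each leading character is followed by an optional tail taken from the other set. OPTIONAL_TAIL is an illustrative name, not something from Boost Xpressive.

```python
import re

# Assumed alternative form: a leading character, then an optional
# " <character from the other set>".
OPTIONAL_TAIL = re.compile(r"[ab]( [xy])?|[xy]( [ab])?")

# The ? quantifier is greedy, so the paired form is preferred when it is present.
print(OPTIONAL_TAIL.search("b y").group())   # b y
print(OPTIONAL_TAIL.search("b").group())     # b

# "a b" is not a valid phrase as a whole, although a search still finds the "a".
print(OPTIONAL_TAIL.fullmatch("a b"))        # None
print(OPTIONAL_TAIL.search("a b").group())   # a
```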
That only works for single characters, but you can make it work for strings. For example, let's say you have 4 strings s1, s2, s3, s4:
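A sketch of how the string version might look, assuming the literal strings are "s1" to "s4" and that the groups are capturing, as in the character version. This is one pattern consistent with the description, written with Python's re module for illustration.

```python
import re

# Assumed pattern: (s1|s2) optionally followed by " (s3|s4)", or the reverse.
# Group numbers: 1 = (s1|s2), 2 = ( (s3|s4)), 3 = (s3|s4),
#                4 = (s3|s4), 5 = ( (s1|s2)), 6 = (s1|s2)
pattern = re.compile(r"(s1|s2)( (s3|s4))?|(s3|s4)( (s1|s2))?")

m = pattern.search("s2 s3")
print(m.group(0))   # s2 s3
print(m.group(1))   # s2
print(m.group(3))   # s3

m = pattern.search("s4")
print(m.group(0))   # s4
print(m.group(5))   # None (the optional tail did not take part in the match)
```

If the real strings contain regex metacharacters such as . or +, they would need escaping first (re.escape in Python).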
It searches for:
s1 or s2, optionally (0 or 1 instance) followed by s3 or s4

This covers all the cases of s1, s2, etc. (single string), and s2 s3, s3 s2, etc. (paired up, can reverse the order). The regex above will search for the longer version (paired up) before resorting to the single string, due to the default greedy property of quantifiers.

Note that I am using capturing groups (pattern) in the regex above, which record the position of the text that matches the pattern inside. You can make them non-capturing groups (?:pattern) if you don't need to refer to the text that matches the pattern. This will save you some clock cycles.

(I leave the task of changing capturing groups to non-capturing groups in the other regexes as an exercise. It is as simple as adding ?: after the opening parenthesis.)

Searching or Validation?
If you want to find such a pattern inside a longer text, then the regexes above should work for you.
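When searching, the order of the alternatives decides what is found. The following Python sketch compares two assumed orderings of the character pattern; pairs_first and singles_first are illustrative names.

```python
import re

# Same three alternatives, in two different orders.
pairs_first = re.compile(r"[ab] [xy]|[xy] [ab]|[abxy]")
singles_first = re.compile(r"[abxy]|[ab] [xy]|[xy] [ab]")

text = "a x"
print(pairs_first.search(text).group())     # a x
print(singles_first.search(text).group())   # a  (the single-character alternative wins)

# Scanning a longer text for every occurrence.
print(pairs_first.findall("b y, z z, x"))   # ['b y', 'x']
```

The regex engine tries alternatives from left to right at each position and stops at the first one that succeeds, which is why the single-character alternative hides the longer match when it comes first.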
If you want to validate that the string matches the pattern, you need to use the anchors ^ (matches the beginning of the string) and $ (matches the end of the string) to make sure the whole string follows the exact format.

Note that I surround the regex from the above sections with () (a capturing group, although I only need grouping here). This is because I have an alternation | inside: without the group, ^ would apply only to the first alternative and $ only to the last one.

Extensibility and Limitations
You can add more strings to either the first group or second group as you like:
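As a sketch of how that could look with anchors for validation, the pattern can be assembled from two lists of strings. The helper build_validator and the extra strings s5 and s6 are hypothetical, and Python's re module again stands in for Boost Xpressive.

```python
import re

def build_validator(first_group, second_group):
    """Build an anchored regex: one string from either group, or one string
    from each group separated by a single space, in either order."""
    first = "|".join(re.escape(s) for s in first_group)
    second = "|".join(re.escape(s) for s in second_group)
    # Non-capturing groups (?:...) suffice because nothing is referred to later.
    # The outer group makes ^ and $ apply to both alternatives.
    return re.compile(
        rf"^(?:(?:{first})(?: (?:{second}))?|(?:{second})(?: (?:{first}))?)$"
    )

validator = build_validator(["s1", "s2", "s5"], ["s3", "s4", "s6"])

for phrase in ["s5", "s5 s6", "s6 s5", "s1 s2", "s1 s3 s4"]:
    print(f"{phrase!r}: {validator.match(phrase) is not None}")
```

The first three phrases are accepted; "s1 s2" (two strings from the same group) and "s1 s3 s4" (three tokens) are rejected.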
However, if you want to increase the number of groups, I suggest that you split the string by spaces and loop through the tokens to find the permutations of the group, rather than using regex.
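One possible reading of that suggestion, sketched in Python: split the phrase on spaces, map each token to the group that contains it, and reject the phrase if a token is unknown or a group is used twice. The function is_valid_phrase and the third group are made up for illustration, and the groups are assumed to be disjoint.

```python
def is_valid_phrase(phrase, groups):
    """Return True if every token belongs to some group and no group is used twice.

    Assumes the groups are disjoint, so a token maps to at most one group.
    """
    used = set()
    for token in phrase.split(" "):
        # Index of the group containing this token, or None if there is none.
        index = next((i for i, g in enumerate(groups) if token in g), None)
        if index is None or index in used:
            return False
        used.add(index)
    return True

groups = [{"a", "b"}, {"x", "y"}, {"p", "q"}]

for phrase in ["a", "x a", "q a y", "a b", "z z", ""]:
    print(f"{phrase!r}: {is_valid_phrase(phrase, groups)}")
```

Here "a", "x a" and "q a y" are accepted, while "a b", "z z" and the empty string are rejected. Adding another group only means adding another set to the list, whereas the regex would need an alternative for every ordering of the groups.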