I'm trying to use a PCRE regex to match the following list of words:
- Milk
- Egg
Out of the following strings:
milk, goatmilk, goat milk, cow milk, watch out for ( milk, eggs), egg, cornstarch
milk. goatmilk. goat milk. cow milk. watch out for ( milk, eggs). egg. cornstarch
milk goatmilk goat milk cow milk watch out for ( milk, eggs). egg cornstarch
This would be an easy exercise, but sadly it must not match any of these words:
- goatmilk
- goat milk
In the above case the string should match because of the words:
- milk
- egg
- eggs
But if the string does not contain any of those words, it should not match, e.g.:
sugar, wheat, goatmilk, goat milk, cornstarch
I've tried to apply these, but without any success:
- Regex match these words, but exclude matches with these
- Regex to match a pattern, but exclude a set of words
- Regex to match all words except a given list
The closest regex I got from the resources above was:
\b(?!(?:goatmilk|goat\smilk))(egg|milk)\b
This will still match every occurrence of the word milk and, worse, it will skip the word eggs because of the word boundaries. If I remove the word boundary, it will also match goatmilk.
I already thought of using two regular expressions: one to match all the words and another to check the matches for excluded words. This would work perfectly if not for the space between goat and milk, as the goat part would not be in the match.
If there is no way to do this, I'll use PHP to explode on spaces, walk through the array and, when a match is found, check the previous index value to see whether the combination forms an excluded word, which mitigates the space issue. However, I would rather not do that, as I believe this option is quite ugly :(
If you just have to avoid returning milk that is part of goatmilk or goat milk, you can use a (*SKIP)(*FAIL) regex:
\bgoat\s*milk\b(*SKIP)(*FAIL)|\b(?:eggs?|milk)\b
The \bgoat\s*milk\b(*SKIP)(*FAIL) branch will match goatmilk or goat milk and will discard the match due to these 2 PCRE verbs. The \b(?:eggs?|milk)\b branch will return the other egg, eggs and milk matches as whole words.
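To try the two branches out from PHP, here is a minimal sketch rather than a definitive implementation. The helper name findAllowedWords is hypothetical, and the /i flag is an assumption added so that capitalised words such as Milk and Egg are also found. The sample strings are taken from the question.

```php
<?php
// First branch: consume "goatmilk" / "goat milk" and throw the match away
// with (*SKIP)(*FAIL). Second branch: return egg, eggs and milk as whole words.
// The /i flag is an assumption (case-insensitive matching), not part of the answer.
$pattern = '/\bgoat\s*milk\b(*SKIP)(*FAIL)|\b(?:eggs?|milk)\b/i';

// Hypothetical helper: returns every allowed word found in $text.
function findAllowedWords(string $pattern, string $text): array
{
    preg_match_all($pattern, $text, $matches);
    return $matches[0]; // full-pattern matches only
}

$withMatches    = 'milk, goatmilk, goat milk, cow milk, watch out for ( milk, eggs), egg, cornstarch';
$withoutMatches = 'sugar, wheat, goatmilk, goat milk, cornstarch';

// The first string contains allowed words, so it should be reported as a match.
echo implode(', ', findAllowedWords($pattern, $withMatches)), PHP_EOL;

// The second string only contains excluded words, so preg_match() finds nothing.
echo preg_match($pattern, $withoutMatches) === 1 ? 'match' : 'no match', PHP_EOL;
```

Note that cow milk still yields a milk match, since only the goat variants are excluded by the first branch. Further excluded phrases could be added to that branch as extra alternatives.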