To save anyone from wasting their time offering an alternative solution: I have to use regular expressions for this task.
I am trying to write a regular expression to match a base word that has the prefix "<" (AND OR) the suffix ">" but NOT to match if the base word has neither prefix nor suffix.
This is not a simple case of matching either a "<" or a ">" as this character may change or be part of a group.
Example.
For this example the group of base words is (base|text|word); in real life this list could be quite long.
Out of these candidates in the input text file...
text
<text
text>
<text>
...I want to match the following...
<text
text>
<text>
...but NOT match...
text
In spoken English my RegEx is looking for any of the base words prefixed with a "<" (AND OR) suffixed with ">" but not to match the base word if it has neither prefix/suffix.
As mentioned above it is not a case of matching a literal "<" or a ">" as these characters may be different or part of a group.
Out of all the attempts I have made I cannot get this to work without catching the base word if it appears alone without a prefix or suffix.
As I became increasingly flustered while working on this problem I failed to retain all my previous attempts. My efforts will be of little value to anyone here as they all failed and when I ran out of ideas I ended up guessing.
The following are some examples.
(text) = This will catch "text"
(\<)(text) = This will catch "<text"
(text)(\>) = This will catch "text>"
(\<)(text)(\>) = This will catch "<text>"
(\<|)(text)(|\>) = This is the closest as it will catch "<text" "text>" "<text>" but it will also catch "text".
I have also experimented with look-ahead and look-behind, but I was not able to look behind and jump over the base word to see if there was a prefix.
The only workaround is to use 2 RegEx. The first looks for (\<)(text) and the second looks for (text)(\>). However, this means running the RegEx twice, which is inefficient, and I really want to solve this problem.
I have been provided with a standalone custom executable (Windows) to run these RegEx and I have no idea what RegEx engine it uses, but common RegEx commands seem to work OK.
Thank you and any help would be gratefully received.
You can use
(<)?text(?(1)>?|>)
See the regex demo.
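As a quick check of the behaviour, here is a minimal Python sketch that runs the conditional pattern against the four candidates from the question. It assumes an engine that supports conditionals, which Python's `re` does; the asker's engine is unknown. For engines without conditionals, a plain alternation of the two workaround patterns is included as a fallback. The names `BASE_WORDS`, `conditional`, `portable` and `find` are illustrative only.

```python
import re

# Base words from the question's example list; the real list may be longer.
BASE_WORDS = ["base", "text", "word"]
alternation = "|".join(re.escape(word) for word in BASE_WORDS)

# Conditional form: group 1 optionally captures "<".
# If group 1 matched, ">" is optional; otherwise ">" is required.
conditional = re.compile(rf"(<)?(?:{alternation})(?(1)>?|>)")

# Fallback (an assumption, not part of the answer above) for engines
# without conditionals: "<word" with optional ">", or "word>".
portable = re.compile(rf"<(?:{alternation})>?|(?:{alternation})>")


def find(pattern, sample):
    """Return the matched text, or None when the pattern does not match."""
    match = pattern.search(sample)
    return match.group(0) if match else None


if __name__ == "__main__":
    for sample in ["text", "<text", "text>", "<text>"]:
        print(repr(sample), find(conditional, sample), find(portable, sample))
```

Both patterns scan the input once, so the two-pass workaround from the question is not needed in either case.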
Details:
(<)? - Group 1 (optional): matches a < optionally
text - matches a text string
(?(1)>?|>) - a conditional construct: if Group 1 matched, an optional > char is matched; else, a > must be matched.
If you need to use word boundaries, use them like in