I'm trying to use a negative look-ahead in a Perl regular expression to exclude certain strings from my matches. Please give me your advice.
I am trying to match the strings that do not contain -sm, -sp, or -sa.
REGEX:
hostname .+-(?!sm|sp|sa).+
INPUT:
hostname 9amnbb-rp01c
hostname 9tlsys-eng-vm-r04-ra01c
hostname 9tlsys-eng-vm-r04-sa01c
hostname 9amnbb-sa01
hostname 9amnbb-aaa-sa01c
Expected Output:
hostname 9amnbb-rp01c - SELECTED
hostname 9tlsys-eng-vm-r04-ra01c - SELECTED
hostname 9tlsys-eng-vm-r04-sa01c
hostname 9amnbb-sa01
hostname 9amnbb-aaa-sa01c
However, this is the actual output I got:
hostname 9amnbb-rp01c - SELECTED
hostname 9tlsys-eng-vm-r04-ra01c - SELECTED
hostname 9tlsys-eng-vm-r04-sa01c - SELECTED
hostname 9amnbb-sa01
hostname 9amnbb-aaa-sa01c - SELECTED
Please help me.
p.s.: I used Regex Coach to visualize my result.
Move the .+- inside of the lookahead:
Rubular: http://www.rubular.com/r/OuSwOLHhEy
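As a rough sketch of this advice, the snippet below assumes the fix takes the form hostname (?!.+-(?:sm|sp|sa)).+ (my reading of "move the .+- inside of the lookahead", not necessarily the exact expression behind the Rubular link). It uses Python's re module, which handles these constructs the same way as Perl for these patterns, and runs both the question's regex and the assumed fix over the sample input. The names FIXED, ORIGINAL, LINES and selected are only for illustration.

```python
import re

# Assumed form of the fix: ".+-" moved inside the negative lookahead.
FIXED = re.compile(r"hostname (?!.+-(?:sm|sp|sa)).+")
# Regex from the question, kept for comparison.
ORIGINAL = re.compile(r"hostname .+-(?!sm|sp|sa).+")

# Sample input from the question.
LINES = [
    "hostname 9amnbb-rp01c",
    "hostname 9tlsys-eng-vm-r04-ra01c",
    "hostname 9tlsys-eng-vm-r04-sa01c",
    "hostname 9amnbb-sa01",
    "hostname 9amnbb-aaa-sa01c",
]


def selected(pattern, lines):
    """Return the lines that the pattern matches in full."""
    return [line for line in lines if pattern.fullmatch(line)]


fixed_hits = selected(FIXED, LINES)
original_hits = selected(ORIGINAL, LINES)

print("fixed:   ", fixed_hits)
print("original:", original_hits)
```

With the assumed fix, only the first two hostnames are selected. The question's regex also selects the third and fifth lines, which matches the actual output reported in the question.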
Your current expression is not working properly because when the .+- is outside of the lookahead, it can backtrack until the lookahead no longer causes the regex to fail. For example, with the string hostname 9amnbb-aaa-sa01c and the regex hostname .+-(?!sm|sp|sa).+, the first .+ would match 9amnbb, the lookahead would see aa as the next two characters and continue, and the second .+ would match aaa-sa01c.
An alternative to my current regex would be the following:
This would prevent the backtracking because no - can occur after the lookahead. The non-greedy ? is used so that this would work correctly in a multiline global mode.
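To make the backtracking visible, and to try out an alternative of the kind described, here is a small sketch in Python's re module. It assumes the alternative looks like hostname .+-(?!sm|sp|sa)[^-]+?$, which is a guess consistent with the description (no - after the lookahead, a non-greedy quantifier, multiline mode) rather than a confirmed expression. The capture groups added to the question's regex are only there to show where each .+ ends up.

```python
import re

# Question's regex with capture groups added, to show how the match is split.
ORIGINAL = re.compile(r"hostname (.+)-(?!sm|sp|sa)(.+)")

# Assumed alternative: after the lookahead only non-dash characters are
# allowed up to the end of the line, so an earlier dash cannot be used.
ALTERNATIVE = re.compile(r"hostname .+-(?!sm|sp|sa)[^-]+?$", re.MULTILINE)

# Sample input from the question, as one multiline string.
TEXT = "\n".join([
    "hostname 9amnbb-rp01c",
    "hostname 9tlsys-eng-vm-r04-ra01c",
    "hostname 9tlsys-eng-vm-r04-sa01c",
    "hostname 9amnbb-sa01",
    "hostname 9amnbb-aaa-sa01c",
])

# The first .+ backs off to the earlier dash so that the lookahead passes.
groups_seen = ORIGINAL.search("hostname 9amnbb-aaa-sa01c").groups()
print(groups_seen)

# Global, multiline search with the assumed alternative.
matches = ALTERNATIVE.findall(TEXT)
print(matches)
```

The first print shows the split described above: the first .+ holds 9amnbb and the second holds aaa-sa01c. With the assumed alternative, only the first two hostnames are returned.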