I need to match the following statements:
Hi there John
Hi there John Doe (jdo)
Without matching these:
Hi there John Doe is here
Hi there John is here
So I figured that this regexp would work:
^Hi there (.*)(?! is here)$
But it does not, and I am not sure why. I believed this might be caused by the capturing group (.*), so I thought that making the * operator lazy would solve the problem... but no. This regexp doesn't work either:
^Hi there (.*?)(?! is here)$
Can anyone point me in the direction of a solution?
Solution
To retrieve a sentence without " is here" at the end (like "Hi there John Doe (the second)"), you should use the following (author @Thorbear):
^Hi there (.*$)(?<! is here)
And for a sentence that contains some data in the middle (like "Hi there John Doe (the second) is here", with "John Doe (the second)" being the desired data), simple grouping would suffice:
^Hi there (.*?) is here$
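The two patterns from the solution can be tried out with a short script. This is only a sketch in Python's re module (the question does not name a regex flavor, so Python is an assumption), and the helper name extract is made up for illustration:

```python
import re

# Pattern from the solution: accept lines that do NOT end in " is here".
NOT_HERE = re.compile(r"^Hi there (.*$)(?<! is here)")
# Pattern from the solution: capture the data in front of a trailing " is here".
IS_HERE = re.compile(r"^Hi there (.*?) is here$")


def extract(line):
    """Return (kind, captured text), or None if neither pattern matches.

    `extract` is a hypothetical helper, not part of the original post.
    """
    m = NOT_HERE.search(line)
    if m:
        return ("plain", m.group(1))
    m = IS_HERE.search(line)
    if m:
        return ("is here", m.group(1))
    return None


for line in ("Hi there John Doe (the second)",
             "Hi there John Doe (the second) is here"):
    print(line, "->", extract(line))
```

In both cases the captured group is John Doe (the second); only the pattern that matched differs.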
Everyone, thank you for your replies.
The .* will find a match regardless of whether it is greedy or lazy. For $ to match, .* has to consume everything up to the end of the line, and at that position there is (naturally) no following " is here", so the negative lookahead always succeeds. A solution to this could be to use a lookbehind instead (checking, at the end of the line, whether the preceding characters match " is here"):
^Hi there (.*)(?<! is here)$
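The difference between the lookahead and the lookbehind can be checked against the four sample lines from the question. This is a minimal sketch that assumes Python's re module; the helper name matching is hypothetical:

```python
import re

LINES = [
    "Hi there John",
    "Hi there John Doe (jdo)",
    "Hi there John Doe is here",
    "Hi there John is here",
]

# Original attempt: the negative lookahead is tested at the end of the line,
# where nothing follows, so it can never reject a line.
lookahead = re.compile(r"^Hi there (.*)(?! is here)$")
# Suggested fix: the negative lookbehind inspects the text just before the end.
lookbehind = re.compile(r"^Hi there (.*)(?<! is here)$")


def matching(pattern, lines):
    """Return the lines that the pattern matches (hypothetical helper)."""
    return [line for line in lines if pattern.search(line)]


print(matching(lookahead, LINES))   # all four lines match
print(matching(lookbehind, LINES))  # only the first two lines match
```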
Edit
As suggested by Alan Moore, further changing the pattern to
^Hi there (.*$)(?<! is here)
can improve the performance of the pattern, because the capturing group must then consume the rest of the string before the lookbehind is attempted, which saves unnecessary backtracking.