I'm trying to extract tokens that satisfy several conditions. I'm using lookaheads to implement the following two:
- The tokens must be numeric or alphanumeric (i.e., they must contain at least one digit). They can contain a few special characters such as '-', '/', '\', '.', '_', etc.
I want to match strings like: 165271, agya678, yah@123, kj*12-
- The tokens can't have consecutive special characters like:
ajh12-&
I don't want to match strings like: ajh12-&, 671%&i^
I'm using a positive lookahead for the first condition: (?=\w*\d\w*) and a negative lookahead for the second condition: (?!=[\_\.\:\;\-\\\/\@\+]{2})
I'm not sure how to combine these two look-ahead conditions.
Any suggestions would be helpful. Thanks in advance.
Edit 1:
I would like to extract complete tokens that are part of a larger string too (i.e., they may appear in the middle of the string).
I would like to match all the tokens in the string:
165271 agya678 yah@123 kj*12-
and none of the tokens (not even a part of a token) in the string: ajh12-& 671%&i^
To force the regex to consider whole tokens, I've also used \b in the above regexes: (?=\b\w*\d\w*\b) and (?!=\b[\_\.\:\;\-\\\/\@\+]{2}\b)
You can use
Regex demo
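As a rough sketch of how the two lookaheads might be combined for a whole-line match in Python: the pattern and the set of special characters in the character class below are assumptions based on the examples in the question, not necessarily the exact pattern intended here, and is_valid_token is a hypothetical helper name.

```python
import re

# Assumed set of "special" characters, taken from the question's class
# plus the extra characters that appear in its examples (*, &, %, ^).
TOKEN_RE = re.compile(
    r"^"
    r"(?=[^\d\n]*\d)"                 # positive lookahead: at least one digit on the line
    r"(?!.*[_.:;\\/@+*&%^-]{2})"      # negative lookahead: no two consecutive specials
    r".*$"
)


def is_valid_token(token: str) -> bool:
    """Return True if the whole string satisfies both lookahead conditions."""
    return TOKEN_RE.match(token) is not None


for candidate in ["165271", "agya678", "yah@123", "kj*12-", "ajh12-&", "671%&i^"]:
    print(candidate, is_valid_token(candidate))
```

Both lookaheads are anchored at the start of the string, so they are checked at the same position and neither consumes any characters; the final .* does the actual matching.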
The positive lookahead (?=[^\d\n]*\d) uses a negated character class to match any character except a digit or a newline, and then matches a digit.
Note that you also have to add * and that most characters don't have to be escaped in the character class.
Using contrast, you could also turn the first .* into a negated character class to prevent some backtracking.
Edit
Without the anchors, you can use whitespace boundaries to the left (?<!\S) and to the right (?!\S)
Regex demo
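A minimal sketch of the whitespace-boundary variant in Python, for pulling tokens out of a larger string. It assumes the lookaheads are restricted to a single token by using [^\d\s] and \S instead of [^\d\n] and the dot; that restriction, the set of special characters, and the name extract_tokens are assumptions for illustration.

```python
import re

# Assumed set of "special" characters, based on the question's examples.
TOKEN_IN_TEXT_RE = re.compile(
    r"(?<!\S)"                        # left boundary: start of string or whitespace
    r"(?=[^\d\s]*\d)"                 # the token contains at least one digit
    r"(?!\S*[_.:;\\/@+*&%^-]{2})"     # the token has no two consecutive specials
    r"\S+"                            # consume the token itself
    r"(?!\S)"                         # right boundary: end of string or whitespace
)


def extract_tokens(text: str) -> list[str]:
    """Return every whitespace-delimited token that passes both conditions."""
    return TOKEN_IN_TEXT_RE.findall(text)


print(extract_tokens("165271 agya678 yah@123 kj*12-"))
# ['165271', 'agya678', 'yah@123', 'kj*12-']
print(extract_tokens("ajh12-& 671%&i^"))
# []
```

Because the left boundary only succeeds at the start of a token, a token that fails either lookahead is rejected as a whole, and no part of it is matched.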