Regex for strings that do not start with any character of a group and do not contain any of multiple substrings


This might be simple, but I have spent a ridiculous number of hours trying to get there by myself. I need a regex that matches strings that:

  1. Do not start with '_' or '-'.
  2. Do not end with '_' or '-'.
  3. Do not contain '__' or '--' or '_-' or '-_'.

I spent hours in a regex builder and still can't get there. I can write a pattern for each of those criteria separately, but I can't combine them.

Update

I forgot to mention it must also only contain these characters [a-zA-Z0-9_-].

DuesserBaest On BEST ANSWER

Try:

^(?![-_])(?:[a-zA-Z0-9]|[-_](?![-_]))+(?<![_-])$

See: regex101


Explanation

  • ^ start of string, from where...
  • (?![-_]) the next character is not a - or _
  • (?: then a repetition of
    • [a-zA-Z0-9] a letter or digit, which is allowed anywhere
    • | or
    • [-_] a - or _
    • (?![-_]) as long as it is not followed by another - or _
  • )+ one or more times
  • (?<![_-])$ with the last character before the end of the string also not being - or _
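
As a quick check, here is a minimal Python sketch that runs the pattern above against a few inputs. The helper name `is_valid` and the sample strings are illustrative assumptions, not part of the original answer.

```python
import re

# Pattern copied verbatim from the answer above.
PATTERN = re.compile(r"^(?![-_])(?:[a-zA-Z0-9]|[-_](?![-_]))+(?<![_-])$")


def is_valid(s: str) -> bool:
    """Hypothetical helper: True if the whole string satisfies the pattern."""
    return PATTERN.fullmatch(s) is not None


if __name__ == "__main__":
    # Sample inputs chosen for illustration only.
    for sample in ["a_b-c", "abc", "_abc", "abc-", "a__b", "a-_b", "a b", ""]:
        print(repr(sample), is_valid(sample))
```

Because the repeated group only allows [a-zA-Z0-9] or a single - or _, this pattern also covers the character-set restriction from the update.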
Tim Biegeleisen On

We can use the following pattern:

^[^_-](?!.*(?:__|--|_-|-_))(?:.*[^_-])?$

This pattern says to match:

  • ^ from the start of the input
    • [^_-] match any leading character OTHER than underscore or hyphen
    • (?!.*(?:__|--|_-|-_)) assert that __, --, _-, or -_ do not occur
    • (?:.*[^_-])? then match anything so long as it does not end in _ or -
  • $ end of the input

Demo
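
The pattern above can be exercised with a short Python sketch. Note that [^_-] and . accept any character, so on its own the pattern does not enforce the [a-zA-Z0-9_-] restriction from the update. The `RESTRICTED` variant below is an assumption of mine, not part of the answer: it swaps the open-ended classes for explicit ones.

```python
import re

# Pattern copied verbatim from the answer above.
ORIGINAL = re.compile(r"^[^_-](?!.*(?:__|--|_-|-_))(?:.*[^_-])?$")

# Assumed variant (not from the answer): same structure, but every
# character must come from [a-zA-Z0-9_-].
RESTRICTED = re.compile(
    r"^[a-zA-Z0-9](?!.*(?:__|--|_-|-_))(?:[a-zA-Z0-9_-]*[a-zA-Z0-9])?$"
)

if __name__ == "__main__":
    # Sample inputs chosen for illustration only.
    for sample in ["a_b-c", "a", "_abc", "abc-", "a-_b", "a b"]:
        print(
            repr(sample),
            ORIGINAL.fullmatch(sample) is not None,
            RESTRICTED.fullmatch(sample) is not None,
        )
```

The two patterns agree on the three original rules and differ only on inputs such as "a b" that contain characters outside the allowed set.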

Hao Wu On

According to your description, here's a regex that should work.

^(?![_-]|.*[_-]{2}|.*[_-]$)
^                 # beginning of the string
(?!...|...|...)   # negative lookahead with OR conditions; none of the following alternatives may match
[_-]              # 1. starting with `_` or `-`
.*[_-]{2}         # 2. any consecutive `-` or `_` (`_-`, `-_`, `--` or `__`)
.*[_-]$           # 3. ending with `_` or `-`

See the tests
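
Since this pattern consists only of an anchor and a lookahead, it consumes no characters: a successful match is zero-width. The Python sketch below illustrates that, which is why it uses `re.match` rather than `re.fullmatch`. The helper name `passes` and the `WITH_CHARSET` variant (which appends a consuming part to enforce the character set from the update) are assumptions, not part of the answer.

```python
import re

# Pattern copied verbatim from the answer above (lookahead only).
LOOKAHEAD_ONLY = re.compile(r"^(?![_-]|.*[_-]{2}|.*[_-]$)")

# Assumed variant (not from the answer): the same lookahead followed by a
# consuming part, so the whole string must come from [a-zA-Z0-9_-].
WITH_CHARSET = re.compile(r"^(?![_-]|.*[_-]{2}|.*[_-]$)[a-zA-Z0-9_-]+$")


def passes(s: str) -> bool:
    """Hypothetical helper: True if the lookahead-only pattern matches at the start."""
    return LOOKAHEAD_ONLY.match(s) is not None


if __name__ == "__main__":
    # Sample inputs chosen for illustration only.
    for sample in ["a_b-c", "_abc", "abc-", "a-_b", "a b", ""]:
        print(repr(sample), passes(sample), WITH_CHARSET.fullmatch(sample) is not None)
```

Because nothing is consumed, the lookahead-only form is best used as a yes/no test; it also lets the empty string through, which the assumed `WITH_CHARSET` variant rejects.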

Nick On

You can match the strings you are interested in with this regex:

^[^_-](?:[^_-]|[_-][^_-])*$

This matches:

  • ^ : beginning of string
  • [^_-] : a character which is not an _ or -
  • (?:[^_-]|[_-][^_-])* : zero or more of either:
    • [^_-] : a character which is not an _ or -; or
    • [_-][^_-] : an _ or - which is followed by a character which is not an _ or -;
  • $ : end of string

Note that the second part of the alternation requires a character to follow a - or _; this ensures the string cannot end with - or _.

This will be more efficient than using lookaheads.

Demo on regex101

Note that as pointed out by @HaoWu in the comments, the alternation can be simplified to an optional _ or - followed by a character which is not an _ or - i.e.

^[^_-](?:[_-]?[^\n_-])*$

This is slightly more efficient again.

Demo on regex101
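
To compare the two forms above, here is a small Python sketch that enumerates every short string over a tiny alphabet and checks both patterns against a plain-Python restatement of the three rules. The helper names (`reference`, `all_strings`, `disagreements`), the alphabet, and the length limit are illustrative assumptions. The alphabet contains no newline, so the [^\n_-] in the simplified form makes no difference here.

```python
import re
from itertools import product

# Both patterns copied verbatim from the answer above.
ALTERNATION = re.compile(r"^[^_-](?:[^_-]|[_-][^_-])*$")
SIMPLIFIED = re.compile(r"^[^_-](?:[_-]?[^\n_-])*$")


def reference(s: str) -> bool:
    """Plain-Python restatement of the three rules from the question."""
    if not s or s[0] in "_-" or s[-1] in "_-":
        return False
    # No two adjacent characters may both be separators.
    return not any(a in "_-" and b in "_-" for a, b in zip(s, s[1:]))


def all_strings(alphabet: str, max_len: int):
    """Yield every string over `alphabet` with length 0..max_len."""
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            yield "".join(chars)


def disagreements(alphabet: str = "ab_-", max_len: int = 5) -> list:
    """Return the strings on which the two patterns and the reference differ."""
    return [
        s
        for s in all_strings(alphabet, max_len)
        if not (
            bool(ALTERNATION.fullmatch(s))
            == bool(SIMPLIFIED.fullmatch(s))
            == reference(s)
        )
    ]


if __name__ == "__main__":
    print(disagreements())
```

An exhaustive check over a small alphabet is not a proof, but it is a cheap way to catch a mistake when rewriting one form of a pattern into another.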