Validate string to contain only qualifying characters and a specific optional substring in the middle


I'm trying to write a regular expression in PHP. I can get it working in other languages, but not in PHP.

I want to validate item names in an array:

  • They can contain upper and lower case letters, numbers, underscores, and hyphens.
  • They can contain => as an exact string, not separate characters.
  • They cannot start with =>.
  • They cannot finish with =>.

My current code:

$regex = '/^[a-zA-Z0-9-_]+$/';    // contains A-Z a-z 0-9 - _
//$regex = '([^=>]$)';  // doesn't end with =>
//$regex = '~.=>~';  // doesn't start  with =>

if (preg_match($regex, 'Field_name_true2')) {
    echo 'true';
} else {
    echo 'false';
}
// Sample inputs:
// Field=>Value-True   (should pass)
// =>False_name        (should fail)
// Bad_name_2=>        (should fail)

Barmar On BEST ANSWER

Use negative lookarounds: a negative lookahead (?!=>) at the beginning prohibits a leading =>, and a negative lookbehind (?<!=>) at the end prohibits a trailing =>.

^(?!=>)(?:[a-zA-Z0-9-_]+(=>)?)+(?<!=>)$
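
As a rough sketch, the pattern above could be dropped into the question's preg_match() check like this. The ~ delimiters and the $names and $results variables are assumptions made for illustration; the sample strings are the ones from the question.

```php
<?php
// Lookaround pattern from the answer, wrapped in ~ delimiters (delimiter choice is an assumption).
$regex = '~^(?!=>)(?:[a-zA-Z0-9-_]+(=>)?)+(?<!=>)$~';

// Sample names taken from the question.
$names = ['Field_name_true2', 'Field=>Value-True', '=>False_name', 'Bad_name_2=>'];

$results = [];
foreach ($names as $name) {
    // preg_match() returns 1 on a match and 0 when the pattern does not match.
    $results[$name] = preg_match($regex, $name) === 1;
    echo $name, ' : ', $results[$name] ? 'true' : 'false', PHP_EOL;
}
```

With these inputs, the first two names should be accepted and the last two rejected.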

The fourth bird On

For the example data, you can use

^[a-zA-Z0-9_-]+=>[a-zA-Z0-9_-]+$

The pattern matches:

  • ^ Start of string
  • [a-zA-Z0-9_-]+ Match 1+ times any of the listed ranges or characters (so the string cannot start with =>)
  • => Match literally
  • [a-zA-Z0-9_-]+ Match again 1+ times any of the listed ranges or characters
  • $ End of string

If you want to allow for optional spaces:

^\h*[a-zA-Z0-9_-]+\h*=>\h*[a-zA-Z0-9_-]+\h*$

Note that [a-zA-Z0-9_-] can be written as [\w-]
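
A short sketch comparing the two patterns, using the [\w-] shorthand. The variable names $strict and $spaced and the ~ delimiters are assumptions for illustration. Note that both patterns make => mandatory, so a plain name without an arrow is rejected.

```php
<?php
// Both patterns come from the answer; [\w-] stands in for [a-zA-Z0-9_-].
$strict = '~^[\w-]+=>[\w-]+$~';               // no whitespace allowed
$spaced = '~^\h*[\w-]+\h*=>\h*[\w-]+\h*$~';   // \h* permits optional horizontal whitespace

$a = preg_match($strict, 'Field=>Value-True');     // 1
$b = preg_match($strict, 'Field => Value-True');   // 0, spaces are not allowed here
$c = preg_match($spaced, 'Field => Value-True');   // 1
$d = preg_match($strict, 'Field_name_true2');      // 0, => is required by this pattern

var_dump($a, $b, $c, $d);
```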

Jan On

Well, your character ranges are equivalent to \w (plus the hyphen), so you could use

^(?!=>)(?:(?!=>$)(?:[-\w]|=>))+$

This construct uses a "tempered greedy token": before each character or => is consumed, the lookahead (?!=>$) checks that the string does not end with => at that position.
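
To see what this pattern accepts, here is a small sketch. The ~ delimiters, the $cases list, and the $results array are assumptions for illustration. Because => is just one alternative inside a repeated group, repeated and consecutive arrows such as a=>b=>c and a=>=>b also pass.

```php
<?php
$regex = '~^(?!=>)(?:(?!=>$)(?:[-\w]|=>))+$~';

// The first, fourth and fifth entries are from the question; the rest are made-up probes.
$cases = ['Field=>Value-True', 'a=>b=>c', 'a=>=>b', '=>False_name', 'Bad_name_2=>', 'a>b'];

$results = [];
foreach ($cases as $case) {
    $results[$case] = preg_match($regex, $case) === 1;
    echo str_pad($case, 20), $results[$case] ? 'valid' : 'invalid', PHP_EOL;
}
```

The first three cases should be reported as valid and the last three as invalid (a lone > is not in the allowed set).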


For something shinier, more complicated, and surely over the top, you could use subroutines (the pattern needs the x flag because of its free-spacing layout and comments):

(?(DEFINE)
    (?<chars>[-\w])             # equals to A-Z, a-z, 0-9, _, -
    (?<af>=>)                   # "arrow function"
    (?<item>
        (?!(?&af))              # no af at the beginning
        (?:(?&af)?(?&chars)++)+
        (?!(?&af))              # no af at the end
    )
)
^(?&item)$
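
A possible way to run this from PHP is sketched below. The nowdoc, the ~ delimiters, and the $cases and $results variables are assumptions for illustration. The x modifier tells PCRE to ignore the layout whitespace and the # comments.

```php
<?php
// Nowdoc keeps the pattern verbatim; whitespace before the opening ~ delimiter is ignored.
$regex = <<<'REGEX'
~
(?(DEFINE)
    (?<chars>[-\w])             # A-Z, a-z, 0-9, _, -
    (?<af>=>)                   # the literal =>
    (?<item>
        (?!(?&af))              # no => at the beginning
        (?:(?&af)?(?&chars)++)+
        (?!(?&af))              # no => at the end
    )
)
^(?&item)$
~x
REGEX;

$cases = ['Field_name_true2', 'Field=>Value-True', 'a=>=>b', '=>False_name', 'Bad_name_2=>'];

$results = [];
foreach ($cases as $case) {
    $results[$case] = preg_match($regex, $case) === 1;
    echo str_pad($case, 20), $results[$case] ? 'valid' : 'invalid', PHP_EOL;
}
```

Unlike the tempered-token version, this one requires at least one allowed character after every =>, so a=>=>b is rejected.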

mickmackusa On

There is absolutely no requirement for lookarounds here.

Anchors and an optional group will suffice.

/^[\w-]+(?:=>[\w-]+)?$/
        ^^^^^^^^^^^^^-- this whole non-capturing group is optional

This allows full strings consisting exclusively of [0-9a-zA-Z_-] characters, optionally split ONCE by =>.

The non-capturing group may occur zero or one time.

In other words, => may occur after one or more [\w-] characters, but if it does occur, it MUST be immediately followed by one or more [\w-] characters until the end of the string.


To cover some of the ambiguity in the question requirements:

  • If foo=>bar=>bam is valid, then use /^[\w-]+(?:=>[\w-]+)*$/ which replaces ? (zero or one) with * (zero or more).

  • If foo=>=>bar is valid, then use /^[\w-]+(?:(?:=>)+[\w-]+)*$/ which replaces => (must occur once) with (?:=>)+ (the substring must occur one or more times).
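
The three variants can be compared side by side with a sketch like the following. The labels once, repeated and stacked, along with the $inputs list and the $table array, are hypothetical names chosen for illustration.

```php
<?php
// Patterns from the answer; the array keys are illustrative labels only.
$patterns = [
    'once'     => '/^[\w-]+(?:=>[\w-]+)?$/',           // at most one =>
    'repeated' => '/^[\w-]+(?:=>[\w-]+)*$/',           // any number of separated =>
    'stacked'  => '/^[\w-]+(?:(?:=>)+[\w-]+)*$/',      // consecutive => also allowed
];

$inputs = ['foo', 'foo=>bar', 'foo=>bar=>bam', 'foo=>=>bar', '=>foo', 'foo=>'];

$table = [];
foreach ($patterns as $label => $regex) {
    foreach ($inputs as $input) {
        $table[$label][$input] = preg_match($regex, $input) === 1;
        printf("%-9s %-14s %s\n", $label, $input, $table[$label][$input] ? 'valid' : 'invalid');
    }
}
```

All three variants reject a leading or trailing =>; they differ only in how many arrows they tolerate in the middle.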