RegExp match a word NOT spelled with the case specified (like \bOK\b but would only match oK, Ok or ok)


I need to match words/abbreviations that are NOT written with their "standard" casing, whatever that casing may be.

It should work similarly to \bOK\b, but match in this manner:

Match here -> Ok
Match here -> oK
Match here -> ok
Would not match here -> OK

I have a lot of words that need to be checked this way, and for anything longer than a couple of characters the number of case permutations to check grows very large, very quickly.

So I am looking for a more elegant solution than writing a regexp for every permutation, or for each common wrong casing, like this:

(?-i)\bSaas\b
(?-i)\bsAAS\b
(?-i)\bSAAS\b
...

Instead, I would like a single pattern that matches the word whenever its case differs from "SaaS", i.e. one that finds any (kind of misspelled) non-standard casing.


Christian Legge On

This is not a problem that regex alone is well suited to. Match the word case-insensitively, then do a second check to see whether the match is the exact spelling you don't want.

// pseudocode
let pattern = "\b(SaaS)\b"

let match = regex.match(pattern, str, caseSensitive=false)

if (match && match.group(1) != "SaaS") {
    // the word matched, but with a casing other than "SaaS"
}
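
The two-step check above could look roughly like this in Python. This is only a sketch: the function name find_nonstandard and its parameters are made up for illustration, and the word boundaries are added on the assumption that whole words are wanted, as in the question.

```python
import re

def find_nonstandard(text, standard):
    """Return occurrences of `standard` in `text` whose casing differs from it."""
    # Step 1: find every occurrence of the word, ignoring case.
    pattern = re.compile(rf"\b{re.escape(standard)}\b", re.IGNORECASE)
    # Step 2: drop the occurrences that already use the standard spelling.
    return [m.group(0) for m in pattern.finditer(text) if m.group(0) != standard]
```

Calling find_nonstandard("Saas vs SaaS vs SAAS", "SaaS") would report the first and last spellings and skip the correctly cased one in the middle.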
Cary Swoveland On

Since PCRE is supported, you can use the regular expression

\b(?!OK)(?i)OK\b

This works because (?i) does not take effect until after the negative lookahead has been evaluated (something I was unaware of until I tried it), so the lookahead itself is still case-sensitive.


The regular expression can be broken down as follows.

\b      match a word boundary
(?!OK)  a negative lookahead asserts that the next two characters are not 'OK'
(?i)    match the remainder of the pattern with the case-insensitive flag set
OK      match the literal 'OK', 'ok', 'Ok' or 'oK'
\b      match a word boundary
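
The breakdown above relies on PCRE accepting (?i) in the middle of a pattern. As far as I know, Python's re module does not allow that (since Python 3.11 a global inline flag anywhere but the start of the pattern is an error), so the sketch below assumes Python and uses the scoped form (?i:...) instead, which limits the flag to one group and leaves the lookahead case-sensitive. The names NONSTANDARD_OK and nonstandard_ok are invented for this example.

```python
import re

# Assumption: Python's `re`, not PCRE. The lookahead stays case-sensitive
# because the ignore-case flag is scoped to the (?i:...) group only.
NONSTANDARD_OK = re.compile(r"\b(?!OK)(?i:OK)\b")

def nonstandard_ok(text):
    """Return every spelling of 'OK' in `text` except the exact form 'OK'."""
    # No capturing groups in the pattern, so findall returns whole matches.
    return NONSTANDARD_OK.findall(text)
```

The same scoped group is also accepted by PCRE, so the pattern should be portable between the two engines.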

Similarly, you could write

\b(?!SaaS)(?i)saas\b

which of course could instead be written

\b(?!SaaS)(?i)SaaS\b

or

\b(?!SaaS)(?i)sAAs\b

and so on.
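
Since the question mentions a long list of words, one option is to generate a single combined pattern instead of writing one per word. The following is a minimal sketch in Python, not part of the original answer: build_nonstandard_pattern is a hypothetical helper, and it uses the scoped group (?i:...) so that only the second half of the pattern ignores case while the lookahead stays case-sensitive.

```python
import re

def build_nonstandard_pattern(words):
    """Compile one pattern matching any of `words` in a non-standard casing."""
    # Longest words first, so the alternation tries longer spellings before shorter ones.
    ordered = sorted(words, key=len, reverse=True)
    alternation = "|".join(re.escape(w) for w in ordered)
    # The lookahead rejects the exact spellings (case-sensitively); the
    # scoped (?i:...) group then accepts any casing of the same words.
    return re.compile(rf"\b(?!(?:{alternation})\b)(?i:{alternation})\b")
```

For example, build_nonstandard_pattern(["OK", "SaaS"]).findall(text) would list every "ok", "Saas", "SAAS" and similar in text while ignoring the correctly cased "OK" and "SaaS".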
