RegEx question: standardization of medical terms


I need to detect words such as 'bot/hersen/levermetastase' and transform them into 'botmetastase, hersenmetastase, levermetastase'. Likewise, 'lever/botmetastase' should become 'levermetastase, botmetastase'.

So the pattern has to handle any number of slash-separated words before 'metastase'.

This is my solution but it doesn't work.

FILTERIN:

\b(\w)\s*[\/]\s*(\w)\s*(metastase)\b 

FILTEROUT:

$1metastase, $2metastase, $3metastase

Best answer

You may use

/?(\w+)(?=(?:/\w+)+metastase\b)/?

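Replace each match with $1metastase followed by a space (the replacement string ends with a space). This gives a space-separated list; for the comma-separated output from the question, end the replacement with a comma and a space instead.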
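As a rough illustration, the pattern above can be tried in Python. This is only a sketch: the function name expand_terms and its separator parameter are made up for this example, and Python writes the backreference as \1 or m.group(1) rather than $1.

```python
import re

# The pattern from the answer, unchanged.
PATTERN = re.compile(r"/?(\w+)(?=(?:/\w+)+metastase\b)/?")

def expand_terms(text, separator=" "):
    """Append 'metastase' to every slash-separated prefix word.

    `separator` is a hypothetical parameter for this sketch: " " mirrors the
    replacement suggested in the answer, ", " gives the comma-separated
    output asked for in the question.
    """
    return PATTERN.sub(lambda m: m.group(1) + "metastase" + separator, text)

print(expand_terms("bot/hersen/levermetastase"))
# botmetastase hersenmetastase levermetastase
print(expand_terms("lever/botmetastase", separator=", "))
# levermetastase, botmetastase
```

The last word is never matched, because nothing after it satisfies the lookahead, so it keeps its original 'metastase' suffix and gets no trailing separator.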

If there can be spaces around the slashes, use

/?\s*(\w+)(?=(?:\s*/\s*\w+)+metastase\b)(?:\s*/)?
/?\h*(\w+)(?=(?:\h*/\h*\w+)+metastase\b)(?:\h*/)?

where \h matches only horizontal whitespace (such as spaces and tabs), while \s matches any whitespace char, including line breaks. Note that \h is not available in every regex flavor.
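A sketch of the whitespace-tolerant variant in Python might look like the following. Python's re module does not support \h, so [ \t] is used here as a stand-in for horizontal whitespace; that substitution, the function name expand_spaced, and the final clean-up step are assumptions of this example, not part of the answer.

```python
import re

# [ \t] approximates \h (horizontal whitespace), which Python's re lacks.
PATTERN = re.compile(
    r"/?[ \t]*(\w+)(?=(?:[ \t]*/[ \t]*\w+)+metastase\b)(?:[ \t]*/)?"
)

def expand_spaced(text):
    """Expand 'a / b / cmetastase' style terms, tolerating blanks around '/'."""
    expanded = PATTERN.sub(r"\1metastase ", text)
    # The blank that preceded the last word is not consumed by any match,
    # so a double space can remain; collapse such runs into one space.
    return re.sub(r"[ \t]{2,}", " ", expanded)

print(expand_spaced("bot / hersen / levermetastase"))
# botmetastase hersenmetastase levermetastase
```

One caveat with this pattern: the leading whitespace part can also consume a blank that stands before the first term, so when the term is embedded in a longer sentence the result may need extra spacing fixes.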


Details

  • /? - an optional / char
  • (\w+) - Group 1: one or more word chars
  • (?=(?:/\w+)+metastase\b) - a positive lookahead: the word must be followed by
    • (?:/\w+)+ - one or more occurrences of / and then 1+ word chars
    • metastase\b - and then metastase at the end of a word (\b is a word boundary)
  • /? - an optional / char.
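To see the breakdown above in action, the matches can be listed one by one. The snippet below is a small Python sketch (the helper name show_matches is invented for this example) that prints the full match and Group 1 for each hit.

```python
import re

PATTERN = re.compile(r"/?(\w+)(?=(?:/\w+)+metastase\b)/?")

def show_matches(text):
    """Return (whole match, Group 1) pairs for every match in `text`."""
    return [(m.group(0), m.group(1)) for m in PATTERN.finditer(text)]

print(show_matches("bot/hersen/levermetastase"))
# [('bot/', 'bot'), ('hersen/', 'hersen')]
```

The lookahead only checks what follows and consumes nothing, which is why each match stops right after its own slash and the final word is left in place.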