Firstly, apologies; I'm fairly new to the world of RegEx.
Secondly (more of an FYI), I'm using an application that only has RegEx Replace functionality, therefore I'm potentially going to be limited on what can/can't be achieved.
THE CHALLENGE
I have a free text field (labelled Description) that primarily contains "useless" text. However, some records will contain one or more useful IDs, and I would like to extract those IDs.
Every ID will have the same three-letter prefix (APP) followed by a five-digit numeric value (e.g. 12911).
For example, I have the following string in my Description field:
APP00001Was APP00002TEST APP00003Blah blah APP00004 Apple APP11112OrANGE APP
THE JOURNEY
I've managed to very crudely put together an expression that is close to what I need (although, I actually need the reverse);
/!?APP\d{1,5}/g
Result;
THE STRUGGLE
However, on the Replace, I'm only able to retain the non-matched values;
Was TEST Blah blah Apple OrANGE APP
THE ENDGAME
I would like the output to be;
APP00001 APP00002 APP00003 APP00004 APP11112
Apologies once again if this is somewhat of a 'noddy' question; but any help would be much appreciated and all ideas welcome.
Many thanks in advance.
You could use an alternation (|) to either capture the ID pattern, starting at a word boundary, in group 1, or match 1+ word chars followed by optional whitespace chars. What you capture in group 1 can be used as the replacement. The other matches will not be in the replacement.
Using !? matches an optional exclamation mark. You could prepend that to the pattern, but it is not part of the example data. See a regex demo
In the replacement, use capture group 1, usually written as $1 or \1 depending on the application.
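As a minimal sketch of the idea above, here is how it might look in Python. The exact pattern is an assumption (the answer describes it in words only), and extract_ids is a hypothetical helper name. The \w* after the group is also an assumption: it lets text glued to an ID (such as Was in APP00001Was) be consumed by the same match, so the spaces between IDs are left in place.

```python
import re

# Assumed pattern, reconstructed from the description above:
#   \b(APP\d{5})\w*  capture an ID in group 1, then consume any word chars glued to it
#   \w+\s*           otherwise match any other word plus trailing whitespace
PATTERN = re.compile(r"\b(APP\d{5})\w*|\w+\s*")


def extract_ids(description: str) -> str:
    """Keep only the captured IDs; everything else that matches is dropped.

    When the second alternative matches, group 1 is unset and \\1 expands
    to an empty string (Python 3.5+).
    """
    # strip() removes the trailing space left after the last ID
    return PATTERN.sub(r"\1", description).strip()


text = ("APP00001Was APP00002TEST APP00003Blah blah "
        "APP00004 Apple APP11112OrANGE APP")
result = extract_ids(text)
print(result)
```

In a replace-only application, the equivalent would be the same pattern with $1 (or \1) as the replacement. There is no strip step there, so a trailing space may remain after the last ID.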