Regex leaves extra text at the start and end of the resulting string


I have a string of the form

^COBUYE\rtlch\fcs1 \af39\afs20 \ltrch\fcs0 \fs20\insrsid15797378 \hich\af39\dbch\af31505\loch\f39 R-NAME-FIRST COBUYER-NUMBER="1"\rtlch\fcs1 \af0\afs20 \ltrch\fcs0 \f0\fs20\insrsid10175635\charrsid8585274 \cell 

I want to turn this into ^COBUYER-NAME-FIRST COBUYER-NUMBER="1". To do that, I tried this regular expression (shown as a Java string literal):

\\b(?:\\\\[a-zA-Z]+(-?[0-9]+)? ?)+ ?\\b

I use the regex like this:

String tag = "^COBUYE\\rtlch\\fcs1 \\af39\\afs20 \\ltrch\\fcs0 \\fs20\\insrsid15797378 \\hich\\af39\\dbch\\af31505\\loch\\f39 R-NAME-FIRST COBUYER-NUMBER=\"1\"\\rtlch\\fcs1 \\af0\\afs20 \\ltrch\\fcs0 \\f0\\fs20\\insrsid10175635\\charrsid8585274 \\cell ";
//String tag = "\\rtlch\\fcs1^COBUYE\\rtlch\\fcs1 \\af39\\afs20 \\ltrch\\fcs0 \\fs20\\insrsid15797378 \\hich\\af39\\dbch\\af31505\\loch\\f39 R-NAME-FIRST COBUYER-NUMBER=\"1\"\\rtlch\\fcs1 \\af0\\afs20 \\ltrch\\fcs0 \\f0\\fs20\\insrsid10175635\\charrsid8585274 \\cell ";
String controlWordRegex = "\\b(?:\\\\[a-zA-Z]+(-?[0-9]+)? ?)+ ?\\b";
StringBuffer sb = new StringBuffer();
Pattern controlWordRegexPattern = Pattern.compile(controlWordRegex, Pattern.DOTALL);
Matcher controlWordRegexPatternMatcher = controlWordRegexPattern.matcher(tag);
while (controlWordRegexPatternMatcher.find()) {
    String rtfControlWord = controlWordRegexPatternMatcher.group();
    String replacementText = "";
    controlWordRegexPatternMatcher.appendReplacement(sb, replacementText);
}
controlWordRegexPatternMatcher.appendTail(sb);

On the first iteration of the while loop, rtfControlWord becomes

String rtfControlWord = \rtlch\fcs1 \af39\afs20 \ltrch\fcs0 \fs20\insrsid15797378 \hich\af39\dbch\af31505\loch\f39 

and sb becomes ^COBUYE

On the second iteration of the while loop, rtfControlWord becomes

String rtfControlWord = \fcs1 \af0\afs20 \ltrch\fcs0 \f0\fs20\insrsid10175635\charrsid8585274 \cell

and sb becomes ^COBUYER-NAME-FIRST COBUYER-NUMBER="1"\rtlch

Why does the second match start at \fcs1 instead of \rtlch\fcs1? \rtlch also starts with \. How can I change the regex so that I get only ^COBUYER-NAME-FIRST COBUYER-NUMBER="1" instead of ^COBUYER-NAME-FIRST COBUYER-NUMBER="1"\rtlch?

Similarly, with the second (commented-out) tag I get \rtlch^COBUYER-NAME-FIRST COBUYER-NUMBER="1"\rtlch. How can I correct this?
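
A possible explanation, offered as a sketch rather than a definitive answer: the leading \b only matches between a word character and a non-word character. In the first tag, the second \rtlch comes right after a double quote, and both " and \ are non-word characters, so there is no boundary there and the match cannot begin at that backslash. The first position where \b is followed by a backslash is after the "h" of rtlch, which is why the match starts at \fcs1. The same thing happens at the very start of the second tag. The trailing \b similarly stops the final space after \cell from being consumed. The version below drops both \b anchors. The class name RtfControlWordStripper and the method name strip are made up for illustration, and the code assumes that a single space after a control word is only a delimiter and may be removed.

```java
import java.util.regex.Pattern;

class RtfControlWordStripper {
    // Same control-word pattern as in the question, but without the two \b anchors.
    // Assumption: one optional space after a control word is its delimiter and can go.
    private static final Pattern CONTROL_WORDS =
            Pattern.compile("(?:\\\\[a-zA-Z]+(?:-?[0-9]+)? ?)+");

    // Removes every run of RTF control words from the given fragment.
    static String strip(String rtfFragment) {
        return CONTROL_WORDS.matcher(rtfFragment).replaceAll("");
    }

    public static void main(String[] args) {
        String tag = "^COBUYE\\rtlch\\fcs1 \\af39\\afs20 \\ltrch\\fcs0 \\fs20\\insrsid15797378 \\hich\\af39\\dbch\\af31505\\loch\\f39 R-NAME-FIRST COBUYER-NUMBER=\"1\"\\rtlch\\fcs1 \\af0\\afs20 \\ltrch\\fcs0 \\f0\\fs20\\insrsid10175635\\charrsid8585274 \\cell ";
        // prints: ^COBUYER-NAME-FIRST COBUYER-NUMBER="1"
        System.out.println(strip(tag));
    }
}
```

Since the replacement is always the empty string, replaceAll is enough here and the appendReplacement/appendTail loop is not needed. One caveat: this pattern treats any backslash followed by letters as a control word, so it does not handle escaped backslashes (\\) or other RTF constructs such as groups or \'hh hex escapes.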

Thanks
