I'm coding an Edifact Reader. An Edifact file consists of string lines like this:
string row = @"ABC+1+E522017332:101111757+MAX:MUSTERMANN:16890224+9'";
There is a set of rules that describes a valid line like this. The regex translation of these rules in this particular case looks like this:
Regex regex = new Regex(@"ABC\+\d{1}([A-Z0-9])?(\:\d{1})?\+[A-Z0-9]{1,12}\:[A-Z0-9]{9}\+[A-Z0-9]{0,45}\:[A-Z0-9]{0,45}\:\d{8}\+\d{1}(\d{4})?(\d{1})?([A-Z0-9]{1,7})?([A-Z0-9]{3})?([A-Z0-9]{15})?\'");
And it works just fine. But I also want to split this string according to the non-constant parts of the regex. The result should look like this:
ABC
1
null
null
E522017332
101111757
MAX
MUSTERMANN
16890224
9
null
null
null
null
null
How can I do it?
You only have to use capture groups (...) for all the pieces you need.
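A minimal sketch of the capture-group idea, assuming the pattern from the question with every variable piece wrapped in (...). The class name NumberedGroupsSketch and the method Split are made up for illustration, and the colon before the optional digit is put in a non-capturing group so it does not end up in the result:

```csharp
using System.Text.RegularExpressions;

public static class NumberedGroupsSketch
{
    // Same rules as the question's pattern; each piece is now a capture group.
    // The pattern is split over three verbatim strings only for readability.
    private static readonly Regex RowRegex = new Regex(
        @"(ABC)\+(\d)([A-Z0-9])?(?::(\d))?\+([A-Z0-9]{1,12}):([A-Z0-9]{9})\+" +
        @"([A-Z0-9]{0,45}):([A-Z0-9]{0,45}):(\d{8})\+" +
        @"(\d)(\d{4})?(\d)?([A-Z0-9]{1,7})?([A-Z0-9]{3})?([A-Z0-9]{15})?'");

    // Returns one entry per capture group, or null if the row does not match.
    public static string[] Split(string row)
    {
        Match match = RowRegex.Match(row);
        if (!match.Success) return null;

        // Groups[0] is the whole match, so the pieces start at index 1.
        var parts = new string[match.Groups.Count - 1];
        for (int i = 1; i < match.Groups.Count; i++)
        {
            Group g = match.Groups[i];
            // An optional group that did not take part in the match
            // has Success == false; report it as null.
            parts[i - 1] = g.Success ? g.Value : null;
        }
        return parts;
    }
}
```

Calling NumberedGroupsSketch.Split(row) with the row from the question should give 15 entries, with null for each optional piece that is absent.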
Groups are then numbered 1...many, but that is quite unreadable. You could give explicit names, possibly names better than abc, digit0, letterdigit0 :-) Names that explain what the digit/letter is!
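The named variant could be sketched like this. The group names (abc, digit0, letterdigit0, ...) are deliberately meaningless placeholders, and the class name NamedGroupsSketch is hypothetical; real code should use names taken from the Edifact rule set being implemented:

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class NamedGroupsSketch
{
    // Same pattern as in the question, with (?<name>...) instead of plain (...).
    // All names below are placeholders, not real Edifact field names.
    private static readonly Regex RowRegex = new Regex(
        @"(?<abc>ABC)\+(?<digit0>\d)(?<letterdigit0>[A-Z0-9])?(?::(?<digit1>\d))?\+" +
        @"(?<letterdigit1>[A-Z0-9]{1,12}):(?<letterdigit2>[A-Z0-9]{9})\+" +
        @"(?<letterdigit3>[A-Z0-9]{0,45}):(?<letterdigit4>[A-Z0-9]{0,45}):(?<digit2>\d{8})\+" +
        @"(?<digit3>\d)(?<digit4>\d{4})?(?<digit5>\d)?" +
        @"(?<letterdigit5>[A-Z0-9]{1,7})?(?<letterdigit6>[A-Z0-9]{3})?(?<letterdigit7>[A-Z0-9]{15})?'");

    // Returns group name -> value (null for absent optional pieces),
    // or null if the row does not match at all.
    public static Dictionary<string, string> Split(string row)
    {
        Match match = RowRegex.Match(row);
        if (!match.Success) return null;

        var parts = new Dictionary<string, string>();
        foreach (string name in RowRegex.GetGroupNames())
        {
            if (name == "0") continue; // group "0" is the whole match
            Group g = match.Groups[name];
            parts[name] = g.Success ? g.Value : null;
        }
        return parts;
    }
}
```

With names in place, a piece can be read as match.Groups["letterdigit1"].Value instead of match.Groups[5].Value, which stays correct if a group is later added or removed in front of it.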