What's the regex for a restricted number of numeric characters?


Having trouble figuring out a regex issue.

We are looking for 2 numbers then hyphen or space then 6 numbers. Must be only 6 numbers, so either an alpha character or some punctuation or space must follow the 6 numbers or the 6 numbers must be at the end of the string.

Other numbers are allowed elsewhere in the string, as long as they are separate.

So, these should match:

foo 12-123456 bar  
12-123456 bar  
foo 12-123456  
foo12-123456bar  
12-123456bar  
foo12-123456  
12-123456bar 99
foo12-123456 99 

These should not match:

123-12345 bar  
foo 12-1234567  
123-12345bar  
foo12-1234567  

Here's what we were using:

\D\d{2}[-|/\ ]\d{6}\D

and in Expresso this was fine.

But when running for real in our .NET application, this pattern failed to match examples where the 6 numbers were at the end of the string.

Tried this:

\D\d{2}[-|/\ ]\d{6}[\D|$]

and it still doesn't match

foo 12-123456

There are 3 best solutions below

Best answer

I would restate your pattern from

Must be only 6 numbers, so either an alpha character or some punctuation or space must follow the 6 numbers or the 6 numbers must be at the end of the string.

to

Must be only 6 numbers, so there must not be a number after the sixth number

and then use a negative look-ahead assertion to express this. Similarly, at the start of the pattern use a negative look-behind assertion to say that whatever is before the first two digits, it isn't a digit. Together:

var regex = new Regex(@"(?<!\d)\d{2}[- ]\d{6}(?!\d)");

var testCases = new[]
                    {
                        "foo 12-123456 bar",
                        "12-123456 bar",
                        "foo 12-123456",
                        "foo12-123456bar",
                        "12-123456bar",
                        "foo12-123456",
                        "123-12345 bar",
                        "foo 12-1234567",
                        "123-12345bar",
                        "foo12-1234567",
                    };

foreach (var testCase in testCases)
{
    Console.WriteLine("{0} {1}", regex.IsMatch(testCase), testCase);
}

This produces six Trues then four Falses, as required.

The assertions (?<!\d) and (?!\d) respectively say 'there isn't a digit just before here' and 'there isn't a digit just after here'.
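
To make the difference concrete, here is a minimal sketch that runs the question's original pattern next to the look-around version. The class and field names (LookaroundDemo, Consuming, Lookaround) are made up for illustration. The point it tries to show is that \D has to consume an actual character, so it cannot succeed at the very start or end of the string, whereas the look-around assertions are zero-width and only check that no digit is adjacent.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookaroundDemo
{
    // Pattern from the question: \D must consume a real character on each side.
    public static readonly Regex Consuming = new Regex(@"\D\d{2}[-|/\ ]\d{6}\D");

    // Look-around version: zero-width checks that no digit is adjacent.
    public static readonly Regex Lookaround = new Regex(@"(?<!\d)\d{2}[- ]\d{6}(?!\d)");

    public static void Main()
    {
        // Hypothetical sample inputs taken from the question's lists.
        string[] inputs = { "foo 12-123456", "12-123456 bar", "foo 12-1234567" };

        foreach (string input in inputs)
        {
            Match m = Lookaround.Match(input);
            Console.WriteLine(
                "{0}: consuming={1}, lookaround={2}, value='{3}'",
                input, Consuming.IsMatch(input), m.Success, m.Value);
        }
    }
}
```

With the look-around version, the match value is just the number itself (for example 12-123456), with no surrounding characters included, which is convenient if the number needs to be extracted rather than merely detected.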

Another answer

OK, based on your further edited question, the answer looks like this:

^(?:.*?\D+?)?(\d{2}[-|/\ ]\d{6})(?:\D+?.*?)?$

It matches the whole string and captures the number in group 1.

Update: added some code to test and show the matching.

MessageBox.Show(Regex.Replace(
  "1 foo 12-123456 bar\r\n12-123456 bar\r\nfo23o 12-123456\r\nfoo12-123456bar3\r\n" +
  "12-123456bar\r\nfoo12-123456\r\n\r\nThese should not match:\r\n\r\n" +
  "123-12345 bar\r\nfoo 12-1234567\r\n123-12345bar\r\nfoo12-1234567",
  @"^(?:.*?\D+?)?(\d{2}[-|/\ ]\d{6})(?:\D+?.*?)?$",
  @"[match, cap: '$1']",
  RegexOptions.Multiline
));

Another answer

This should do it:

(^|\D)\d{2}[- ]\d{6}($|\D)

It looks for either the beginning of the line or a non-digit, then your pattern (2 digits, a hyphen or space, 6 digits), then either the end of the line or another non-digit.

Edited and tested with Perl: it matches the first 8, not the next 4.

C# may have its own regex specifics. I'm not sure whether any changes are necessary, or which.
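
As a rough check of the C# side, the sketch below tries the same idea in .NET. Two things here are assumptions added for illustration: the class and method names (AnchorAlternationDemo, Extract), and the extra capture group around the number, which is there because the overall match also includes the neighbouring non-digit. Note that, unlike [\D|$] in the question, where $ inside a character class is just a literal dollar sign, the alternation ($|\D) keeps $ as an anchor. With default options, .NET treats ^ and $ as the start and end of the whole string (with $ also matching before a final newline); RegexOptions.Multiline would make them line-based.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AnchorAlternationDemo
{
    // Same shape as the pattern above; group 1 (added here) isolates the number.
    public static readonly Regex Pattern =
        new Regex(@"(?:^|\D)(\d{2}[- ]\d{6})(?:$|\D)");

    // Returns the captured number, or null when the input does not match.
    public static string Extract(string input)
    {
        Match m = Pattern.Match(input);
        return m.Success ? m.Groups[1].Value : null;
    }

    public static void Main()
    {
        // Sample inputs taken from the question's lists.
        string[] inputs =
        {
            "foo 12-123456", "12-123456bar 99", "foo12-123456 99",
            "123-12345 bar", "foo 12-1234567",
        };

        foreach (string input in inputs)
        {
            Console.WriteLine("{0} -> {1}", input, Extract(input) ?? "(no match)");
        }
    }
}
```

If only a yes/no answer is needed, Pattern.IsMatch(input) is enough and the capture group can be dropped.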