Regex: selecting end of line character of lines NOT containing 3 semicolons

I'm using the regex-enabled Search & Replace function in EditPad Lite. My document looks like this:

20-10-2011;foo1;foo2;foo3;foo4;foo5
19-10-2011;foo1;foo2;foo3;foo4;
18-10-2011;foo1;foo2;foo3;foo4
17-10-2011;foo1;foo2;foo3;foo4;foo5
16-10-2011;foo1;foo2;foo3;foo4;
15-10-2011;foo1;foo2;foo3;foo4

The problem: each line should contain 5 ; symbols, so lines 3 and 6, which have only 4, need an additional semicolon at the end of the line. I want to add it by replacing \n with ;\n. I've tried:

(?<!^.*;{3}.*$)\n

to select the end-of-line characters not preceded by a line containing 3 semicolons. This doesn't work, however, because (I think) ;{3} only matches consecutive semicolons, and mine are separated by other text. Is there an alternative for this?
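
To illustrate the suspicion above: ;{3} quantifies the single ; token, so it only matches three semicolons in a row. The sketch below uses Python's re module purely for illustration; EditPad's regex flavor differs, and Python would reject the variable-length lookbehind in the attempt. The sample line is taken from the question.

```python
import re

line = "18-10-2011;foo1;foo2;foo3;foo4"

# ;{3} requires three adjacent semicolons, so it finds nothing in this line.
consecutive = re.search(r";{3}", line)

# To count semicolons separated by other text, the filler between them
# has to be part of the repeated group.
separated = re.search(r"^(?:[^;]*;){4}[^;]*$", line)

print(consecutive is None)    # True
print(separated is not None)  # True
```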

There are 2 best solutions below

0
On BEST ANSWER
(^(?:[^;]+;){4}[^;]+$)

should match only lines 3 and 6.

Just replace the match with $1;

(  //start of group 1
  ^  //start of line (assuming ^ and $ match at line breaks)
    (?:  //start of a non-capturing group
      [^;]+;  //match one or more 'not semicolon' characters followed by a semicolon
    ){4} //end of the non-capturing group, which must match exactly 4 times
    [^;]+  //after those 4 matches there should be one or more 'not semicolon' characters
  $ //end of line
) //end of group 1
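
For readers who want to try the pattern outside EditPad, here is a minimal sketch in Python, under a few assumptions: re.MULTILINE stands in for the editor's "^ and $ match at line breaks" behavior, \1 is Python's spelling of $1, and the character class is tightened to [^;\n] (a change from the pattern above) so that a field can never run across a line break.

```python
import re

text = """20-10-2011;foo1;foo2;foo3;foo4;foo5
19-10-2011;foo1;foo2;foo3;foo4;
18-10-2011;foo1;foo2;foo3;foo4
17-10-2011;foo1;foo2;foo3;foo4;foo5
16-10-2011;foo1;foo2;foo3;foo4;
15-10-2011;foo1;foo2;foo3;foo4
"""

# re.MULTILINE makes ^ and $ match at every line boundary.
# [^;\n] (not in the original pattern) keeps each field on a single line.
pattern = re.compile(r"^((?:[^;\n]+;){4}[^;\n]+)$", re.MULTILINE)

# Re-emit the captured line followed by the missing semicolon.
fixed = pattern.sub(r"\1;", text)
print(fixed)
```

Only lines with exactly four semicolons and a non-empty last field match, so lines that already end in ; or carry a sixth field are left untouched.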
1
On

I'd use split and count the number of elements.

Here is a Perl way to do it:

#!/usr/local/bin/perl 
use strict;
use warnings;

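while(<DATA>) {
    chomp;
    # split drops trailing empty fields, so "a;b;" and "a;b" both give 2 elements
    my @l = split /;/;
    # 5 fields and no trailing semicolon means the line is one ';' short
    $_ .= ";" if @l == 5 && !/;$/;
    print "$_\n";
}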

__DATA__
20-10-2011;foo1;foo2;foo3;foo4;foo5
19-10-2011;foo1;foo2;foo3;foo4;
18-10-2011;foo1;foo2;foo3;foo4
17-10-2011;foo1;foo2;foo3;foo4;foo5
16-10-2011;foo1;foo2;foo3;foo4;
15-10-2011;foo1;foo2;foo3;foo4

output:

20-10-2011;foo1;foo2;foo3;foo4;foo5
19-10-2011;foo1;foo2;foo3;foo4;
18-10-2011;foo1;foo2;foo3;foo4;
17-10-2011;foo1;foo2;foo3;foo4;foo5
16-10-2011;foo1;foo2;foo3;foo4;
15-10-2011;foo1;foo2;foo3;foo4;