How do I modify elements in a Perl array inside a foreach loop?

My goal with this piece of code is to sanitize an array of elements (a list of URLs, some with special characters like %) so that I can eventually compare it to another file of URLs and output which ones match. The list of URLs comes from a .csv file whose first field is the URL that I want (along with some other entries that I skip over with a quick if() statement).

foreach my $var(@input_1) {
    #Skip anything that doesn't start with http:
    if ((/^[#U]/ ) || !(/^h/)) {
        next;
    }
    #Split the .csv into the relevant field:
    my @fields = split /\s?\|\s?/, $_;
    $var = uri_unescape($fields[0]);
}

My delimiter is a | in the csv. In its current setup, and also when I change the $_ to $var, it only returns blank lines. When I remove the $var declaration at the beginning of the loop and use $_, it outputs the URLs in the correct format. But in that case, how can I assign the output to the same element in the array? Would this require a second array to output the value to?

I'm relatively new to Perl, so I'm sure there is some stuff that I'm missing. I have no clue at this moment why declaring $var in the foreach line breaks the parsing of the @fields line, but removing it and using $_ doesn't. Reading the perlsyn documentation did not help as much as I would have liked. Any help appreciated!

Best answer

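/^h/ is not bound to anything, so the match happens against $_. Because your loop variable is $var, the loop never sets $_, so neither the matches nor the split on $_ see the current element. If you want to match $var, you have to bind it with =~ or !~ (and pass $var to split as well):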

if ($var =~ /^[#U]/ || $var !~ /^h/) {

The two matches joined by || can be combined into a single regular expression using alternation:

next if $var =~ /^(?: [#U] | [^h] | $ )/x;

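That is, the line is skipped if it starts with #, U, or any character other than h, or if it is empty. Since # and U are themselves characters other than h, the first alternative is redundant, and the test is equivalent to next unless $var =~ /^h/;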
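Putting the binding fix into the loop from the question, a minimal sketch could look like the following. The sample lines are made up, and unescape is a hypothetical stand-in for uri_unescape so that the snippet runs without URI::Escape installed; in real code you would keep uri_unescape. Note that the foreach variable is an alias for the array element, so assigning to $var changes @input_1 itself.

```perl
use strict;
use warnings;

# Hypothetical stand-in for URI::Escape::uri_unescape (assumption, so that
# the sketch needs no CPAN modules): decodes %XX escapes.
sub unescape {
    my ($s) = @_;
    $s =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $s;
}

# Made-up sample data in the question's "url | other fields" layout.
my @input_1 = (
    '# comment line',
    'URL | name',
    'http://example.com/a%20b | first',
    'http://example.com/c%2Fd|second',
);

foreach my $var (@input_1) {
    # Bind both matches to $var instead of relying on $_.
    next if $var =~ /^[#U]/ || $var !~ /^h/;

    # Split the current element, not $_.
    my @fields = split /\s?\|\s?/, $var;

    # $var aliases the array element, so this updates @input_1 in place.
    $var = unescape($fields[0]);
}

print "$_\n" for @input_1;
```

Skipped lines (the comment and the header here) stay in @input_1 unchanged, which may or may not be what you want.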

You can populate a new array with the results by using push:

push @results, $var;
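If you would rather end up with only the cleaned URLs, here is a sketch of the push variant under the same assumptions as before: the sample data is invented, and unescape is a hypothetical stand-in for uri_unescape. Lines that are skipped never reach @results, and @input_1 is left untouched.

```perl
use strict;
use warnings;

# Hypothetical stand-in for URI::Escape::uri_unescape (assumption).
sub unescape {
    my ($s) = @_;
    $s =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $s;
}

# Made-up sample data, including an empty line.
my @input_1 = (
    '# comment line',
    '',
    'http://example.com/a%20b | first',
    'http://example.com/c%2Fd|second',
);

my @results;
foreach my $var (@input_1) {
    # Skip lines starting with #, U, anything other than h, or empty lines.
    next if $var =~ /^(?: [#U] | [^h] | $ )/x;

    # Only the first field is needed.
    my ($url) = split /\s?\|\s?/, $var;

    # Collect the cleaned URL in a separate array.
    push @results, unescape($url);
}

print "$_\n" for @results;
```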

Also note that if your data can contain | quoted or escaped (or newlines etc.), you should use Text::CSV instead of split.
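As a rough illustration of the Text::CSV route, assuming the module is installed from CPAN and that fields containing | are wrapped in double quotes (both assumptions about your setup and data, as are the sample lines):

```perl
use strict;
use warnings;
use Text::CSV;

# sep_char sets the delimiter; allow_whitespace drops the blanks around it.
my $csv = Text::CSV->new({ sep_char => '|', binary => 1, allow_whitespace => 1 })
    or die "Cannot construct Text::CSV: " . Text::CSV->error_diag;

# Made-up sample lines; the second has a quoted field containing the delimiter.
my @lines = (
    'http://example.com/a | first',
    '"http://example.com/b|c" | second',
);

my @urls;
for my $line (@lines) {
    if ($csv->parse($line)) {
        # fields() returns the columns of the most recently parsed line.
        my @fields = $csv->fields;
        push @urls, $fields[0];
    }
    else {
        warn "Could not parse line: $line\n";
    }
}

print "$_\n" for @urls;
```

A plain split on | would cut the second URL in two, which is the case the parser is there to handle.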