I'm parsing a file. The first thing I do is concatenate the first three fields and prepend the result to each record. Then I want to scrub the data of any colons, single quotes, double quotes, or backslashes. Below is how I'm doing it, but is there a way to do it using the `$line` variable that would be more efficient?
```perl
# Read the lines one by one.
while (my $line = <$FH>) {
    # Split the fields, concatenate the first three fields,
    # and add it to the beginning of each line in the file.
    chomp($line);
    my @fields = split(/,/, $line);
    unshift @fields, join '_', @fields[0..2];

    # Scrub data of characters that cause scripting problems down the line.
    $_ =~ s/:/ /g  for @fields[0..39];
    $_ =~ s/\'/ /g for @fields[0..39];
    $_ =~ s/"/ /g  for @fields[0..39];
    $_ =~ s/\\/ /g for @fields[0..39];
}
```
I am certain I have seen a very similar question before, but my simple searches won't find it. What stands out here is adding a new field, computed from the original values, before all the rest. You've already described that step well in Perl terms, so the only step left is the removal of the rogue characters: single and double quotes, colons, and backslashes.
Your code seems to work fine. The only changes I would make:

- Use the default variable `$_` properly. I think this is what newcomers hate most about Perl, and then come to love most once they understand it.
- Use `tr///d` instead of `s///`. It may add a little speed, but above all it frees you from regex syntax when you just want to say which characters to delete and need something simpler.

I think this should do what you need.
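The answer's code block did not survive extraction, so here is a minimal sketch along the lines described above. It scrubs the whole `$line` in one `tr///d` pass before splitting, which also answers the "can I do it on `$line`?" part of the question. It reads from the built-in `DATA` handle purely to be self-contained; in the real script you would read from `$FH`:

```perl
#!/usr/bin/env perl
use strict;
use warnings;

while (my $line = <DATA>) {
    chomp $line;

    # Delete colons, single quotes, double quotes, and backslashes
    # from the whole line in a single pass.
    $line =~ tr/:'"\\//d;

    # Split, then prepend the first three fields joined with '_'.
    my @fields = split /,/, $line;
    unshift @fields, join '_', @fields[0..2];

    print join(',', @fields), "\n";
}

__DATA__
a:1,b'2,c"3,d\4
```

For the sample record above this prints `a1_b2_c3,a1,b2,c3,d4`. Note that `tr///d` deletes the characters outright, while the question's `s/.../ /g` replaced them with spaces; if you need the spaces, use `tr/:'"\\/ /` (a replacement list of a single space) instead.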