I have a Perl script that reads a simple .csv file like the one below:
"header1","header2","header3","header4"
"12","12-JUL-2012","Active","Processed"
"13","11-JUL-2012","In Process","Pending"
"32","10-JUL-2012","Active","Processed"
"24","08-JUL-2012","Active","Processed"
.....
The aim is to convert this .csv file to an .xml file, something like the one below:
<ORDERS>
  <LIST_G_ROWS>
    <G_ROWS>
      <header1>12</header1>
      <header2>12-JUL-2012</header2>
      <header3>Active</header3>
      <header4>Processed</header4>
    </G_ROWS>
    <G_ROWS>
      <header1>13</header1>
      <header2>11-JUL-2012</header2>
      <header3>In Process</header3>
      <header4>Pending</header4>
    </G_ROWS>
    ....
    ....
  </LIST_G_ROWS>
</ORDERS>
I know that XML::CSV is available on CPAN and would make my life easier, but I want to use the already installed XML::LibXML to create the XML instead of installing XML::CSV. I was able to read the CSV and create the XML file as above without any issues, but the elements in the XML come out in a random order, something like the output below. I need the order of the elements (child nodes) to match the .csv file as shown above, but I am not sure how to go about that. I am using a hash, and sort()ing the hash didn't solve the problem either.
<ORDERS>
  <LIST_G_ROWS>
    <G_ROWS>
      <header3>Active</header3>
      <header1>12</header1>
      <header4>Processed</header4>
      <header2>12-JUL-2012</header2>
    </G_ROWS>
    ......
and so on. Below is the snippet from my Perl code:
use XML::LibXML;
use strict;
my $outcsv="/path/to/data.csv";
my $xmlFile="/path/to/data.xml";
my $headers = 0;
my (@keys, @vals); # declared up front so the script compiles under "use strict"
my $doc = XML::LibXML::Document->new('1.0', 'UTF-8');
my $root = $doc->createElement("ORDERS");
my $list = $doc->createElement("LIST_G_ROWS");
$root->appendChild($list);
open(IN,"$outcsv") || die "can not open $outcsv: $!\n";
while(<IN>){
    chomp($_);
    if ($headers == 0)
    {
        $_ =~ s/^\"//g; #remove starting (")
        $_ =~ s/\"$//g; #remove trailing (")
        @keys = split(/\",\"/,$_); #split per ","
        s{^\s+|\s+$}{}g foreach @keys; #remove leading and trailing spaces from each field
        $headers = 1;
    }
    else{
        $_ =~ s/^\"//g; #remove starting (")
        $_ =~ s/\"$//g; #remove trailing (")
        @vals = split(/\",\"/,$_); #split per ","
        s{^\s+|\s+$}{}g foreach @vals; #remove leading and trailing spaces from each field
        my %tags = map {$keys[$_] => $vals[$_]} (0..@keys-1);
        my $row = $doc->createElement("G_ROWS");
        $list->appendChild($row);
        for my $name (keys %tags) {
            my $tag = $doc->createElement($name);
            my $value = $tags{$name};
            $tag->appendTextNode($value);
            $row->appendChild($tag);
        }
    }
}
close(IN);
$doc->setDocumentElement($root);
open(OUT,">$xmlFile") || die "can not open $xmlFile: $!\n";
print OUT $doc->toString();
close(OUT);
You could forget the %tags hash entirely. Instead, loop over the indices of @keys and read $keys[$i] and $vals[$i] directly. That way, the ordering of your keys is preserved. When a hash is used, the ordering is indeterminate.
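A minimal sketch of that index-based loop follows. It is an illustration, not the exact code from the original answer. The helper name build_row is hypothetical and the sample values are taken from the first data row in the question. In the script above, the body of this loop would stand in for the %tags line and the "for my $name (keys %tags)" loop.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper (not part of the original script): builds one
# <G_ROWS> element whose children appear in the same order as @$keys.
sub build_row {
    my ($doc, $keys, $vals) = @_;
    my $row = $doc->createElement("G_ROWS");
    # Walk the header array by index so column order is preserved.
    for my $i (0 .. $#{$keys}) {
        my $tag = $doc->createElement($keys->[$i]);
        # Assumption: a missing value becomes an empty element.
        $tag->appendTextNode(defined $vals->[$i] ? $vals->[$i] : '');
        $row->appendChild($tag);
    }
    return $row;
}

my $doc  = XML::LibXML::Document->new('1.0', 'UTF-8');
my @keys = ("header1", "header2", "header3", "header4");
my @vals = ("12", "12-JUL-2012", "Active", "Processed");

# Serialize the single row to show the child order.
print build_row($doc, \@keys, \@vals)->toString(), "\n";
```

Because the loop reads both arrays by position, the children come out in CSV column order regardless of how Perl happens to order hash keys internally.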