Here's my problem: I need to update an XML file using another XML file.
Data.xml:
<?xml version='1.0'?>
<employees>
<employee>
<employeenumber>V0000001</employeenumber>
<name>John Doe</name>
<age>43</age>
<sex>M</sex>
<department>Operations</department>
</employee>
<employee>
<employeenumber>V0000002</employeenumber>
<name>Jane Doe</name>
<age>35</age>
<sex>F</sex>
<department>Operations</department>
</employee>
<employee>
<employeenumber>V0000003</employeenumber>
<name>Jane Doe</name>
<age>35</age>
<sex>F</sex>
<department>Operations</department>
</employee>
<employee>
<employeenumber>V0000004</employeenumber>
<name>Jane Doe</name>
<age>35</age>
<sex>F</sex>
<department>Operations</department>
</employee>
<employee>
<employeenumber>V0000005</employeenumber>
<name>Jane Doe</name>
<age>35</age>
<sex>F</sex>
<department>Operations</department>
</employee>
</employees>
Data2.xml:
<?xml version='1.0'?>
<employees>
<employee>
<employeenumber>V0000002</employeenumber>
<name>Jane Doe</name>
<age>34</age>
<sex>F</sex>
<department>Management</department>
</employee>
<employee>
<employeenumber>V0000004</employeenumber>
<name>Jane Doe</name>
<age>34</age>
<sex>F</sex>
<department>Sales</department>
</employee>
</employees>
So I need to update Data.xml with the information from Data2.xml.
I've written the code below. It works, but it takes 6 hours to execute because Data.xml is rather large (250 MB).
use XML::Twig;
my %soi = ();
open(FILE,">out.txt");
my $diff= XML::Twig->new( pretty_print => 'indented',
twig_handlers =>
{ 'employees/employee' => \&stock_n_purge,}
)
->parsefile( 'data2.xml');
sub stock_n_purge
{
my( $diff, $elt)= @_;
$soi{$elt->first_child ("employeenumber")->text} = "1"; # remember this employeenumber in a hash
$diff->print(\*FILE);
printf "Found One";
$diff->purge;# frees the memory
}
my $full= XML::Twig->new( pretty_print => 'indented',
twig_handlers =>
{ 'employees/employee' => \&stock_n_purge2,}
)
->parsefile( 'data.xml');
sub stock_n_purge2
{
my( $diff2, $elt2)= @_;
$diff2->print(\*FILE) unless (exists( $soi{$elt2->first_child ("employeenumber")->text} ));
$diff2->purge; # frees the memory
}
close(FILE);
Since the employeenumber is unique, I write every element of data2.xml to a new file and store every employeenumber in a hash. Then I parse data.xml and write every element unless its employeenumber exists in the hash.
This method is not efficient at all. So instead of rewriting all of data.xml, I would like to delete from data.xml every element that exists in the hash (and thus in data2.xml), then append the elements from data2.xml to data.xml.
My problem is that I can't find a way to delete an element from an XML file using XML::Twig.
Does anybody have any ideas?
Thanks in advance,
Simon.
From a quick look at your code, it seems to me that you print both files many, many, many times. You print everything the twig currently holds for every element you find, when you do $diff->print. I haven't really debugged your code, but I suspect you want to use flush instead of print there. Try it and let us know if things improve.
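To make the flush suggestion concrete, here is a minimal, untested-on-your-data sketch of how the two passes might look. Note that it swaps in a slightly different technique from the one you describe: instead of deleting the matching elements and appending the new ones, it overwrites the fields of each matching employee in place while streaming. The name merge_updates and the output file argument are my own inventions for illustration, and I am assuming data2.xml is small enough to hold in memory.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical helper (not from the original post): stream $full_file to
# $out_file, overwriting the fields of every employee that also appears
# in $updates_file.
sub merge_updates {
    my ($updates_file, $full_file, $out_file) = @_;

    # Pass 1: the update file is assumed to be small, so keep its records
    # in memory, keyed by employeenumber.
    my %update;
    XML::Twig->new(
        twig_handlers => {
            'employees/employee' => sub {
                my ($twig, $elt) = @_;
                my $id = $elt->first_child_text('employeenumber');
                $update{$id} = { map { $_->tag => $_->text } $elt->children };
                $twig->purge;    # free the memory used so far
            },
        },
    )->parsefile($updates_file);

    open my $out, '>', $out_file or die "Cannot write $out_file: $!";

    # Pass 2: patch matching employees, then flush each element exactly once.
    my $full = XML::Twig->new(
        pretty_print  => 'indented',
        twig_handlers => {
            'employees/employee' => sub {
                my ($twig, $elt) = @_;
                my $new = $update{ $elt->first_child_text('employeenumber') };
                if ($new) {
                    for my $child ($elt->children) {
                        $child->set_text($new->{ $child->tag })
                            if exists $new->{ $child->tag };
                    }
                }
                $twig->flush($out);    # print what is parsed so far, then free it
            },
        },
    );
    $full->parsefile($full_file);
    $full->flush($out);    # emit whatever is left, including </employees>
    close $out or die "Cannot close $out_file: $!";
    return;
}

# Example call (file names are placeholders):
# merge_updates('data2.xml', 'data.xml', 'merged.xml');
```

With this layout each employee is written once, so the run time should grow with the size of data.xml rather than with the number of handler calls times the size of the output. One limitation of the sketch: employees that appear only in data2.xml are silently ignored, so you would need an extra step if new employees must be added.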