How to print a result from multiple root elements in XML using Perl


I'm starting out with Perl and doing all the bad things you can do with the language. This program uses XML::Simple and regular expressions, which the internet says not to do unless you hate yourself.

OK, the plan is to read an XML file from a website and list the packages for CentOS 6.6 that need to be updated. For those unfamiliar with CentOS or the Steve Meier errata XML: it is not nicely arranged. Each entry is an element named after its CEBA number, as shown below.

Example after Parsing

<opt>
    <CEBA-2005--169 description="Not available" from="[email protected]" issue_date="2005-04-07 01:27:35" notes="Not available" product="CentOS Linux" references="http://rhn.redhat.com/errata/RHBA-2005-169.html http://lists.centos.org/pipermail/centos-announce/2005-April/011555.html" release="1" solution="Not available" synopsis="CentOS and up2date - bugfix update" topic="Not available" type="Bug Fix Advisory">
        <os_arch>i386</os_arch>
        <os_arch>x86_64</os_arch>
        <os_release>4</os_release>
        <packages>up2date-4.4.5.6-2.centos4.i386.rpm</packages>
        <packages>up2date-4.4.5.6-2.centos4.src.rpm</packages>
    </CEBA-2005--169>
    <CEBA-2005--842 description="Not available" from="[email protected]" issue_date="2005-11-18 17:52:49" multirelease="1" notes="Not available" product="CentOS Linux" references="https://rhn.redhat.com/errata/RHBA-2005-842.html http://lists.centos.org/pipermail/centos-announce/2005-November/012437.html http://lists.centos.org/pipermail/centos-announce/2005-November/012438.html" release="2" solution="Not available" synopsis="Important CentOS shadow-utils - bugfix update" topic="Not available" type="Bug Fix Advisory">
        <os_arch>i386</os_arch>
        <os_arch>x86_64</os_arch>
        <os_release>4</os_release> 
        <packages>shadow-utils-4.0.3-58.RHEL4.i386.rpm</packages>
    </CEBA-2005--842>

As you can see, the name of the element holding each entry changes from one entry to the next, so I had to use regular expressions to "read" through the file. But when I run my program, it does not print a result. The problem may be with the regular expressions used or with how the element search is written. I am not 100% sure where the problem is, and any help is appreciated.

Program

# Script to parse XML file to show updates.

use strict;
use XML::Simple;
use Data::Dumper;
use LWP::Simple;

my $parser = new XML::Simple;

my $url = 'http://cefs.steve-meier.de/errata.latest.xml';
my $content = get $url or die "Unable to get $url \n";
my $list = $parser->XMLin ($content);
my $CEBA = '(CEBA-([\d]+)--([\d]+))';

foreach my $CEBA (@{$list->{/(CEBA-([\d]+)--([\d]+))/}}) {
     if )$CEBS->{os_release eq '6') {
           print $CEBA->{packages} /. "\n";
     }
}

Best answer

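What you're doing in your foreach there isn't Perl: a regex can't be used as a hash subscript to pick out matching keys, so you have to loop over the keys and test each one. You're also missing some punctuation in the if. Something like this should work: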

#!/usr/bin/perl

use warnings;
use 5.010;

use XML::Simple;
use Data::Dumper;
use LWP::Simple;

# "indirect object" notation (new XML::Simple) is frowned upon
my $parser = XML::Simple->new;

# used this for testing so i wouldn't have to download
# the file for every run.
# local $/ enables slurp mode only inside this block.
my $content = do { open my $fh, '<', 'errata.latest.xml' or die "Cannot open errata.latest.xml: $!"; local $/; <$fh> };
# my $url = 'http://cefs.steve-meier.de/errata.latest.xml';
# my $content = get $url or die "Unable to get $url \n";

my $list = $parser->XMLin($content);
# print Dumper($list);

for my $CEBA (keys %$list) {
    # if the key doesn't match what you want
    # and os_release != 6, then skip to the
    # next entry.
    next unless $CEBA =~ /\ACEBA-\d+--\d+\z/
            and $list->{$CEBA}{os_release} == 6;

    say for @{ $list->{$CEBA}{packages} };
    ## ^-- essentially the same as --v
    # for my $pkg (@{ $list->{$CEBA}{packages} }) {
    #     print "$pkg\n";
    # }
}

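That said, even the XML::Simple documentation discourages using it in new code. You'll probably run into problems with this when a CEBA entry has only one packages tag in it: by default XMLin then gives you a plain string instead of an array reference, so the @{ ... } dereference no longer does what you want.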
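
One way to guard against the single-element case is to normalize every value to a list before using it. The following is only a sketch using core Perl: the hash is a hand-built stand-in for what XMLin would return for the two entries in the question, and as_list and packages_for_release are hypothetical helper names, not part of XML::Simple. It uses release 4 because that is what the excerpt contains.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hand-built stand-in (an assumption) for XMLin() output with default
# options: a repeated element becomes an array ref, a single one a string.
my $list = {
    'CEBA-2005--169' => {
        os_release => '4',
        packages   => [
            'up2date-4.4.5.6-2.centos4.i386.rpm',
            'up2date-4.4.5.6-2.centos4.src.rpm',
        ],
    },
    'CEBA-2005--842' => {
        os_release => '4',
        packages   => 'shadow-utils-4.0.3-58.RHEL4.i386.rpm',
    },
};

# Hypothetical helper: return a list whether the value is missing,
# a plain string, or an array reference.
sub as_list {
    my ($value) = @_;
    return () unless defined $value;
    return ref($value) eq 'ARRAY' ? @$value : ($value);
}

# Hypothetical helper: collect package names for one OS release.
sub packages_for_release {
    my ($data, $release) = @_;
    my @found;
    for my $id (sort keys %$data) {
        # Skip keys that are not CEBA entries.
        next unless $id =~ /\ACEBA-\d+--\d+\z/;
        my $entry = $data->{$id};
        next unless ref($entry) eq 'HASH';
        # os_release may also be a single string or a list.
        next unless grep { $_ eq $release } as_list($entry->{os_release});
        push @found, as_list($entry->{packages});
    }
    return @found;
}

print "$_\n" for packages_for_release($list, '4');
```

With the real file, passing ForceArray => ['packages', 'os_release'] to XMLin should make those two elements array references every time, which would make as_list unnecessary for them. Keeping the helper costs little, though, and it also covers entries where an element is missing altogether.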