How to print an empty XML element in perl


I occasionally have to write simple perl scripts to export data from XML files into CSV files for loading into a database.

I am encountering a problem "print"ing an element that has no value. Instead of just printing nothing, it prints the string "HASH(0x1ca05f8)" (or its siblings).

How do I stop it from doing this?

Below is the code that I am using, and the data that I am using. Thanks, --sw

#use module
use XML::Simple;
use Data::Dumper;

#create object
$xml = XML::Simple->new;

#read XML file
$data = $xml->XMLin("$ARGV[0]", ForceArray=>1);

foreach $pr (@{$data->{product}}) {
  foreach $rv (@{$pr->{reviews}}) {
    foreach $fr (@{$rv->{fullreview}}) {
      print "$ARGV[1]", ",";
      print "$ARGV[2]", ",";
      print "$ARGV[3]", ",";
      print "$ARGV[4]", ",";
      print $pr->{"pageid"}->[0], ",";
      print $fr->{"status"}->[0], ",";
      print $fr->{"source"}->[0], ",";
      print $fr->{"createddate"}->[0], ",";
      print $fr->{"overallrating"}->[0], ",";
      print $fr->{"email_address_from_user"}->[0], ",";
      foreach $csg (@{$fr->{confirmstatusgroup}}) {
        print join(";", @{$csg->{"confirmstatus"}});
      }
      print "\n";
    }
  }
}


<?xml version="1.0" encoding="UTF-8"?>
<products xmlns:xsi="">
<product xsi:type="ProductWithReviews" locale="en_US">
<confirmstatus>Verified Purchaser</confirmstatus>
<confirmstatus>Verified Reviewer</confirmstatus>

The output this creates:

,,,,bshnbat612,Approved,email,2014-03-28,5,HASH(0xe9fee8),Verified Purchaser;Verified Reviewer

In response to a suggestion made below, here is the Dumper output:

$VAR1 = {
    'xmlns:xsi' => '',
    'product' => [
        'xsi:type' => 'ProductWithReviews',
        'reviews' => [
            'fullreview' => [
                'source' => [
                'email_address_from_user' => [
                    'overallrating' => [
                    'confirmstatusgroup' => [
                        'confirmstatus' => [
                            'Verified Purchaser',
                        'Verified Reviewer'
                        'status' => [
                        'createddate' => [
            'pageid' => [
            'locale' => 'en_US'

There are 2 best solutions below


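Take a look at the SuppressEmpty option that can be passed to XML::Simple. By default, XML::Simple represents an empty element as an empty hash reference, and printing that reference is what produces the HASH(0x...) string. With SuppressEmpty set to a true value, empty elements are skipped altogether. By calling XMLin("$ARGV[0]", ForceArray=>1, SuppressEmpty=>1); your output should be:

,,,,bshnbat612,Approved,email,2014-03-28,5,,Verified Purchaser;Verified Reviewer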
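To make the effect of SuppressEmpty concrete, here is a minimal sketch. It assumes XML::Simple is installed and uses a made-up one-record document (the review, rating and email element names are illustrative only, not the asker's data). It compares the default behaviour with the two documented settings: a true value, which skips the empty element, and the empty string, which keeps the key with an empty-string value.

```perl
use strict;
use warnings;
use XML::Simple qw(XMLin);

# Made-up record for illustration; <email/> is the empty element.
my $xml = '<review><rating>5</rating><email/></review>';

# Default: the empty element becomes an empty hash reference.
my $default = XMLin( $xml, ForceArray => 1 );

# SuppressEmpty => 1: the empty element is skipped altogether.
my $skipped = XMLin( $xml, ForceArray => 1, SuppressEmpty => 1 );

# SuppressEmpty => '': the empty element becomes an empty string.
my $blank = XMLin( $xml, ForceArray => 1, SuppressEmpty => '' );

print ref( $default->{email}[0] ), "\n";                          # HASH
print( ( exists $skipped->{email} ? "present" : "absent" ), "\n" );
print "[", $blank->{email}[0], "]\n";
```

If the script runs under "use warnings", SuppressEmpty => '' may be the more comfortable choice, since the field is still defined when it is printed; with SuppressEmpty => 1 the key is missing and printing it yields an undefined value.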


OK, there's a big hint in the XML::Simple documentation:

The use of this module in new code is discouraged. Other modules are available which provide more straightforward and consistent interfaces. In particular, XML::LibXML is highly recommended.

Personally though, I like XML::Twig:


use strict;
use warnings;

use XML::Twig;

sub print_full_review {
    my ( $twig, $full_review ) = @_;
    my $pageid =
        $twig->root->get_xpath( '/products/product/pageid', 0 )->text;

    # One CSV line per fullreview; an empty element simply yields ''.
    print join(
        ",",
        @ARGV[ 1 .. 4 ],
        $pageid,
        ( map { $full_review->first_child_text($_) }
              qw(status source createddate overallrating email_address_from_user) ),
        join( ";",
            map { $_->text }
                $full_review->first_child('confirmstatusgroup')->children() )
    ), "\n";
}

my $twig = XML::Twig->new(
    'pretty_print'  => 'indented_a',
    'twig_handlers' => { 'fullreview' => \&print_full_review }
);
$twig->parsefile( $ARGV[0] );

The handler 'print_full_review' is triggered each time the parser finishes a fullreview element (at any level in the tree; you can be more specific by setting it to process /products/product/reviews/fullreview if that's a problem).

This handler is passed the fullreview element for processing.

And from it we extract the values you seek.

join( ";",
    map { $_->text }
        $full_review->first_child('confirmstatusgroup')->children() )

Is a slightly more complicated way of doing the following (except that the loop also leaves a trailing ";"):

my $confirmstatusgroup = $full_review->first_child('confirmstatusgroup');
foreach my $confirmstatus ( $confirmstatusgroup->children ) {
    print $confirmstatus->text, ";";
}

The code above produces your desired output without needing any 'SuppressEmpty' fudges like you would with XML::Simple.
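As a rough illustration of the more specific handler path mentioned earlier, the sketch below registers the handler under the full path from the root instead of the bare element name. It assumes XML::Twig is installed. The inline document is a hand-made stand-in whose element names follow the question's Dumper output; its exact nesting is an assumption, and only three of the fields are printed to keep it short.

```perl
use strict;
use warnings;
use XML::Twig;

# Hand-made stand-in document (assumed structure, not the asker's file).
my $xml = <<'XML';
<products>
  <product locale="en_US">
    <pageid>bshnbat612</pageid>
    <reviews>
      <fullreview>
        <status>Approved</status>
        <email_address_from_user/>
        <confirmstatusgroup>
          <confirmstatus>Verified Purchaser</confirmstatus>
          <confirmstatus>Verified Reviewer</confirmstatus>
        </confirmstatusgroup>
      </fullreview>
    </reviews>
  </product>
</products>
XML

my @rows;    # collected CSV lines

my $twig = XML::Twig->new(
    twig_handlers => {
        # Full path from the root: only a fullreview inside
        # products/product/reviews triggers this handler.
        '/products/product/reviews/fullreview' => sub {
            my ( $t, $fr ) = @_;

            # Guard against a review without a confirmstatusgroup.
            my $group    = $fr->first_child('confirmstatusgroup');
            my @statuses = $group ? $group->children('confirmstatus') : ();

            push @rows, join( ',',
                $fr->first_child_text('status'),
                $fr->first_child_text('email_address_from_user'),
                join( ';', map { $_->text } @statuses ) );
        },
    },
);
$twig->parse($xml);

print "$_\n" for @rows;
```

The empty email_address_from_user element contributes an empty field between the two commas, because first_child_text returns an empty string for it rather than a reference.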