Strange behavior with tagsoup and Groovy's XmlSlurper

846 Views Asked by AudioBubble At 27 January 2011 at 02:44

Let's say I want to parse the phone number from an XML string like this:

str = """ <root> 
            <address>123 New York, NY 10019
                <div class="phone"> (212) 212-0001</div> 
            </address> 
        </root> 
    """
parser = new XmlSlurper(new org.ccil.cowan.tagsoup.Parser()).parseText (str)
println parser.address.div.text()

It doesn't print the phone number.

If I change the "div" element to "foo" like this

str = """ <root> 
            <address>123 New York, NY 10019
                <foo class="phone"> (212) 212-0001</foo> 
            </address> 
        </root> 
    """
parser = new XmlSlurper(new org.ccil.cowan.tagsoup.Parser()).parseText (str)
println parser.address.foo.text()

Then it's able to parse and print the phone number.

What the heck is going on?

Btw I am using groovy 1.7.5 and tagsoup 1.2


There are 3 best solutions below

oiavorskyi On 01 February 2011 at 20:24

Just change code to

println parser.address.'div'.text()

This is a curse of Groovy and many other dynamic languages: "div" is a reserved method name, so instead of getting the node you end up trying to divide the "address" node :)

winstaan74 On 01 August 2011 at 13:15

I seem to recall that tagsoup normalizes HTML tags - i.e. it uppercases them. So the GPath expression you want is probably

println parser.ADDRESS.DIV.text()

I find it handy to be able to print out the result of the parse, because then you can see why your GPath isn't working. Use this:

println groovy.xml.XmlUtil.serialize(parser)
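
As a rough illustration of that debugging step, the sketch below serializes a parsed document and then queries it. It deliberately uses the plain XmlSlurper instead of tagsoup so that it runs without extra jars; that substitution and the groovy.xml.XmlSlurper import (Groovy 3 or later) are assumptions, and the tree tagsoup builds from the same input may be shaped differently.

```groovy
// Assumes Groovy 3+; on Groovy 2.x drop this import (groovy.util.XmlSlurper is imported by default).
import groovy.xml.XmlSlurper
import groovy.xml.XmlUtil

def str = '''<root>
  <address>123 New York, NY 10019
    <div class="phone"> (212) 212-0001</div>
  </address>
</root>'''

// Plain XML parse (no tagsoup), so the element names stay exactly as written.
def root = new XmlSlurper().parseText(str)

// Dump the parsed tree to see which element names and nesting a GPath must follow.
def dump = XmlUtil.serialize(root)
println dump

// Quoted property access for the "div" child, as suggested in the first answer.
def phone = root.address.'div'.text().trim()
println phone
```

Comparing the dump with the GPath expression is usually the quickest way to spot a mismatch in element names or nesting.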

DataScientYst On 22 July 2016 at 12:46

I know this question is very old, but I ran into the same problem recently and this is what I used:

parser.'**'.findAll { it.name() == 'div' && it.@class.text() == 'phone' }.each { div ->
    println div.text()
}

Use depthFirst ('**') to visit all tags.
Filter by name div with class phone.
Print the value: (212) 212-0001
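
The steps above can be sketched as a self-contained script. As an assumption for illustration, it parses with the plain XmlSlurper rather than tagsoup (so no extra jar is needed) and imports groovy.xml.XmlSlurper, which requires Groovy 3 or later; the variable names are made up for the example.

```groovy
// Assumes Groovy 3+; on Groovy 2.x drop this import (groovy.util.XmlSlurper is imported by default).
import groovy.xml.XmlSlurper

def str = '''<root>
  <address>123 New York, NY 10019
    <div class="phone"> (212) 212-0001</div>
  </address>
</root>'''

def root = new XmlSlurper().parseText(str)

// '**' is shorthand for depthFirst(): it walks every node regardless of nesting,
// so the match does not depend on where the div ends up in the tree.
def phones = root.'**'
        .findAll { it.name() == 'div' && it.@class.text() == 'phone' }
        .collect { it.text().trim() }

phones.each { println it }
```

Because the search ignores the path to the element, it should keep working even if the parser rearranges or wraps the surrounding tags.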

Groovy version is 2.4
