How to get all leaf elements from a REXML element and save them into an array?

I have a Ruby REXML element like the one below:

<a_1>
  <Tests>
    <test enabled='1'>trans </test>
    <test enabled='1'>ac </test>
    <test enabled='1'>dc </test>
  </Tests>
  <Corners>
    <corner enabled='0'>default</corner>
    <corner enabled='1'>C0 </corner>
  </Corners>
</a_1>

I want to find all leaf elements, so the result should be:

<test enabled='1'>trans </test>
<test enabled='1'>ac </test>
<test enabled='1'>dc </test>
<corner enabled='0'>default</corner>
<corner enabled='1'>C0 </corner>

My code is:

require 'rexml/document' 
include  REXML

def getAllLeaf(xmlElement)
  if xmlElement.has_elements?
    xmlElement.elements.each {|e| 
      getAllLeaf(e)
    }
  else
    return xmlElement
  end
end

It works fine and did show the right outputs on screen. However, I ran into a problem when I tried to save the result to an Array from this recursive procedure. So I wonder if there is a way to save this output to a single array that can be used afterwards?

I eventually worked out a recursive way to do it. It is a little odd, but I would like to share it:

def getAllLeaf(eTop,aTemp=Element.new("LeafElements"))
  if eTop.has_elements?
    eTop.elements.each {|e| 
      getAllLeaf(e,aTemp)
    }
  else
    aTemp<< eTop.dup
  end
  return aTemp
end

There are 1 best solutions below

On BEST ANSWER

It works fine and did show the right outputs on screen.

In fact, the code prints nothing anywhere. In any case, your recursive function doesn't work, which you can see if you call your method on the element <Tests> when <Tests> looks like this:

  <Tests>
    <test enabled='1'>
      <HELLO>world</HELLO>
    </test>
    <test enabled='1'>ac </test>
    <test enabled='1'>dc </test>
  </Tests>

Your recursive method doesn't work because when you write:

xmlElement.elements.each {|e|

the each() method returns the thing on its left, i.e. xmlElement.elements. Given your xml, your recursive method is equivalent to:

def getAllLeaf(xmlElement)
    xmlElement.elements.each {|e| 
      "blah"  #your code here has no effect on what each() returns.
    }
end

...which is equivalent to:

def getAllLeaf(xmlElement)
    return xmlElement.elements
end
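
If you do want to keep the recursion, one way to make it return something useful is to pass an accumulator array down through the calls and return that array at the end. The following is only a sketch: collect_leaves is a made-up name, and the sample XML is the modified <Tests> element shown earlier.

```ruby
require "rexml/document"

# Hypothetical helper: returns an Array holding every element under
# `element` that has no child elements.
def collect_leaves(element, leaves = [])
  if element.has_elements?
    # Ignore the return value of each(); the results go into `leaves`.
    element.elements.each { |child| collect_leaves(child, leaves) }
  else
    leaves << element
  end
  leaves
end

xml = <<~XML
  <Tests>
    <test enabled='1'>
      <HELLO>world</HELLO>
    </test>
    <test enabled='1'>ac </test>
    <test enabled='1'>dc </test>
  </Tests>
XML

root = REXML::Document.new(xml).root
leaves = collect_leaves(root)

# Print each leaf element as XML text.
leaves.each { |leaf| puts leaf }
```

Because the array is returned explicitly on the last line of the method, it no longer matters what each() returns.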

Do you want to stick with recursion? It's much simpler to search all the elements for the elements with no children:

require "rexml/document"
include REXML

xml = <<'END_OF_XML'
<a_1>
  <Tests>
    <test enabled='1'>trans </test>
    <test enabled='1'>ac </test>
    <test enabled='1'>dc </test>
  </Tests>
  <Corners>
    <corner enabled='0'>default</corner>
    <corner enabled='1'>C0 </corner>
  </Corners>
</a_1>
END_OF_XML

doc = Document.new xml
root = doc.root

XPath.each(root, "//*") do |element|
  if not element.has_elements?
    enabled = element.attributes['enabled'] 
    text = element.text
    puts "#{enabled} ... #{text}"
  end
end

--output:--
1 ... trans 
1 ... ac 
1 ... dc 
0 ... default
1 ... C0 

Or, if the leaf elements are the only elements with the attribute "enabled", you can do this:

XPath.each(root, "//*[@enabled]") do |element|
  enabled = element.attributes['enabled'] 
  text = element.text
  puts "#{enabled} ... #{text}"
end

There's even a cryptic xpath that will directly select elements without element children:

XPath.each(root, "//*[not(*)]") do |element|
  enabled = element.attributes['enabled'] 
  text = element.text
  puts "#{enabled} ... #{text}"
end
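
Since the question asks for an array rather than printed output, XPath.match may be a better fit than XPath.each, because it returns the matching nodes as an array. A minimal sketch, assuming the same XML as in the question (the variable name leaves is just for illustration):

```ruby
require "rexml/document"

xml = <<~XML
  <a_1>
    <Tests>
      <test enabled='1'>trans </test>
      <test enabled='1'>ac </test>
      <test enabled='1'>dc </test>
    </Tests>
    <Corners>
      <corner enabled='0'>default</corner>
      <corner enabled='1'>C0 </corner>
    </Corners>
  </a_1>
XML

root = REXML::Document.new(xml).root

# Select every element that has no element children and keep the
# result as an array for later use.
leaves = REXML::XPath.match(root, "//*[not(*)]")

leaves.each do |element|
  puts "#{element.attributes['enabled']} ... #{element.text}"
end
```

The elements in leaves are still attached to the original document, so copy them with dup or convert them with to_s if you need detached versions.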

Also, have you considered using the Nokogiri gem? It's pretty much Ruby's de facto standard XML/HTML parser.