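;; Setup not shown in the original post; assumed to be along these lines.
(require '[clojure.xml :as xml]
         '[clojure.zip :as zip]
         '[clojure.data.zip.xml :as zip-xml])

(defn zip-str
  "Parses an XML string into a zipper (assumed definition)."
  [s]
  (zip/xml-zip (xml/parse (java.io.ByteArrayInputStream. (.getBytes s)))))

(def testxml2
  "<top>
     <group>
       <group>
         <item>
           <number>1</number>
         </item>
         <item>
           <number>2</number>
         </item>
         <item>
           <number>3</number>
         </item>
       </group>
       <item>
         <number>0</number>
       </item>
     </group>
   </top>")

(def txml2 (zip-str testxml2))

(defn deep-items [x]
  (zip-xml/xml-> x
                 :top
                 :group
                 :group
                 :item))

(count (deep-items txml2))
;; 1

(zip-xml/text (first (deep-items txml2)))
;; "0"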
I'm trying to get the value of the inner :group, but it seems to be getting caught on the outer one. It seems to be ignoring the second :group.
The actual XML I'm trying to parse has a repeated nested <TheirTag><TheirTag>Foo</TheirTag></TheirTag> pattern, and I need to access each Foo individually. The XML comes from a third party, so I can't restructure it to avoid this.
You can solve this using the Tupelo Forest library, which processes tree-like data structures. Besides explicit searching, it can also use wildcards like zsh. Documentation is ongoing, but this will give you a taste of what you can do.
The part you really care about is the search. There are 2 ways to search for nested nodes. The first spells out the explicit path from the root. The second uses a wildcard :** like zsh, which matches zero or more directories (here, zero or more levels of nesting).
For case (1), we found only items 1, 2, and 3.
For case (2), we found not only the doubly-nested items, but also the singly-nested item 0.
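To make the difference between the two searches concrete, here is a minimal plain-Clojure sketch that uses only clojure.xml. It is not Tupelo.Forest's API or implementation: the helper names (parse-str, children-tagged, find-path, find-suffix, number-text) are hypothetical, and find-suffix only approximates a leading :** wildcard by trying the remaining path from every element in the tree.

```clojure
(require '[clojure.xml :as xml])

(def testxml2
  "<top><group>
     <group>
       <item><number>1</number></item>
       <item><number>2</number></item>
       <item><number>3</number></item>
     </group>
     <item><number>0</number></item>
   </group></top>")

(defn parse-str
  "Parses an XML string into clojure.xml's nested-map form (hypothetical helper)."
  [s]
  (xml/parse (java.io.ByteArrayInputStream. (.getBytes s "UTF-8"))))

(defn children-tagged
  "Element children of `node` whose tag is `tag`."
  [node tag]
  (filter #(and (map? %) (= tag (:tag %))) (:content node)))

(defn find-path
  "Explicit search: follows `tags` downward. The first tag must match `node` itself."
  [node [t & more]]
  (if (= t (:tag node))
    (reduce (fn [nodes tag] (mapcat #(children-tagged % tag) nodes))
            [node]
            more)
    ()))

(defn find-suffix
  "Wildcard-style search: tries `tags` starting at every element in the tree,
  so any number of enclosing levels may precede the match."
  [root tags]
  (mapcat #(find-path % tags)
          (filter map? (tree-seq map? :content root))))

(defn number-text
  "Text of the <number> child of an <item> element."
  [item]
  (-> (children-tagged item :number) first :content first))

(def root (parse-str testxml2))

;; Case (1): explicit path from the root
(map number-text (find-path root [:top :group :group :item]))
;; => ("1" "2" "3")

;; Case (2): any nesting depth, as long as the path ends in group/item
(map number-text (find-suffix root [:group :item]))
;; => ("0" "1" "2" "3")
```

The explicit path reaches only the items under the inner group, while the suffix search also picks up item 0, because the outer group is itself a group with an item child.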
You didn't specify what downstream processing you needed. Tupelo.Forest is able to convert output into both hiccup and enlive formats, plus its own hiccup-inspired bush format and an enlive-inspired tree format.