Get child element based on value of different child element

I'm trying to get the titles of all elements in an XML file for which the text content of the child2 attribute ipsum is "true", using ElementTree in Python. Here's the XML:

<element>
    <title>Hello world</title>
    <child1>Lorem</child1>
    <child2 attr="ipsum">true</child2>
</element>
<element>
    <title>Hello world 2</title>
    <child1>Lorem</child1>
</element>
<element>
    <title>Hello world 3</title>
    <child1>Lorem</child1>
    <child2 attr="ipsum">true</child2>
</element>

The result I would like returned is a list that consists of the following titles:

Hello world
Hello world 3

The second element, with the title "Hello world 2", would be excluded because it is missing this child element:

<child2 attr="ipsum">true</child2>

Any tips?

There is 1 solution below

Part of the problem is that ipsum isn't an attribute in the XML you posted; it is the value of the attribute attr. As another example, in <child2 name="John"></child2>, name is the attribute and John is the value of that attribute. The following code should work if you want to keep your current XML structure:

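import xml.etree.ElementTree as ET

# Assumes elements.xml wraps the <element> entries in a single root element
tree = ET.parse('elements.xml')
root = tree.getroot()

for element in root:
    child_el = element.find('child2')
    # Use "is not None": an Element with no children is falsy
    if child_el is not None and child_el.text == "true":
        title_el = element.find('title')
        if title_el is not None:
            print(title_el.text)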

It loops over the elements, checks that each one has a child2 element, and prints the title when the text of that child2 element is "true".
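Since the question asks for a list rather than printed output, the same check can be sketched as a list comprehension. This is only an illustration: the inline XML string and the <data> wrapper around the posted elements are assumptions made so the snippet runs on its own, and findtext is used so that a missing child2 simply yields None.

```python
import xml.etree.ElementTree as ET

# The XML from the question, wrapped in an assumed <data> root so it parses
XML = """<data>
    <element>
        <title>Hello world</title>
        <child1>Lorem</child1>
        <child2 attr="ipsum">true</child2>
    </element>
    <element>
        <title>Hello world 2</title>
        <child1>Lorem</child1>
    </element>
    <element>
        <title>Hello world 3</title>
        <child1>Lorem</child1>
        <child2 attr="ipsum">true</child2>
    </element>
</data>"""

root = ET.fromstring(XML)

# findtext returns None when child2 is absent, so those elements are skipped
titles = [
    element.findtext("title")
    for element in root.findall("element")
    if element.findtext("child2") == "true"
]
print(titles)
```

With the sample XML above, titles should contain "Hello world" and "Hello world 3".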

If you want ipsum itself to be an attribute set to "true", as you described, you can use the following XML and code:

<?xml version="1.0"?>
<data>
    <element>
        <title>Hello world</title>
        <child1>Lorem</child1>
        <child2 ipsum="true"></child2>
    </element>
    <element>
        <title>Hello world 2</title>
        <child1>Lorem</child1>
    </element>
    <element>
        <title>Hello world 3</title>
        <child1>Lorem</child1>
        <child2 ipsum="true"></child2>
    </element>    
</data>

import xml.etree.ElementTree as ET

tree = ET.parse('elements.xml')
root = tree.getroot()

for element in root:
    child_el = element.find('child2')
    # get() returns None if the ipsum attribute is absent
    if child_el is not None and child_el.get("ipsum") == "true":
        title_el = element.find('title')
        if title_el is not None:
            print(title_el.text)
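The attribute version can also be written more compactly with ElementTree's limited XPath support, where a predicate such as child2[@ipsum='true'] matches only a child2 whose ipsum attribute equals "true". The sketch below is one possible variation, not the only approach; the inline XML string and the variable names are assumptions chosen so the example is self-contained.

```python
import xml.etree.ElementTree as ET

# Same structure as the XML above, embedded as a string for the example
XML_ATTR = """<?xml version="1.0"?>
<data>
    <element>
        <title>Hello world</title>
        <child1>Lorem</child1>
        <child2 ipsum="true"></child2>
    </element>
    <element>
        <title>Hello world 2</title>
        <child1>Lorem</child1>
    </element>
    <element>
        <title>Hello world 3</title>
        <child1>Lorem</child1>
        <child2 ipsum="true"></child2>
    </element>
</data>"""

root = ET.fromstring(XML_ATTR)

# find() returns None unless a child2 with ipsum="true" exists
titles = [
    element.findtext("title")
    for element in root.findall("element")
    if element.find("child2[@ipsum='true']") is not None
]
print(titles)
```

This collects the titles into a list instead of printing them one by one, which matches the result the question asks for.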

Hope this helps! Let me know if there's anything that you'd like me to clarify.