How to access xml field with lxml?


Python 3.6, lxml, Windows 10

I am going crazy. I want to access the item fields, but I always get this error:

AttributeError: 'cython_function_or_method' object has no attribute 'item'

Everything else (address fields, etc.) I can access without problems. How can I access the item fields (sku, amount, etc.)?

I've used this code:

import requests
from lxml import objectify

url = "URL_TO_XML_FILE"
xml_content = requests.get(url).text.encode('utf-8')

xml = objectify.fromstring(xml_content)

for sale in xml.response.sales.sale:
    for item in sale.items.item:
        print(item.sku)

Here is the beginning of the xml:

<?xml version="1.0" encoding="ISO-8859-1"?>
<getnewsalesresult xmlns="https://pmcdn.priceminister.com/res/schema/getnewsales">
  <request>
    <version>2017-08-07</version>
    <user>SELLER</user>
  </request>  

  <response>
    <lastversion>2017-08-07</lastversion>
    <sellerid>95029358</sellerid>
    <sales>

      <sale>
        <purchaseid>297453287592813953</purchaseid>
        <purchasedate>15/12/2018-19:10</purchasedate>
        <deliveryinformation>
          <shippingtype>Normal</shippingtype>
          <isfullrsl>N</isfullrsl>

          <purchasebuyerlogin><![CDATA[LOGIN]]></purchasebuyerlogin>                  
          <purchasebuyeremail>EMAIL</purchasebuyeremail>        


            <deliveryaddress>
            <civility>Mme</civility>
            <lastname><![CDATA[Lastname]]></lastname>
            <firstname><![CDATA[Firstname]]></firstname>
            <address1><![CDATA[STREET]]></address1>
            <address2><![CDATA[]]></address2>
            <zipcode>13570</zipcode>
            <city><![CDATA[Paris]]></city>

            <country><![CDATA[France]]></country>
            <countryalpha2>FX</countryalpha2>

              <phonenumber1></phonenumber1>
              <phonenumber2>PHONENUMBER</phonenumber2>

            </deliveryaddress>

        </deliveryinformation>
        <items>

          <item>
            <sku><![CDATA[SKU1]]></sku>
            <advertid>411812243030</advertid>
            <advertpricelisted>
              <amount>15.99</amount>
              <currency>EUR</currency>
            </advertpricelisted>
            <itemid>551131040</itemid>
            <headline><![CDATA[HEADLINE]]></headline>
            <itemstatus><![CDATA[REQUESTED]]></itemstatus>
            <ispreorder>N</ispreorder>
            <isnego>N</isnego>
            <negotiationcomment></negotiationcomment>
            <price>
              <amount>15.99</amount>
              <currency>EUR</currency>
            </price>
            <isrsl>N</isrsl>
            <isbn></isbn>
            <ean>4363745894373857474; </ean>
            <paymentstatus><![CDATA[INCOMING]]></paymentstatus>
            <sellerscore></sellerscore>
          </item>
        </items>
      </sale>
      <sale>

There are 2 best solutions below


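The problem is that items is also the name of a method that ObjectifiedElement inherits from lxml's element class (it returns the element's XML attributes as name/value pairs). Normal attribute lookup finds that method first, so the expression sale.items returns the method rather than the <items> child element, and asking the method for .item raises the AttributeError.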

To get the 'items' child you want, you have to bypass the normal lookup, which checks the methods of the class first. objectify resolves child elements in __getattr__, the hook Python calls only when normal attribute lookup fails, and you can call that hook directly (note that getattr(sale, 'items') does not help, because it is equivalent to sale.items):

sale.__getattr__('items')

This will also work (objectify exposes an element's children through a dictionary-like __dict__, which only lists the first child for each name):

sale.__dict__['items']
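
To make the lookup order concrete, here is a minimal pure-Python sketch. The Node class is a hypothetical stand-in, not lxml's actual implementation; it only assumes, as described above, that children are resolved in __getattr__ while items also exists as a regular method.

```python
class Node:
    """Toy stand-in for an objectified element (hypothetical, not lxml code)."""

    def __init__(self, **children):
        # Children live in their own dict, outside normal attribute lookup.
        self._children = children

    def items(self):
        # A regular method that happens to share its name with a child.
        return []

    def __getattr__(self, name):
        # Python calls this only when normal attribute lookup fails.
        try:
            return self._children[name]
        except KeyError:
            raise AttributeError(name)


sale = Node(items="<items element>", purchaseid="297453287592813953")

via_dot = sale.items                       # normal lookup finds the method
via_fallback = sale.__getattr__("items")   # skips normal lookup, finds the child
other_child = sale.purchaseid              # no method of that name, so the fallback runs
```

In this sketch via_dot is the bound method, while via_fallback is the child stored under the same name. Children whose names do not collide with a method, such as purchaseid, work with plain dot access, which matches the behaviour seen in the question.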

The revised code:

import requests
from lxml import objectify

url = "URL_TO_XML_FILE"
# Pass the raw bytes so lxml decodes them with the encoding declared in the XML (ISO-8859-1)
xml_content = requests.get(url).content

xml = objectify.fromstring(xml_content)

for sale in xml.response.sales.sale:
    for item in sale.__dict__['items'].item:
        print(item.sku)

Another way to deal with this is to avoid using the flaky attribute interface:

for sale in xml['response']['sales']['sale']:
    for item in sale['items']['item']:
        print(item['sku'])

Using the dict-like indexing interface, you never have to worry about certain attribute names (which include such common words as items, index, keys, remove, replace, tag, set, text, and values) returning surprising results.
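
A small self-contained check of the indexing approach, assuming lxml is installed. The XML here is a hypothetical, trimmed-down sample modeled on the feed in the question, and iter_items is just an illustrative helper name.

```python
from lxml import objectify

# Hypothetical sample: same namespace and nesting as the question's feed,
# reduced to a single sale with a single item.
SAMPLE = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<getnewsalesresult xmlns="https://pmcdn.priceminister.com/res/schema/getnewsales">
  <response>
    <sales>
      <sale>
        <items>
          <item>
            <sku><![CDATA[SKU1]]></sku>
            <price>
              <amount>15.99</amount>
              <currency>EUR</currency>
            </price>
          </item>
        </items>
      </sale>
    </sales>
  </response>
</getnewsalesresult>"""


def iter_items(root):
    """Yield (sku, amount) pairs using only index-style child lookup."""
    for sale in root["response"]["sales"]["sale"]:
        for item in sale["items"]["item"]:
            yield str(item["sku"]), float(item["price"]["amount"])


root = objectify.fromstring(SAMPLE)
first_sale = root["response"]["sales"]["sale"]

rows = list(iter_items(root))
shadowed = first_sale.items  # dot access still returns the inherited method
```

Here rows holds one (sku, amount) pair per item, while shadowed is the method that caused the original error. Indexing by name never consults the element's methods, so the same loop keeps working even if the feed later gains children named text, tag, or values.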