How to extract text with lxml in this scraper program?

I am trying to scrape the text from a specific element on this page (using scraperwiki):

import requests
from lxml import html

response = requests.get('http://portlandmaps.com/detail.cfm?action=Assessor&propertyid=R246274')

tree = html.fromstring(response.content)
owner = tree.xpath('/html/body/div[2]/table[1]/tbody/tr[11]/td[2]')

print owner.text

And the scraperwiki console returns:

AttributeError: 'list' object has no attribute 'text'

I used Google Chrome to find the XPath, and I assume requests uses the same standards as Chrome.

There is 1 answer below

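That's because xpath() returns a list of matches, not a single element, and a list has no .text attribute. Whatever you're looking for may also not exist at that path: Chrome shows the DOM after the browser has inserted elements such as tbody, which are often missing from the raw HTML that requests downloads. Try the parents first.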
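One way to "try the parents" is to test each prefix of the path in turn and see where matching stops. This is only a rough sketch: `SAMPLE` is a made-up stand-in for the downloaded page (not the real portlandmaps markup), `first_failing_step` is a hypothetical helper, and the naive split on "/" assumes no predicate in the path contains a slash.

```python
from lxml import html

# Hypothetical stand-in for the downloaded page.
# Note that the raw markup has no <tbody>, unlike what Chrome's inspector shows.
SAMPLE = """
<html><body>
  <div>header</div>
  <div>
    <table>
      <tr><td>Owner</td><td>JANE DOE</td></tr>
    </table>
  </div>
</body></html>
"""

def first_failing_step(tree, path):
    """Return the shortest prefix of an absolute XPath that matches nothing, or None."""
    steps = path.strip("/").split("/")  # naive: assumes no "/" inside predicates
    for i in range(1, len(steps) + 1):
        prefix = "/" + "/".join(steps[:i])
        if not tree.xpath(prefix):
            return prefix
    return None

tree = html.fromstring(SAMPLE)

# Path copied from the browser, including tbody
print(first_failing_step(tree, "/html/body/div[2]/table[1]/tbody/tr[1]/td[2]"))

# Same path without tbody
print(first_failing_step(tree, "/html/body/div[2]/table[1]/tr[1]/td[2]"))
```

If the reported prefix ends in tbody, dropping that step from the XPath is usually enough.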

Then, once that works, take the first element of the list:

owner[0].text
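Since indexing an empty list raises IndexError, it may be worth guarding the lookup. Here is a small sketch under assumed data: the one-row table and the `first_text` helper are made up for illustration and are not part of lxml.

```python
from lxml import html

# Hypothetical two-column row standing in for the real table.
row = html.fromstring("<table><tr><td>Owner</td><td>JANE DOE</td></tr></table>")

def first_text(node, path, default=None):
    """Return the text of the first element matched by path, or default if nothing matches."""
    matches = node.xpath(path)  # a list of elements, possibly empty
    return matches[0].text if matches else default

print(first_text(row, ".//tr[1]/td[2]"))                       # a row that exists
print(first_text(row, ".//tr[9]/td[2]", default="not found"))  # a row that does not
```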

If you can't find or remember which tr you want, just grab the second td of every row (XPath indexes start at 1, so td[2] is the second cell):

tree = html.fromstring(response.content)
owner = tree.xpath('/html/body/div[2]/table[1]/tbody/tr/td[2]')

texts = [o.text for o in owner]
print texts
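One caveat with .text: it only holds the text that comes before the first child element, so a cell whose value is wrapped in a link or span gives None. The sketch below uses a made-up table (the real page's markup may differ) to compare .text with text_content(), which collects the text of the element and all its descendants.

```python
from lxml import html

# Hypothetical table: the second row wraps its value in a link.
table = html.fromstring(
    "<table>"
    "<tr><td>Owner</td><td>  JANE DOE </td></tr>"
    "<tr><td>Address</td><td><a href='#'>123 MAIN ST</a></td></tr>"
    "</table>"
)

cells = table.xpath(".//tr/td[2]")

# .text keeps surrounding whitespace and is None for the linked cell
print([c.text for c in cells])

# text_content() includes descendant text; strip() trims the padding
print([c.text_content().strip() for c in cells])
```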

Then take your pick and modify the code accordingly. Hope this helps.