I want to parse each individual statistic from the Yahoo Finance tables for formatting purposes; when I parse the entire table at once, the formatting is terrible. I am currently using the code below, and I would have to repeat the four contentA lines, slightly altered, to retrieve the stats in each row of the table. The contentB variables below show this. I refuse to believe this is the most efficient way to do it. Any suggestions?
from lxml import html
url = 'http://finance.yahoo.com/q/is?s=MMM+Income+Statement&annual'
tree = html.parse(url)
contentA = tree.xpath("//table[@class='yfnc_tabledata1']/tr[1]/td/table/tr[2]/td[1]")[0].text_content().strip()
contentA1 = tree.xpath("//table[@class='yfnc_tabledata1']/tr[1]/td/table/tr[2]/td[2]")[0].text_content().strip()
contentA2 = tree.xpath("//table[@class='yfnc_tabledata1']/tr[1]/td/table/tr[2]/td[3]")[0].text_content().strip()
contentA3 = tree.xpath("//table[@class='yfnc_tabledata1']/tr[1]/td/table/tr[2]/td[4]")[0].text_content().strip()
contentB = tree.xpath("//table[@class='yfnc_tabledata1']/tr[1]/td/table/tr[3]/td[1]")[0].text_content().strip()
contentB1 = tree.xpath("//table[@class='yfnc_tabledata1']/tr[1]/td/table/tr[3]/td[2]")[0].text_content().strip()
contentB2 = tree.xpath("//table[@class='yfnc_tabledata1']/tr[1]/td/table/tr[3]/td[3]")[0].text_content().strip()
contentB3 = tree.xpath("//table[@class='yfnc_tabledata1']/tr[1]/td/table/tr[3]/td[4]")[0].text_content().strip()
Use range and format.
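A minimal sketch of that idea: the row and column indices in the XPath become {} placeholders filled in by str.format, and range drives the loops. The SAMPLE markup, the XPATH constant, and the cell_text and scrape_rows helpers are all hypothetical names made up for this illustration; the sample table only imitates the structure implied by the XPath in the question, so the sketch runs without fetching the live page.

```python
from lxml import html

# Hypothetical stand-in for the Yahoo page so the sketch runs offline.
# The nesting mirrors the XPath used in the question; the values are made up.
SAMPLE = """
<html><body>
<table class="yfnc_tabledata1"><tr><td>
<table>
<tr><td>Period Ending</td><td>2013</td><td>2012</td><td>2011</td></tr>
<tr><td>Total Revenue</td><td>100</td><td>90</td><td>80</td></tr>
<tr><td>Cost of Revenue</td><td>60</td><td>55</td><td>50</td></tr>
</table>
</td></tr></table>
</body></html>
"""

# Same path as in the question, with the row and column indices left open.
XPATH = "//table[@class='yfnc_tabledata1']/tr[1]/td/table/tr[{}]/td[{}]"


def cell_text(tree, row, col):
    """Return the stripped text of one cell, or None if the cell is missing."""
    matches = tree.xpath(XPATH.format(row, col))
    return matches[0].text_content().strip() if matches else None


def scrape_rows(tree, rows, cols):
    """Collect the requested cells as a list of rows (each a list of strings)."""
    return [[cell_text(tree, r, c) for c in cols] for r in rows]


tree = html.fromstring(SAMPLE)
# Rows 2-3 and columns 1-4, matching the contentA/contentB lines above.
table = scrape_rows(tree, range(2, 4), range(1, 5))
for row in table:
    print(row)
```

For the real page, tree would come from html.parse(url) as in the question, and the two range calls would be widened to cover however many rows and columns the table actually has.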
Output
Output
EDIT