I am using the Scrapy shell to extract some text data. Here are the commands I ran in the Scrapy shell:
>>> scrapy shell "http://jobs.parklandcareers.com/dallas/nursing/jobid6541851-nurse-resident-cardiopulmonary-icu-feb2015-nurse-residency-requires-contract-jobs"
>>> response.xpath('//*[@id="jobDesc"]/span[1]/text()')
[<Selector xpath='//*[@id="jobDesc"]/span[1]/text()' data=u'Dallas, TX'>]
>>> response.xpath('//*[@id="jobDesc"]/span[2]/p/text()[2]')
[<Selector xpath='//*[@id="jobDesc"]/span[2]/p/text()[2]' data=u'Responsible for attending assigned nursi'>]
>>> response.xpath('//*[@id="jobDesc"]/span[2]/p/text()[preceding-sibling::*="Education"][following-sibling::*="Certification"]')
[]
The third command does not return any data. I was trying to extract the text between two keywords with it. Where am I going wrong?
//*[@id="jobDesc"]/span[2]/p/text()
would return a list of text nodes, which you can then filter in Python. Here's how you can get the text between the "Education/Experience:" and "Certification/Registration/Licensure:" text paragraphs:

UPD (regarding an additional question in comments):
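Going back to the filtering between the "Education/Experience:" and "Certification/Registration/Licensure:" paragraphs: a minimal sketch of one way to do it in plain Python, using `itertools.dropwhile` and `itertools.takewhile`. The helper name `text_between` and the `sample_nodes` list are hypothetical and only stand in for the strings extracted from the page. The sketch also assumes each heading is its own text node that matches the marker exactly once whitespace is stripped.

```python
from itertools import dropwhile, takewhile

START = "Education/Experience:"
END = "Certification/Registration/Licensure:"


def text_between(nodes, start=START, end=END):
    """Return the non-empty, stripped strings strictly between two markers.

    Assumes each marker is a separate string that equals `start` / `end`
    after stripping whitespace. If `start` is missing the result is empty;
    if `end` is missing everything after `start` is returned.
    """
    cleaned = (node.strip() for node in nodes)
    # Skip everything before the start marker...
    after_start = dropwhile(lambda text: text != start, cleaned)
    # ...then drop the start marker itself.
    next(after_start, None)
    # Collect strings until the end marker, ignoring whitespace-only nodes.
    return [text for text in takewhile(lambda text: text != end, after_start) if text]


# Hypothetical sample standing in for the extracted text nodes.
sample_nodes = [
    "\n  Responsible for attending assigned classes.  ",
    "Education/Experience:",
    "  Graduate of an accredited nursing program. ",
    "\n",
    "Certification/Registration/Licensure:",
    "Current RN license.",
]

result = text_between(sample_nodes)
print(result)
```

In the Scrapy shell, the input list would come from the selector itself, for example `text_between(response.xpath('//*[@id="jobDesc"]/span[2]/p/text()').extract())`. If the headings on the real page carry extra text on the same line, the equality checks would need to be loosened, for instance with `str.startswith`.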