Scrapy response.xpath not returning anything for a query

800 Views Asked by Abhishek At 14 August 2025 at 08:46

I am using the Scrapy shell to extract some text data. Here are the commands I ran:

$ scrapy shell "http://jobs.parklandcareers.com/dallas/nursing/jobid6541851-nurse-resident-cardiopulmonary-icu-feb2015-nurse-residency-requires-contract-jobs"

>>> response.xpath('//*[@id="jobDesc"]/span[1]/text()')
[<Selector xpath='//*[@id="jobDesc"]/span[1]/text()' data=u'Dallas, TX'>]
>>> response.xpath('//*[@id="jobDesc"]/span[2]/p/text()[2]')
[<Selector xpath='//*[@id="jobDesc"]/span[2]/p/text()[2]' data=u'Responsible for attending assigned nursi'>]
>>> response.xpath('//*[@id="jobDesc"]/span[2]/p/text()[preceding-sibling::*="Education"][following-sibling::*="Certification"]')
[]

The third command does not return any data. I was trying to extract the text between two keywords. What am I doing wrong?

There are 2 best solutions below

alecxe On 01 December 2014 at 18:33 BEST ANSWER

//*[@id="jobDesc"]/span[2]/p/text() returns a list of text nodes, which you can filter in Python. Here's how to get the text between the "Education/Experience:" and "Certification/Registration/Licensure:" paragraphs:

>>> result = response.xpath('//*[@id="jobDesc"]/span[2]/p/text()').extract()
>>> start = result.index('Education/Experience:')
>>> end = result.index('Certification/Registration/Licensure:')
>>> print ''.join(result[start+1:end])
- Must be a graduate from an accredited school of Nursing.
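
Note that list.index() raises ValueError when a marker is missing or carries stray whitespace. A more forgiving variant of the same filtering might look like the sketch below. The helper name text_between and the sample strings are hypothetical, made up for illustration; only the two marker labels come from the page.

```python
def text_between(nodes, start_marker, end_marker):
    """Return the non-empty, stripped strings found strictly between two markers."""
    # Strip each text node so stray whitespace does not break the lookup.
    cleaned = [node.strip() for node in nodes]
    try:
        start = cleaned.index(start_marker)
        # Search for the end marker only after the start marker.
        end = cleaned.index(end_marker, start + 1)
    except ValueError:
        # One of the markers is missing: return nothing instead of failing.
        return []
    return [text for text in cleaned[start + 1:end] if text]


# Hypothetical sample, shaped like the list returned by .extract() above.
sample_nodes = [
    "\r\nResponsible for attending assigned nursing classes.",
    " Education/Experience: ",
    "\n- Must be a graduate from an accredited school of Nursing.",
    "\n",
    "Certification/Registration/Licensure:",
    "- Must hold a current license.",
]

between = text_between(
    sample_nodes,
    "Education/Experience:",
    "Certification/Registration/Licensure:",
)
print(between)
```

In the Scrapy shell you would pass the result of response.xpath(...).extract() in place of sample_nodes.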

Update (regarding an additional question in the comments):

>>> response.xpath('//*[@id="jobDesc"]/span[3]/text()').re(r'Job ID: (\d+)')
[u'143112']

M. Page On 01 December 2014 at 18:14

Try:

substring-before(
  substring-after(//*[@id="jobDesc"]/span[2]/p, 'Education'), 'Certification')

Note: I couldn't test it.

The idea is that you cannot use preceding-sibling and following-sibling here, because the keywords and the text you want are in the same text node. Instead, extract the part you want using substring-before() and substring-after().

By combining those two functions, you select what is in between.
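
To see what that combination selects, here is a rough plain-Python sketch of the two XPath string functions, written with str.partition. The function names mirror the XPath ones but are defined here for illustration, and the paragraph string is a hypothetical stand-in for the text of the p element.

```python
def substring_after(text, marker):
    """Mimic XPath substring-after(): text after the first marker, '' if absent."""
    if not marker:
        # XPath returns the whole string for an empty marker.
        return text
    _, found, tail = text.partition(marker)
    return tail if found else ""


def substring_before(text, marker):
    """Mimic XPath substring-before(): text before the first marker, '' if absent."""
    if not marker:
        return ""
    head, found, _ = text.partition(marker)
    return head if found else ""


# Hypothetical paragraph text, standing in for the string value of the <p> element.
paragraph = (
    "Education/Experience: - Must be a graduate from an accredited "
    "school of Nursing. Certification/Registration/Licensure: "
    "- Must hold a current license."
)

selected = substring_before(substring_after(paragraph, "Education"), "Certification")
print(selected.strip())
```

With the short keywords "Education" and "Certification", the rest of the first label ("/Experience:") stays in the result, so passing the full labels gives a cleaner match.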
