xpath issue in nested div

40 Views Asked by Kevin McCormick At 28 March 2024 at 09:05

I'm new to Python and Scrapy. I am testing responses with XPath in the console and can print the h1 header as a test using the code below. Now I am trying to find the XPath that pulls (1) the job title and (2) the job URL.

Here is my console code:

    r = scrapy.Request(url='https://www.northropgrumman.com/jobs?remote=yes-may-consider-full-time-teleworking-for-this-position&country=united-states-of-america&_job_category=global-supply-chain,business-management,program-management')
    
    fetch(r)

    # this works and pulls the "Job Search" header at the top of the page
    response.xpath('//h1/text()').getall()
    
    # broken, tried many combos of xpaths to get job title and url
    response.xpath("/html/body/div[1]/main/div[2]/div/div/div[3]/div[2]/div/div/div/div/div[1]/div[1]/div/div/div/div/div/div/div[1]/a/text()").getall()

What is the xpath for job titles and job URLs on the jobs listed on this page?

https://www.northropgrumman.com/jobs?remote=yes-may-consider-full-time-teleworking-for-this-position&country=united-states-of-america&_job_category=global-supply-chain,business-management,program-management


E.Wiest On 29 March 2024 at 19:34

The XPath for job URLs could be:

    //div[@class="col-sm-9"]/a/@href

For job titles:

    //div[@class="col-sm-9"]/a/h2/text()

One-liner for both:

    //div[@class="col-sm-9"]/a/@href|//div[@class="col-sm-9"]/a/h2/text()

Results:

    href="/jobs/Business-Management/Contract/United-States-of-America/Virginia/Fairfax/R10151186/principal-sr-principal-contract-administrator"
    #text "Principal / Sr Principal Contract Administrator"
    href="/jobs/Business-Management/Contract/United-States-of-America/California/Sunnyvale/R10153611/principal-senior-principal-contract-administrator-hybrid-or-full-time-remote-schedule"
    #text "Principal / Senior Principal Contract Administrator (Hybrid or Full Time Remote Schedule)"
    href="/jobs/Business-Management/Multi-Function/United-States-of-America/Maryland/Linthicum/R10150106/principal-pricing-analyst"
    #text "Principal Pricing Analyst"
    ...
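The combined expression returns hrefs and titles interleaved in one flat list, and the hrefs are site-relative. A minimal sketch of one way to keep each title paired with an absolute URL is to loop over the anchors and read both values relative to each one. To stay runnable offline, the sketch uses only the standard library rather than Scrapy; `SAMPLE_HTML` is a hypothetical, simplified fragment built from the results above (the real page markup may differ), and `extract_jobs` and `BASE_URL` are illustrative names, not part of the original answer.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# Hypothetical, simplified fragment modelled on the XPaths and results above.
# The real page is assumed to have the same div/a/h2 nesting.
SAMPLE_HTML = """
<html><body>
  <div class="col-sm-9">
    <a href="/jobs/Business-Management/Contract/United-States-of-America/Virginia/Fairfax/R10151186/principal-sr-principal-contract-administrator">
      <h2>Principal / Sr Principal Contract Administrator</h2>
    </a>
  </div>
  <div class="col-sm-9">
    <a href="/jobs/Business-Management/Multi-Function/United-States-of-America/Maryland/Linthicum/R10150106/principal-pricing-analyst">
      <h2>Principal Pricing Analyst</h2>
    </a>
  </div>
</body></html>
"""

# Assumed base for resolving the relative hrefs.
BASE_URL = "https://www.northropgrumman.com/jobs"


def extract_jobs(markup, base_url):
    """Return one dict per job link, keeping title and URL together."""
    root = ET.fromstring(markup.strip())
    jobs = []
    # ElementTree supports a subset of XPath; this mirrors
    # //div[@class="col-sm-9"]/a from the answer.
    for link in root.findall(".//div[@class='col-sm-9']/a"):
        heading = link.find("h2")
        if heading is None or not heading.text:
            continue  # skip anchors that carry no job title
        jobs.append({
            "title": heading.text.strip(),
            "url": urljoin(base_url, link.get("href", "")),
        })
    return jobs


for job in extract_jobs(SAMPLE_HTML, BASE_URL):
    print(job["title"], "->", job["url"])
```

In the Scrapy shell the same pairing would presumably look like looping over `response.xpath('//div[@class="col-sm-9"]/a')`, then calling `a.xpath('h2/text()').get()` for the title and `response.urljoin(a.xpath('@href').get())` for the absolute URL.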
