Scrapy shell response returns fewer items than the Selector Gadget extension in Chrome for the same class


I am scraping job applications from the portal below using Scrapy. In the Scrapy shell I get only 10 items for a class that shows 15 items in Chrome developer tools and in Selector Gadget. I am confused by this difference.

Page tested on: https://www.waahjobs.com/s/software-developer-jobs-in-mumbai/

Class selected using Selector Gadget Extension: .r-95jzfe .css-1dbjc4n .r-1pn2ns4

Number of items: 15 (also counted manually)

Scrapy shell input:

scrapy shell "https://www.waahjobs.com/s/software-developer-jobs-in-mumbai/"

obj = response.css(".r-95jzfe .css-1dbjc4n .r-1pn2ns4")
print(len(obj))

Scrapy shell output: 10

Expected output: 15
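One way to narrow this down is to check whether the missing items are present in the HTML that Scrapy downloaded at all. The sketch below is a rough illustration using only the standard library: it counts elements carrying a given class in an HTML string. The `ClassCounter` and `count_class` names and the `sample_html` string are hypothetical; in the Scrapy shell you would pass `response.text` instead. It checks only the last class of the selector and ignores the ancestor classes, so it is a simplification of the CSS query above. Whether the extra 5 items on this site are added after page load is an assumption, not something confirmed here.

```python
from html.parser import HTMLParser


class ClassCounter(HTMLParser):
    """Count start tags whose class attribute contains a given class name."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None
        classes = (dict(attrs).get("class") or "").split()
        if self.class_name in classes:
            self.count += 1


def count_class(html, class_name):
    """Return how many elements in `html` carry `class_name`."""
    parser = ClassCounter(class_name)
    parser.feed(html)
    return parser.count


# Hypothetical stand-in for response.text (not the real page markup).
sample_html = """
<div class="css-1dbjc4n r-95jzfe">
  <div class="css-1dbjc4n r-1pn2ns4">job 1</div>
  <div class="css-1dbjc4n r-1pn2ns4">job 2</div>
  <div class="css-1dbjc4n">not a job card</div>
</div>
"""
print(count_class(sample_html, "r-1pn2ns4"))
```

If the count on `response.text` is also 10, the remaining items are not in the downloaded HTML, and the difference lies in what the browser does after loading the page rather than in the selector.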

Update: I bypassed the need to scrape the page by querying the backend directly. A useful tool for converting a curl request to Scrapy code: https://michael-shub.github.io/curl2scrapy/

However, I still face the problem on some websites even after using scrapy-splash.

What I did:

  1. Integrated Scrapy with Splash.
  2. Started Splash on localhost using Docker.
  3. Ran fetch('http://localhost:8050/render.html?url=https://www.hirist.com/login') in the Scrapy shell.
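The fetch call in step 3 places the target URL unencoded inside the Splash URL. Below is a minimal sketch of building the same `render.html` request with the target URL percent-encoded and Splash's `wait` argument added, so that scripts get some time to run before the page is returned. The helper name `splash_render_url` and the 2-second wait are assumptions for illustration, and this is not confirmed to fix the hirist.com case.

```python
from urllib.parse import urlencode

# Splash instance from step 2 (assumed to listen on the default port).
SPLASH_RENDER_ENDPOINT = "http://localhost:8050/render.html"


def splash_render_url(target_url, wait=2):
    """Build a Splash render.html URL with the target URL percent-encoded.

    `wait` is the number of seconds Splash waits after the page loads
    before returning the rendered HTML.
    """
    query = urlencode({"url": target_url, "wait": wait})
    return f"{SPLASH_RENDER_ENDPOINT}?{query}"


print(splash_render_url("https://www.hirist.com/login"))
```

In the Scrapy shell, the resulting string can be passed to `fetch(...)` in place of the hand-written URL.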

Result: view(response) gives a 404 in Chrome.

Expectation: the page should render the way https://quotes.toscrape.com/ does. That site works, but https://www.hirist.com doesn't.

As you can see in this image, Splash is not able to load the page. The HTML is also not readable, although it does contain the correct data.

Kindly help.
