I am scraping job applications from the portal below using Scrapy, but I get only 10 items in the Scrapy shell for a class selector that matches 15 items in the browser's developer tools and in the SelectorGadget extension. I am confused about this difference.
Page tested on: https://www.waahjobs.com/s/software-developer-jobs-in-mumbai/
Class selected using Selector Gadget Extension: .r-95jzfe .css-1dbjc4n .r-1pn2ns4
Number of items in the browser: 15 (also counted manually)
Scrapy shell input:
scrapy shell "https://www.waahjobs.com/s/software-developer-jobs-in-mumbai/"
obj = response.css(".r-95jzfe .css-1dbjc4n .r-1pn2ns4")
print(len(obj))
Scrapy shell output: 10
Expected output: 15
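One way to narrow this down is to check whether the extra items are present at all in the raw HTML that Scrapy downloaded, since the browser may add elements after the page loads (a possible explanation, not a confirmed one). A minimal sketch using only the standard library; count_class is a hypothetical helper name, and it counts elements carrying a single class token rather than evaluating the full descendant selector:

```python
from html.parser import HTMLParser


class ClassCounter(HTMLParser):
    """Count start tags whose class attribute contains a given class token."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None
        classes = (dict(attrs).get("class") or "").split()
        if self.class_name in classes:
            self.count += 1


def count_class(html, class_name):
    """Return how many elements in the HTML string carry class_name."""
    parser = ClassCounter(class_name)
    parser.feed(html)
    return parser.count
```

In the Scrapy shell this could be called as count_class(response.text, "r-1pn2ns4"). If the result is also below 15, the missing items are not in the downloaded HTML and the selector itself is probably not the cause.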
Update: I bypassed the need to scrape the page by requesting the site's backend directly. A useful tool for converting a curl request into Scrapy code: https://michael-shub.github.io/curl2scrapy/
However, I am still facing the problem on some websites, even after using scrapy-splash.
What I did:
- Integrated Scrapy with Splash
- Started Splash on localhost using Docker
- Ran fetch('http://localhost:8050/render.html?url=https://www.hirist.com/login') in the Scrapy shell
Result: view(response) shows a 404 page in Chrome.
Expectation: the rendered page should open. The same steps work for https://quotes.toscrape.com/ but not for https://www.hirist.com.
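For reference, the render.html request from the steps above can also be built with the target URL percent-encoded, so that any query string in the target is not mixed up with Splash's own parameters. A minimal sketch, assuming Splash is listening on localhost:8050 as described; build_splash_url is a hypothetical helper name, the wait value is arbitrary, and this is not a confirmed fix for the 404:

```python
from urllib.parse import urlencode

# Assumption: Splash runs locally on its default port, as in the steps above
SPLASH_RENDER_ENDPOINT = "http://localhost:8050/render.html"


def build_splash_url(target_url, wait=2):
    """Return a render.html URL with the target URL percent-encoded.

    `url` and `wait` are render.html arguments; `wait` is the number of
    seconds Splash waits after the page loads before returning the HTML.
    """
    query = urlencode({"url": target_url, "wait": wait})
    return f"{SPLASH_RENDER_ENDPOINT}?{query}"


if __name__ == "__main__":
    print(build_splash_url("https://www.hirist.com/login"))
```

The resulting string can be passed to fetch() in the Scrapy shell in place of the hand-written URL.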
Kindly help.