Since yesterday, my Facebook Marketplace scraper has stopped fetching data. I'm using Scrapy because of its features. Am I making a mistake somewhere? The output is shared in my gist.
Current code:
    from scrapy import Spider
    import logging


    class Facebook(Spider):
        name = 'facebook'
        start_urls = ["https://www.facebook.com/marketplace/112047398814697/search?query=funko&sortBy=creation_time_descend&radius=500"]

        def parse(self, response):
            from pdb import set_trace; set_trace()
            # get all HTML product elements
            products = response.xpath('//div[@style="max-width:1872px"]/div[2]/div')
            # iterate over the list of products
            for product in products:
                # return a generator for the scraped item
                yield {
                    "name": product.css("h2::text").get(),
                    "image": product.css("img").attrib["src"],
                    "price": product.css('span::text').getall()[2],
                    "url": product.css("a").attrib["href"],
                }
I've already tried Selenium and requests-html, but they don't work as expected either.
The data is now embedded in the page's script tags as JSON, which is why your selectors can no longer extract any product details. You need to get the JSON string and convert it to a dictionary to access the fields you need. You can find the code snippet below.
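A minimal sketch of that approach, not necessarily the exact code used for this answer. It uses only the standard library and works on the raw HTML string. The key names `marketplace_listing_title`, `listing_price`, `formatted_amount` and `id` are assumptions about the embedded JSON and may differ or change, and `SAMPLE_HTML` is a made-up page used only for illustration.

```python
import json
import re

# Matches the body of every <script> tag, whatever its attributes.
SCRIPT_RE = re.compile(r"<script[^>]*>(.*?)</script>", re.DOTALL | re.IGNORECASE)


def iter_json_scripts(html):
    """Yield the parsed body of each <script> tag that contains valid JSON."""
    for body in SCRIPT_RE.findall(html):
        try:
            yield json.loads(body)
        except ValueError:
            continue  # ordinary JavaScript (or empty), not JSON


def find_dicts_with_key(node, key):
    """Recursively yield every dict nested under `node` that contains `key`."""
    if isinstance(node, dict):
        if key in node:
            yield node
        for value in node.values():
            yield from find_dicts_with_key(value, key)
    elif isinstance(node, list):
        for item in node:
            yield from find_dicts_with_key(item, key)


def extract_listings(html, title_key="marketplace_listing_title"):
    """Collect listing details from JSON embedded in script tags.

    `title_key` and the other key names below are assumed, not documented.
    """
    listings = []
    for data in iter_json_scripts(html):
        for item in find_dicts_with_key(data, title_key):
            price = item.get("listing_price") or {}
            listings.append({
                "name": item[title_key],
                "price": price.get("formatted_amount"),
                "id": item.get("id"),
            })
    return listings


# Made-up page: one plain JavaScript tag and one JSON tag with a single listing.
SAMPLE_HTML = """
<html><body>
<script>var x = 1;</script>
<script type="application/json">{"data": {"edges": [{"node": {"listing":
{"id": "1", "marketplace_listing_title": "Funko Pop",
 "listing_price": {"formatted_amount": "$10"}}}}]}}</script>
</body></html>
"""

listings = extract_listings(SAMPLE_HTML)
```

Inside the spider's `parse` method, the same function could be called as `extract_listings(response.text)` and each resulting dictionary yielded as an item. Searching recursively for a key, rather than hard-coding the full path to the listings, is a deliberate choice here so the sketch keeps working if the outer nesting of the JSON changes.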
Output (sample of the top 5 records):