Unable to load desktop version of a webpage using scrapy-playwright (python)


I am trying to scrape (for educational purposes) a dynamically rendered page using scrapy-playwright, but because the browser window is small, the site always serves its mobile version, which breaks my scraping functions.

I searched the Playwright documentation and found two possible solutions: set the viewport to a full-screen size so the desktop version loads, or mimic a desktop browser using one of the devices available under Emulation.

set_viewport_size = https://playwright.dev/python/docs/emulation#devices
emulation = https://playwright.dev/python/docs/emulation#devices

I found emulation hard to implement, so I decided to call set_viewport_size through a PageMethod instead.

I tried including it in meta as a PageMethod, but the page still opens at the usual size and closes on its own without crawling.

Here is a snippet from my start_requests function:

        url = f"{base_url}{encoded_medicine}"

        meta = {
            "playwright": True,
            "playwright_page_methods": [
                #PageMethod("set_viewport_size",'({"width": 1920, "height": 1080})'),
                PageMethod("wait_for_selector", f"(//div[contains(@class,'grid-container')]/descendant::div[contains(@class,'style__container')])[{self.first_page_med}]"),
            ],
            "playwright_include_page": True,
        }
        headers = {
            "accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9",
            "accept-encoding": "gzip, deflate, br",
            "accept-language": "en-US,en;q=0.9,de;q=0.8",
            "user-agent": UserAgent().chrome,
        }

        yield scrapy.Request(url, meta=meta, errback=self.close_page, callback=self.parse, method='GET', headers=headers)
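
Two things may be worth checking in the snippet above. First, the commented-out PageMethod passes the viewport as a string, while Playwright's set_viewport_size expects a dict such as {"width": 1920, "height": 1080}. Second, as far as I understand scrapy-playwright, page methods run after the page has already been navigated, so resizing at that point may be too late for a site that chooses its layout on load. A possible alternative is to set the viewport on the browser context through the PLAYWRIGHT_CONTEXTS setting, whose entries are passed as keyword arguments to Playwright's Browser.new_context. The following is only a minimal sketch under those assumptions: the names DESKTOP_VIEWPORT and custom_settings placement are illustrative and not taken from the original spider, and whether this fixes the early termination is untested.

```python
# Sketch: desktop-sized viewport configured at the browser-context level.
# DESKTOP_VIEWPORT is a hypothetical helper name used only for illustration.
DESKTOP_VIEWPORT = {"width": 1920, "height": 1080}

# Could live in settings.py or as the spider's `custom_settings` attribute.
custom_settings = {
    "PLAYWRIGHT_CONTEXTS": {
        # "default" is the context scrapy-playwright uses when a request
        # does not name another context.
        "default": {
            "viewport": DESKTOP_VIEWPORT,  # applied before any navigation
            "is_mobile": False,            # do not emulate a mobile device
            "has_touch": False,            # do not advertise touch support
        },
    },
}
```

With this in place, the wait_for_selector PageMethod in the request meta can stay as it is, and the set_viewport_size entry should no longer be needed.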