When I crawl this site, an empty JSON file is created. I am following along with a Udemy course and have asked about this there, but have not received a reply. I have checked my code but don't see anything obvious. I was initially doing this on a PC running Windows 11, but the scrapy-playwright documentation says it doesn't work on Windows and recommends Linux. I switched to a MacBook running Ventura and now get only one error (as opposed to many on the PC), but I still get an empty JSON file. I don't know what else to check. Below I have included: 1. my code, 2. the relevant lines from my settings.py, and 3. the end of the terminal output.
scrapy crawl gdp -O gdp.json
import scrapy
from scrapy_playwright.page import PageMethod
class PositionsSpider(scrapy.Spider):
    name = "positions"
    allowed_domains = ["traf.com"]
    start_urls = ["https://careers.trafigura.com/TrafiguraCareerSite/search"]

    def start_requests(self):
        yield scrapy.Request(
            self.start_urls[0],
            meta=dict(
                playwright=True,
                playwright_page_methods=[
                    PageMethod("wait_for_selector", 'section#results div[role="list"]')
                ],
            ),
        )

    async def parse(self, response):
        for job in response.css('section#results div[role="list"] div[role="listitem"]'):
            yield {
                'title': job.css('a::text').get()
            }
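In case it's useful, this is the markup structure I'm assuming the results list has. Below is a quick standard-library sketch of the extraction I expect the spider's CSS selectors to perform; the sample HTML is my assumption about the page, not the site's actual markup:

```python
from html.parser import HTMLParser

# Assumed shape of the results list (not the site's real markup).
SAMPLE = """<section id="results"><div role="list">
<div role="listitem"><a href="/job/1">Trader</a></div>
<div role="listitem"><a href="/job/2">Analyst</a></div>
</div></section>"""

class TitleExtractor(HTMLParser):
    """Collects text inside <a> tags within div[role="listitem"]."""

    def __init__(self):
        super().__init__()
        self.in_item = False
        self.in_link = False
        self.titles = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div" and attrs.get("role") == "listitem":
            self.in_item = True
        elif tag == "a" and self.in_item:
            self.in_link = True

    def handle_endtag(self, tag):
        if tag == "div":
            self.in_item = False
        elif tag == "a":
            self.in_link = False

    def handle_data(self, data):
        if self.in_link:
            self.titles.append(data)

p = TitleExtractor()
p.feed(SAMPLE)
print(p.titles)  # -> ['Trader', 'Analyst']
```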
From my settings.py:
DOWNLOAD_HANDLERS = {
    "http": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
    "https": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
}
TWISTED_REACTOR = "twisted.internet.asyncioreactor.AsyncioSelectorReactor"
End of terminal output:

2023-08-16 10:13:49 [scrapy.core.scraper] ERROR: Error downloading <GET https://careers.trafigura.com/TrafiguraCareerSite/search>
Traceback (most recent call last):
File "/Users/Steve/PythonProjects/scrapy-with-js-2/venv/lib/python3.11/site-packages/twisted/internet/defer.py", line 1693, in _inlineCallbacks
result = context.run(
File "/Users/Steve/PythonProjects/scrapy-with-js-2/venv/lib/python3.11/site-packages/twisted/python/failure.py", line 518, in throwExceptionIntoGenerator
return g.throw(self.type, self.value, self.tb)
File "/Users/Steve/PythonProjects/scrapy-with-js-2/venv/lib/python3.11/site-packages/scrapy/core/downloader/middleware.py", line 52, in process_request
return (yield download_func(request=request, spider=spider))
File "/Users/Steve/PythonProjects/scrapy-with-js-2/venv/lib/python3.11/site-packages/twisted/internet/defer.py", line 1065, in adapt
extracted = result.result()
File "/Users/Steve/PythonProjects/scrapy-with-js-2/venv/lib/python3.11/site-packages/scrapy_playwright/handler.py", line 297, in _download_request
result = await self._download_request_with_page(request, page, spider)
File "/Users/Steve/PythonProjects/scrapy-with-js-2/venv/lib/python3.11/site-packages/scrapy_playwright/handler.py", line 359, in _download_request_with_page
server_addr = await response.server_addr()
File "/Users/Steve/PythonProjects/scrapy-with-js-2/venv/lib/python3.11/site-packages/playwright/async_api/_generated.py", line 559, in server_addr
return mapping.from_impl_nullable(await self._impl_obj.server_addr())
File "/Users/Steve/PythonProjects/scrapy-with-js-2/venv/lib/python3.11/site-packages/playwright/_impl/_network.py", line 538, in server_addr
return await self._channel.send("serverAddr")
File "/Users/Steve/PythonProjects/scrapy-with-js-2/venv/lib/python3.11/site-packages/playwright/_impl/_connection.py", line 61, in send
return await self._connection.wrap_api_call(
File "/Users/Steve/PythonProjects/scrapy-with-js-2/venv/lib/python3.11/site-packages/playwright/_impl/_connection.py", line 482, in wrap_api_call
return await cb()
File "/Users/Steve/PythonProjects/scrapy-with-js-2/venv/lib/python3.11/site-packages/playwright/_impl/_connection.py", line 97, in inner_send
result = next(iter(done)).result()
playwright._impl._api_types.Error: Target page, context or browser has been closed
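If exact version details would help diagnose this, I can report them. This is the small stdlib snippet I'd use to collect them (the distribution names listed are what I believe the relevant packages are called):

```python
# Report the Python and package versions involved; scrapy-playwright problems
# are often version-sensitive, so this seems worth including in the question.
import sys
from importlib import metadata

print("python", sys.version.split()[0])
for pkg in ("scrapy", "scrapy-playwright", "playwright"):
    try:
        print(pkg, metadata.version(pkg))
    except metadata.PackageNotFoundError:
        # Gracefully note packages missing from the current environment.
        print(pkg, "not installed")
```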