ReactorNotRestartable error when running two spiders sequentially using CrawlerProcess


I'm trying to run two spiders sequentially; here is the structure of my module:

import scrapy
from scrapy.crawler import CrawlerProcess
from scrapy.utils.project import get_project_settings

class tmallSpider(scrapy.Spider):
    name = 'tspider'
    ...

class jdSpider(scrapy.Spider):
    name = 'jspider'
    ...

process = CrawlerProcess(get_project_settings())
process.crawl('tspider')
process.crawl('jspider')
process.start(stop_after_crawl=False)

When I run this, I get this error:

raise error.ReactorNotRestartable()
twisted.internet.error.ReactorNotRestartable

When I scroll back in the terminal, I can see that both spiders ran successfully and the data I want was scraped. However, the error occurs at the end, and I guess it's because the process can't terminate? I tried process.stop() but it did not work. I also tried the code from the official guide (https://docs.scrapy.org/en/latest/topics/practices.html), but that one causes a spider-not-found error. Any ideas how to fix it?

AaronS answered:

Have you tried CrawlerRunner and the example the docs give? CrawlerRunner is useful for running multiple spiders and lets you stop the reactor manually.

If you have, could you provide a minimal example of your code for that and the exact error message you get?