ReactorNotRestartable error when running two spiders sequentially using CrawlerProcess


I'm trying to run two spiders sequentially; here is the structure of my module:

import scrapy
from scrapy.crawler import CrawlerProcess
from scrapy.utils.project import get_project_settings

class tmallSpider(scrapy.Spider):
    name = 'tspider'
    ...

class jdSpider(scrapy.Spider):
    name = 'jspider'
    ...

process = CrawlerProcess(get_project_settings())
process.crawl('tspider')
process.crawl('jspider')
process.start(stop_after_crawl=False)

When I run this, I get this error:

raise error.ReactorNotRestartable()
twisted.internet.error.ReactorNotRestartable

When I scroll back in the terminal, I can see that both spiders ran successfully and the data I want was scraped. However, the error occurs at the end, and I guess it's because the process can't terminate? I tried process.stop() but it did not work. I also tried the code from the official guide (https://docs.scrapy.org/en/latest/topics/practices.html), but that one causes a spider-not-found error. Any ideas how to fix it?

AaronS answered:

Have you tried CrawlerRunner and the example the docs give? CrawlerRunner is useful for running multiple spiders and lets you stop the reactor manually.

If you have, could you provide a minimal example of your code for that and the exact error message you get?