Scrapy does not enable my FilesPipeline


This is my settings.py:

from scrapy.log import INFO


BOT_NAME = 'images'

SPIDER_MODULES = ['images.spiders']
NEWSPIDER_MODULE = 'images.spiders'
LOG_LEVEL = INFO

ITEM_PIPELINES = {
    "images.pipelines.WritePipeline": 800
}

DOWNLOAD_DELAY = 0.5

This is my pipelines.py:

from scrapy import Request
from scrapy.pipelines.files import FilesPipeline


class WritePipeline(FilesPipeline):

    def get_media_requests(self, item, info):
        for url in item["file_urls"]:
            yield Request(url)

    def item_completed(self, results, item, info):
        return item

It is very standard, normal stuff. And yet this is a line of my log:

2015-06-25 18:16:41 [scrapy] INFO: Enabled item pipelines: 

So the pipeline is not enabled. What am I doing wrong here? I've used Scrapy a few times now, and I'm fairly positive the spider is fine. The item is just a normal item with file_urls and files.

There are 2 answers below.

Best answer:

Whoops, I forgot to set FILES_STORE in the settings. The Scrapy documentation on enabling a media pipeline explains why.

Relevant quote:

Then, configure the target storage setting to a valid value that will be used for storing the downloaded images. Otherwise the pipeline will remain disabled, even if you include it in the ITEM_PIPELINES setting.
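As a minimal sketch of the fix described above, the settings could look like the following. The pipeline entry is the one from the question; the "downloads" directory is only a placeholder assumption, and any valid storage location should do.

```python
# settings.py (sketch): the pipeline entry from the question plus a storage target.
ITEM_PIPELINES = {
    "images.pipelines.WritePipeline": 800,
}

# Assumption: "downloads" is a placeholder; point this at any writable location.
# Without a non-empty FILES_STORE, a FilesPipeline subclass stays disabled.
FILES_STORE = "downloads"
```

With this in place, the "Enabled item pipelines" log line should list the pipeline instead of being empty.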

Second answer:

I don't really know about FilesPipeline, but every item pipeline needs a process_item(self, item, spider) method, either defined directly or inherited from a base class.
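For reference, here is a rough sketch of what process_item looks like in a pipeline written from scratch. The class name PassThroughPipeline is hypothetical and not part of the question. A FilesPipeline subclass such as WritePipeline inherits process_item from its base class, so it does not have to define one itself.

```python
class PassThroughPipeline:
    """Hypothetical pipeline (not from the question) that returns items unchanged."""

    def process_item(self, item, spider):
        # Scrapy calls this once per scraped item. Returning the item
        # hands it on to the next pipeline listed in ITEM_PIPELINES.
        return item
```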