The Short Problem
When checking the type of an item being passed to the pipeline, Scrapy gives me a class of scrapy.item.ItemMeta instead of the seemingly obvious class.
The Context
pipelines.py:
def process_item(self, item, spider):
    print(type(item))
    print(type(WikiItem))
The above statements yield
<class 'MyScrapers.items.WikiItem'>
<class 'scrapy.item.ItemMeta'>
Why is the second type() statement not printing a value of WikiItem despite having the class explicitly passed? How can I make it so that they do match?
Additional Info
In the pipeline, there is a simple if statement that uses isinstance() to perform actions depending on what kind of item is passed. The condition is never met. To debug this, I added two print statements: one for the type of the item being passed to the Scrapy pipeline and one for the type it is being checked against.
Import Statement in pipelines.py
from openpyxl import Workbook
from items import ModelItem, WikiItem
Import Statement in items.py
import re
from scrapy.loader import ItemLoader
from itemloaders.processors import TakeFirst
from scrapy import Item, Field
...
class WikiItem(Item):
    model_number = Field(default='', output_processor=TakeFirst())
...
In the first type() statement, you are printing the type of the item instance, which is an instance of the WikiItem class. That's why it prints <class 'MyScrapers.items.WikiItem'>.
In the second type() statement, you are printing the type of the WikiItem class itself. In Python, classes are objects too, and the type of a class is its metaclass. Scrapy's Item is defined with the metaclass ItemMeta, so every Item subclass, including WikiItem, is an instance of ItemMeta. The metaclass is responsible for creating the item classes (not their instances), and it is a subclass of Python's built-in type. So, when you print the type of WikiItem, it shows <class 'scrapy.item.ItemMeta'>.
You can use isinstance normally:
    if isinstance(item, WikiItem):
        ...  # do something
If you specifically want type() to report your class, just add parentheses so that you instantiate it first: type(WikiItem()) gives the WikiItem class.
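To make the distinction concrete, here is a minimal sketch that runs without Scrapy installed. The ItemMeta and Item classes below are hypothetical stand-ins written for this example, not the real Scrapy classes, and describe() is an illustrative helper rather than part of any pipeline API. The point is only how type() and isinstance() behave when a class has a custom metaclass.

```python
class ItemMeta(type):
    """Stand-in metaclass (assumption: mimics the role of scrapy.item.ItemMeta)."""


class Item(metaclass=ItemMeta):
    """Stand-in base class (assumption: mimics the role of scrapy.Item)."""


class WikiItem(Item):
    pass


class ModelItem(Item):
    pass


def describe(item):
    """Hypothetical helper: pick a branch based on the item's class."""
    if isinstance(item, WikiItem):
        return "wiki"
    if isinstance(item, ModelItem):
        return "model"
    return "unknown"


item = WikiItem()

print(type(item))        # the class the instance was created from
print(type(WikiItem))    # the metaclass, because WikiItem is itself an object
print(type(WikiItem()))  # instantiating first gives the class again
print(describe(item))    # isinstance() compares against the class, not the metaclass
```

In other words, comparing type(item) with type(WikiItem) will never match, while isinstance(item, WikiItem) or type(item) is WikiItem should.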