Scrape the feature image from this website but it returns this `data:image/gif`

193 Views Asked by Info Rewind At 19 October 2022 at 06:44

I am using Scrapy and the Scrapy shell in Python to scrape the feature image from https://www.thrillist.com/travel/nation/all-the-ways-to-cool-off-in-austin, but the XPath below returns data:image/gif;base64,R0 instead of the image's src. Can anyone tell me how to fix this so that I get the actual src of the image?

Here is my code:

Feature_Image = [i.strip() for i in response.xpath('//*[@id="main-content"]/article/div/div/div[2]/div[1]/picture/img/@src').getall()][0]


There are 2 best solutions below

Barry the Platipus On 19 October 2022 at 11:08 BEST ANSWER

The largest image on that page is presumably the one marked for desktop. So why not locate its source like this?

pic = response.xpath('//picture[@data-testid="picture-tag"]//source[@data-size="desktop"]/@srcset').get()

The result is the source URL of the largest version of that page's poster image:

https://assets3.thrillist.com/v1/image/3086882/1584x1056/crop;webp=auto;jpeg_quality=60;progressive.jpg
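The data:image/gif;base64,... value from the question is an inline data URI rather than a link, which usually means the src attribute holds a placeholder while the real URL sits in another attribute (here, the srcset found above). A minimal sketch of a fallback check is below. It assumes the attribute values have already been extracted with response.xpath(...).get(); the helper names is_placeholder and first_real_url are hypothetical and not part of Scrapy, and the srcset handling is a simplification rather than a full parser.

```python
from typing import Optional


def is_placeholder(url: Optional[str]) -> bool:
    """Return True for missing/blank values and inline data: URIs."""
    if url is None or not url.strip():
        return True
    return url.strip().lower().startswith("data:")


def first_real_url(*candidates: Optional[str]) -> Optional[str]:
    """Return the first candidate that is not a placeholder, else None."""
    for candidate in candidates:
        if not is_placeholder(candidate):
            # A srcset entry may carry a descriptor such as "2x" after the URL;
            # keep only the first URL (simplified, not a full srcset parser).
            return candidate.strip().split()[0].rstrip(",")
    return None


# Sample values: the placeholder from the question and the srcset from this answer.
src = "data:image/gif;base64,R0"
srcset = (
    "https://assets3.thrillist.com/v1/image/3086882/1584x1056/"
    "crop;webp=auto;jpeg_quality=60;progressive.jpg"
)

print(first_real_url(src, srcset))
```

Passing the candidates in order of preference (for example src first, then srcset) keeps the spider working on pages where src already holds a real link.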

Alexander On 19 October 2022 at 08:04

It looks like the img tag has a data-src attribute that holds the link along with some image attributes. Splitting that text on ";" and taking the first part gets you the link.

>>> link = response.xpath("//div[@data-element-type='ParagraphMainImage']//img/@data-src").get().split(";")[0]
>>> link
'https://assets3.thrillist.com/v1/image/3086882/414x310/crop'

You can manually add .jpg to the end if you want to make the image type explicit. The link works with and without the extension.
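The split-and-append step can be wrapped in a small helper that also copes with .get() returning None when the XPath matches nothing. This is only a sketch: the function name clean_image_link is hypothetical, and the sample data-src value is an assumption modeled on the URLs shown in the answers, not the page's actual attribute.

```python
from typing import Optional


def clean_image_link(data_src: Optional[str], extension: str = ".jpg") -> Optional[str]:
    """Keep the part of data-src before the first ';' and append an extension."""
    if not data_src:
        return None  # the XPath matched nothing, or the attribute was empty
    link = data_src.split(";")[0].strip()
    if not link:
        return None
    # Append the extension only if the link does not already end with it.
    if not link.lower().endswith(extension):
        link += extension
    return link


# Hypothetical attribute value: link first, then ';'-separated image options.
raw = "https://assets3.thrillist.com/v1/image/3086882/414x310/crop;webp=auto;jpeg_quality=60"

print(clean_image_link(raw))
```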
