I'm coding in R and building a web scraping script to programmatically search Google for product images and download them into a folder. Inside a for-loop, one step gets the image URLs from the Google Images results page:
library(rvest)
# Define the Google image search page to scrape
page <- read_html("https://www.google.com/search?q=Djeco%20DD04490%20image&tbm=isch&tbs=isz:lt,islt:0.5")
# Fetch the image URLs programmatically
image_urls <- page %>% html_nodes(".rg_i") %>% html_attr("data-src")
# Continue with the rest of the flow and download the JPG files from the image URL list
...
However, image_urls is always empty, so the script can't proceed any further.
How can I resolve this and fetch the image URLs from the example page?
You can find all the links in the href attribute of the <a> tags within <td> tags. You can then use string parsing to get the URLs.
Created on 2023-07-30 with reprex v2.0.2
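The approach described above (read the href of each <a> inside a <td>, then parse the string) might be sketched roughly as follows. This is not the answerer's original code: the helper name extract_result_urls, the sample HTML, and the assumption that each href has the redirect form /url?q=<target>&... are all illustrative, and the markup Google actually returns may differ or change.

```r
library(rvest)

# Hypothetical helper: collect the target URLs hidden inside redirect-style
# links of the form "/url?q=<target>&sa=..." (an assumed format).
extract_result_urls <- function(page) {
  # href attribute of every <a> tag nested inside a <td> tag
  hrefs <- page %>% html_nodes("td a") %>% html_attr("href")

  # Keep only links that match the assumed redirect pattern
  hrefs <- hrefs[!is.na(hrefs) & grepl("^/url\\?q=", hrefs)]

  # String parsing: drop the "/url?q=" prefix and any trailing "&..." parameters
  urls <- sub("^/url\\?q=", "", hrefs)
  urls <- sub("&.*$", "", urls)

  # Undo percent-encoding and remove duplicates
  unique(vapply(urls, utils::URLdecode, character(1), USE.NAMES = FALSE))
}

# Made-up HTML standing in for the downloaded results page
sample_html <- '<html><body><table><tr>
  <td><a href="/url?q=https://example.com/a.jpg&amp;sa=U&amp;ved=abc">A</a></td>
  <td><a href="/url?q=https://example.com/page%3Fid%3D1&amp;sa=U">B</a></td>
  <td><a href="/search?q=other">C</a></td>
</tr></table></body></html>'

urls <- extract_result_urls(read_html(sample_html))
print(urls)
```

With a live page, the same helper would be called on the page object from the question. The extracted links may point to the pages hosting the images rather than to the image files themselves, so a further filtering step could be needed before downloading.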