I am trying to scrape the images of a hotel on booking.com. To see all of the hotel's images, I can click on any image on the page. Below is the HTML for one image:
<a data-id="88516347" data-thumb-url="https://cf.bstatic.com/images/hotel/max500/885/88516347.jpg" href="https://cf.bstatic.com/images/hotel/max1024x768/885/88516347.jpg" target="_blank" class="bh-photo-grid-item bh-photo-grid-photo3 active-image " style="background-image: url(https://cf.bstatic.com/images/hotel/max500/885/88516347.jpg);" onclick="return false;" title="The sunrise or sunset as seen from the apartment or nearby ">
<img src="https://cf.bstatic.com/images/hotel/max500/885/88516347.jpg" class="hide" alt="The sunrise or sunset as seen from the apartment or nearby ">
</a>
Clicking the image opens a modal with a slideshow. Does anyone know how to fetch those images with PHP Goutte?
I found a Python sample that does this: https://github.com/basophobic/booking_scraper/blob/ec4382dd00970df0dab4b4df5d67143f9bbc2b21/web_scrapping.py
# click on the first image to open the image carousel
driver.find_element_by_class_name('bh-photo-grid-item').click()
# find the number of images for every hotel
tmp1 = driver.find_element_by_class_name('bh-photo-modal-caption-left').text
tmp = tmp1.split()
img_number = int(tmp[2])
print(img_number)
accommodation_fields['images'] = list()
# loop through every image to save the link
for image in range(img_number-1):
    img_href = driver.find_element_by_class_name('bh-photo-modal-image-element').find_element_by_tag_name('img').get_attribute('src')
    #print(img_href)
    accommodation_fields['images'].append(img_href)
    driver.find_element_by_class_name('bh-photo-modal-image-element').click()
print("Total images are: " + str(img_number-1))
But I don't know how to convert it to PHP Goutte. Thank you. Regards.
There is a class ".bh-photo-grid-item" that you can use to get the images of the rooms. Please try the code below and check whether it works for your case:
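As a language-neutral illustration of the selector idea (this is not Goutte code), here is a minimal Python sketch using only the standard library. It assumes the page HTML has already been downloaded, and it reads the full-size URL from the `href` attribute and the thumbnail from `data-thumb-url` on every anchor with the `bh-photo-grid-item` class. The names `PhotoGridParser`, `extract_grid_images`, and `SAMPLE_HTML` are hypothetical, and the sample markup is trimmed from the snippet in the question. Since this only sees static HTML, any slideshow images that the page loads through JavaScript would probably not be found this way.

```python
from html.parser import HTMLParser


class PhotoGridParser(HTMLParser):
    """Collect image URLs from <a> tags carrying the bh-photo-grid-item class."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        # The class attribute may hold several space-separated class names.
        classes = (attributes.get("class") or "").split()
        if "bh-photo-grid-item" not in classes:
            return
        self.images.append({
            "full": attributes.get("href"),             # large image URL
            "thumb": attributes.get("data-thumb-url"),  # thumbnail URL
        })


def extract_grid_images(html):
    """Return one dict per photo-grid anchor found in the given HTML string."""
    parser = PhotoGridParser()
    parser.feed(html)
    return parser.images


# Trimmed copy of the anchor from the question, used as sample input.
SAMPLE_HTML = '''
<a data-id="88516347"
   data-thumb-url="https://cf.bstatic.com/images/hotel/max500/885/88516347.jpg"
   href="https://cf.bstatic.com/images/hotel/max1024x768/885/88516347.jpg"
   class="bh-photo-grid-item bh-photo-grid-photo3 active-image ">
  <img src="https://cf.bstatic.com/images/hotel/max500/885/88516347.jpg" class="hide">
</a>
'''

if __name__ == "__main__":
    for image in extract_grid_images(SAMPLE_HTML):
        print(image["full"])
```

The same approach should carry over to any HTML crawler that supports CSS selectors: select the `.bh-photo-grid-item` anchors and read their `href` attribute instead of simulating a click.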