Splash not rendering a webpage completely

I am trying to use Scrapy + Splash to scrape this site: https://www.teammitsubishihartford.com/new-inventory/index.htm?compositeType=new. However, I am unable to extract any data from it. When I rendered the page through the Splash API (in the browser), I found that the site is not fully loaded: the rendering Splash returns is an image of a partially loaded page. How can I render the site completely?

There is 1 solution below

@Vinu Abraham, if your requirement is not specific to Scrapy + Splash, you can use Selenium. This issue occurs when scraping a dynamic site, where the content is rendered by JavaScript after the initial HTML has loaded. Below is a code snippet for reference.

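import time

from bs4 import BeautifulSoup
from selenium import webdriver

# URL of the page we want to scrape
url = 'https://www.*******/drugs-all-medicines'

# Selenium 4.6+ locates a matching driver automatically; on older versions,
# pass the chromedriver path via selenium.webdriver.chrome.service.Service.
driver = webdriver.Chrome()
try:
    driver.get(url)
    # Give the page's JavaScript time to render the content
    time.sleep(5)
    html = driver.page_source
finally:
    # Always close the browser, even if loading the page fails
    driver.quit()

soup = BeautifulSoup(html, "html.parser")
# Container div holding the data; this class name is specific to the example site
all_divs = soup.find('div', {'class': 'style__container___1i8GI'})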

Also, let me know if you find a solution for this using Scrapy.
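
One thing that may be worth trying on the Splash side: a partially rendered page is often a sign that Splash returned before the page's JavaScript had finished, and the Splash HTTP API accepts a wait argument for exactly that case. The following is only a rough sketch, not a tested fix for this particular site. It assumes a Splash instance running at localhost:8050 (an assumption, adjust to your setup), and build_render_url and fetch_rendered_html are hypothetical helper names made up for illustration.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: a Splash instance is listening locally on its default port.
SPLASH_ENDPOINT = "http://localhost:8050/render.html"


def build_render_url(page_url, wait=5.0, timeout=60.0):
    """Build a Splash render.html request URL (hypothetical helper).

    wait    -- seconds Splash keeps waiting after the page has loaded
    timeout -- overall render timeout in seconds; must be larger than wait
    """
    if wait >= timeout:
        raise ValueError("wait must be smaller than timeout")
    query = urlencode({"url": page_url, "wait": wait, "timeout": timeout})
    return f"{SPLASH_ENDPOINT}?{query}"


def fetch_rendered_html(page_url, wait=5.0, timeout=60.0):
    """Ask Splash to render the page and return the resulting HTML."""
    request_url = build_render_url(page_url, wait, timeout)
    # Allow the HTTP call slightly longer than the render itself.
    with urlopen(request_url, timeout=timeout + 10) as response:
        return response.read().decode("utf-8", errors="replace")
```

With scrapy-splash, the same idea would be passing the wait value through the request arguments, for example SplashRequest(url, callback, args={'wait': 5}). Whether 5 seconds is enough for this site is a guess; if the page loads its data through slow background requests, a larger wait (and timeout) may be needed.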