I'm trying to mine quite a bit of text from a list of links using Selenium/Python.
In this example, I scrape only one of the pages and that successfully grabs the full text:
from selenium import webdriver

page = 'https://xxxxxx.net/xxxxx/September%202020/2020-09-24'
driver = webdriver.Firefox()
driver.get(page)
elements = driver.find_element_by_class_name('text').text
elements
But when I loop through the whole list of links (all the per-day links on this page: https://overrustlelogs.net/Destinygg%20chatlog/September%202020), using the same method that worked for a single page, it does not grab the full text:
from tqdm import tqdm

chat_text = []  # collected page texts
for i in tqdm(chat_links):
    driver.get(i)
    #driver.implicitly_wait(200)
    elements = driver.find_element_by_class_name('text').text
    #elements = driver.find_element_by_xpath('/html/body/main/div[1]/div[1]').text
    #elements = elements.text
    temp = {'elements': elements}
    chat_text.append(temp)
driver.close()
chat_text
My thought is that maybe the page doesn't get the chance to load completely inside the loop, but the same code works on the single page. Also, the driver.get method seems meant to load the whole given page before returning.
Any ideas? Thanks, much appreciated.
The page is lazy-loaded, so you need to scroll each page until all of its content has loaded, and only then read the text and add it to the list.
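A minimal sketch of the scrolling step, assuming the extra content is triggered by scrolling the main window (if the site scrolls an inner container instead, the JavaScript would need to target that element). The helper name scroll_to_bottom and its pause and max_rounds parameters are made up for illustration; only driver.execute_script is an actual Selenium call.

```python
import time


def scroll_to_bottom(driver, pause=1.0, max_rounds=50):
    """Scroll down repeatedly until the page height stops growing.

    Hypothetical helper: `pause` is the number of seconds to wait after each
    scroll, and `max_rounds` caps the loop so it cannot run forever.
    Returns the final page height.
    """
    last_height = driver.execute_script("return document.body.scrollHeight")
    for _ in range(max_rounds):
        # Jump to the current bottom of the page to trigger the next batch.
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause)  # give the lazily loaded content time to arrive
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            break  # nothing new was loaded, so assume the page is complete
        last_height = new_height
    return last_height
```

In the loop from the question, this would be called right after driver.get(i) and before reading .text from the element. The pause may need tuning: too short and the height check can run before new content has arrived, which ends the scrolling early.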