Web scraping with Selenium not capturing full text


I'm trying to mine quite a bit of text from a list of links using Selenium/Python.

In this example, I scrape only one of the pages and that successfully grabs the full text:

    from selenium import webdriver

    page = 'https://xxxxxx.net/xxxxx/September%202020/2020-09-24'
    driver = webdriver.Firefox()
    driver.get(page)

    elements = driver.find_element_by_class_name('text').text
    elements

Then, when I loop through the whole list of links (all of the by-day links on this page: https://overrustlelogs.net/Destinygg%20chatlog/September%202020), using the same method that worked for the single page, it does not grab the full text:

    for i in tqdm(chat_links):
        driver.get(i)
        #driver.implicitly_wait(200)
        elements = driver.find_element_by_class_name('text').text
        #elements = driver.find_element_by_xpath('/html/body/main/div[1]/div[1]').text
        #elements = elements.text
        temp = {'elements': elements}
        chat_text.append(temp)

    driver.close()

    chat_text

My thought is that maybe the pages don't get a chance to load fully, but the same code works on a single page, and driver.get is supposed to block until the page has loaded.

Any ideas? Thanks, much appreciated.

1 Answer

The page lazy-loads its content, so you need to scroll to the bottom repeatedly and collect the data as it appears.

    import time

    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC

    driver = webdriver.Chrome()
    driver.get("https://overrustlelogs.net/Destinygg%20chatlog/September%202020/2020-09-30")
    # Wait until at least one chat line is visible before scrolling.
    WebDriverWait(driver, 10).until(
        EC.visibility_of_element_located((By.CSS_SELECTOR, ".text>span")))

    height = driver.execute_script("return document.body.scrollHeight")
    data = []
    while True:
        # Scroll to the bottom to trigger the next chunk of lazy-loaded lines.
        driver.execute_script("window.scrollTo(0,document.body.scrollHeight)")
        time.sleep(1)
        for item in driver.find_elements_by_css_selector(".text>span"):
            if item.text in data:
                continue
            data.append(item.text)

        # Stop once scrolling no longer grows the page.
        lastheight = driver.execute_script("return document.body.scrollHeight")
        if height == lastheight:
            break
        height = lastheight

    print(data)
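One caveat with the `if item.text in data` membership check: chat logs often contain genuinely repeated messages, and that check silently drops every repeat (it is also O(n) per line). Since lazy loading only appends new lines to the DOM, a browser-free sketch of an alternative is to remember how many lines were already collected and take only the tail (`collect_new` is a hypothetical helper name, not part of Selenium):

```python
def collect_new(texts, collected):
    """Append only the items beyond what was already collected.

    `texts` is the full list of line texts currently in the DOM.
    Lazy loading only appends, so everything past len(collected) is new.
    Unlike a membership check, repeated chat messages are kept.
    """
    collected.extend(texts[len(collected):])
    return collected

# After one more scroll, two new lines appear; one repeats an old message.
seen = collect_new(["hi", "lol"], [])
seen = collect_new(["hi", "lol", "hi", "gg"], seen)
# seen == ["hi", "lol", "hi", "gg"]
```

Inside the scroll loop you would pass `[el.text for el in driver.find_elements_by_css_selector(".text>span")]` as `texts`.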