Clicking two consecutive buttons while scraping a website with selenium in python

262 Views Asked by At

I am trying to scrape the country information from the website below, https://www.morningstar.com/etfs/xnas/vnqi/portfolio which entails clicking the 'Country' selection in the Exposure section, then moving through the 1, 2,3, etc. pages using the arrows at the bottom of the section. Nothing I have tried seems to work. Is there a way to do it using selenium in Python?

Many thanks!

Here is the code I used:

    urlpage   = 'https://www.morningstar.com/etfs/xnas/vnqi/portfolio'
    driver = webdriver.Chrome(options=options, executable_path='D:\Python\Python38\chromedriver.exe')
    driver.get(urlpage)
    elements=WebDriverWait(driver, 10).until(EC.presence_of_all_elements_located((By.XPATH, "//a[text()='Country']")))
    for elem in elements:
        elem.click()

and this is the error message:

TimeoutException                          

Traceback (most recent call last)  
<ipython-input-3-bf16ea3f65c0> in <module>  
    23 driver = webdriver.Chrome(options=options, executable_path='D:\Python\Python38\chromedriver.exe')  
     24 driver.get(urlpage)  
---> 25 elements=WebDriverWait(driver, 10).until(EC.presence_of_all_elements_located((By.XPATH, "//a[text()='Country']")))  
     26 for elem in elements:  
     27      elem.click()  
D:\Anaconda\lib\site-packages\selenium\webdriver\support\wait.py in until(self, method, message)  
     78             if time.time() > end_time:  
     79                 break  
---> 80         raise TimeoutException(message, screen, stacktrace)  
     81   
     82     def until_not(self, method, message=''):  
TimeoutException: Message: 

Sorry, not sure how to format the error message better. Thanks again.

1

There are 1 best solutions below

6
On BEST ANSWER

It seems you didn't check what you really have in HTML. So you didn't do the most important thing.

There is NO <a> with text Country on this page.

There is <input> with value="Country"


This code works for me

import time
from selenium import webdriver

url = 'https://www.morningstar.com/etfs/xnas/vnqi/portfolio'

driver = webdriver.Chrome()
driver.get(url)

time.sleep(2)

country = driver.find_element_by_xpath('//input[@value="Country"]')
country.click()

time.sleep(1)
next_page = driver.find_element_by_xpath('//a[@aria-label="Go to Next Page"]')
    
while True:
    
    # get data
    table_rows = driver.find_elements_by_xpath('//table[@class="sal-country-exposure__country-table"]//tr')
    for row in table_rows[1:]:  # skip header 
        elements = row.find_elements_by_xpath('.//span')  # relative xpath with `.//`
        print(elements[0].text, elements[1].text, elements[2].text)

    # check if there is next page
    disabled = next_page.get_attribute('aria-disabled')
    #print('disabled:', disabled)
    if disabled:
        break

    # go to next page        
    next_page.click()
    
    time.sleep(1)

Result

Japan 22.08 13.47
China 10.76 1.45
Australia 9.75 6.05
Hong Kong 9.52 6.04
Germany 8.84 5.77
Singapore 6.46 4.33
United Kingdom 6.22 5.77
Sweden 3.48 2.00
France 3.18 2.58
Canada 2.28 2.92
Switzerland 1.78 0.69
Belgium 1.63 1.31
Philippines 1.53 0.15
Israel 1.47 0.16
Thailand 0.98 0.09
India 0.87 0.11
South Africa 0.87 0.21
Taiwan 0.83 0.08
Mexico 0.80 0.33
Spain 0.62 0.84
Malaysia 0.54 0.08
Brazil 0.52 0.06
Austria 0.51 0.16
New Zealand 0.41 0.21
Indonesia 0.37 0.02
Norway 0.37 0.29
United States 0.29 44.09
Netherlands 0.24 0.19
Chile 0.21 0.01
Ireland 0.16 0.19
South Korea 0.15 0.00
Turkey 0.08 0.02
Russia 0.08 0.00
Finland 0.06 0.16
Poland 0.05 0.00
Greece 0.05 0.00
Italy 0.02 0.05
Argentina 0.00 0.00
Colombia 0.00 0.00
Czech Republic 0.00 0.00
Denmark 0.00 0.00
Estonia 0.00 0.00
Hungary 0.00 0.00
Latvia 0.00 0.00
Lithuania 0.00 0.00
Pakistan 0.00 0.00
Peru 0.00 0.00
Portugal 0.00 0.00
Slovakia 0.00 0.00
Venezuela 0.00 0.00