Python Selenium Webscraping: find_elements_by_xpath returning an empty list

I've taken a few coding subjects at uni and am trying to analyse tennis statistics while learning Selenium, which is completely new to me.

The page I'm using is here (https://www.atptour.com/en/scores/results-archive?year=2021) and I'm following two guides from this website (https://www.scrapingbee.com/blog/selenium-python/, https://www.scrapingbee.com/blog/practical-xpath-for-web-scraping/). The particular problem I'm having relates to the second guide, under the subtitle "E-commerce product data extraction".

My goal is to loop through the tournaments and extract the links behind each 'Results' button, but my program just gives me an empty list.

from selenium import webdriver
from selenium.webdriver.chrome.options import Options


DRIVER_PATH = r"C:\Program Files (x86)\chromedriver.exe"  # raw string so the backslashes are not read as escapes
#driver = webdriver.Chrome(executable_path=DRIVER_PATH)
options = Options()
options.headless = True
options.add_argument("--window-size=1920,1200")
driver = webdriver.Chrome(options=options, executable_path=DRIVER_PATH)
#driver.get("https://www.nintendo.com/")
#print(driver.page_source)
#driver.quit()
# 1 Data Collection
# 1.1 Find Links to All Tournaments
tournaments_2021_url = "https://www.atptour.com/en/scores/results-archive?year=2021"
#tournament_class = "tourney-result"
driver.get(tournaments_2021_url) # print(driver.page_source)
tournaments_2021_url_list = driver.find_elements_by_xpath("//a[@class='button-border']")
print("\n tournament urls \n")
print(tournaments_2021_url_list)
print(len(tournaments_2021_url_list))
driver.quit()
# 1.2 For Each Tournament, Find Links to Each Match
# 1.3 For Each Match, Extract Relevant Statistics

I would expect to have a list of elements or some weird objects and be able to extract the links, but instead I get an empty list with len 0. Thanks for any help.

There are 3 best solutions below

1
On BEST ANSWER

I took your code and ran it, and it works: it does what it's supposed to. My advice is to run it through a debugger and step through it to make sure everything goes as expected. Remove the headless option as well so you can visually confirm what the browser is doing. Check your Chrome browser version and make sure it matches the chromedriver you're using (although it should give you an error message if the versions don't match). Finally, if all else fails, try another browser, Firefox for example, with the appropriate geckodriver.
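
The version check mentioned above could be sketched roughly as follows. This is only an illustration: it assumes the dictionary exposed by driver.capabilities contains a "browserVersion" key and a "chrome" -> "chromedriverVersion" key, as recent Chrome sessions typically report. The helper names and the sample version strings are made up for the example.

```python
def major_version(version: str) -> int:
    """Return the leading number of a dotted version string, e.g. '96.0.4664.45' -> 96."""
    return int(version.split(".")[0])


def versions_match(capabilities: dict) -> bool:
    """Compare the major version of the browser with that of chromedriver.

    Assumption: the capabilities dict has the layout described in the lead-in.
    """
    browser = capabilities["browserVersion"]
    # chromedriverVersion usually looks like "96.0.4664.45 (build details)";
    # keep only the part before the first space.
    chromedriver = capabilities["chrome"]["chromedriverVersion"].split(" ")[0]
    return major_version(browser) == major_version(chromedriver)


# Hypothetical sample values, standing in for driver.capabilities.
matching = {
    "browserVersion": "96.0.4664.110",
    "chrome": {"chromedriverVersion": "96.0.4664.45 (hypothetical-build-info)"},
}
mismatched = {
    "browserVersion": "97.0.4692.71",
    "chrome": {"chromedriverVersion": "96.0.4664.45 (hypothetical-build-info)"},
}

print(versions_match(matching))
print(versions_match(mismatched))
```

With a live session you would pass driver.capabilities instead of the sample dictionaries.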

0
On

To print the value of the href attribute of every Results link, you need to induce WebDriverWait for visibility_of_all_elements_located(). You can use any of the following locator strategies:

  • Using PARTIAL_LINK_TEXT:

    driver.get("https://www.atptour.com/en/scores/results-archive?year=2021")
    print([my_elem.get_attribute("href") for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.PARTIAL_LINK_TEXT, "Results")))])
    driver.quit()
    
  • Using CSS_SELECTOR:

    driver.get("https://www.atptour.com/en/scores/results-archive?year=2021")
    print([my_elem.get_attribute("href") for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "a[href$='results']")))])
    driver.quit()
    
  • Using XPATH:

    driver.get("https://www.atptour.com/en/scores/results-archive?year=2021")
    print([my_elem.get_attribute("href") for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//a[normalize-space()='Results']")))])
    driver.quit()
    
  • Console Output:

    ['https://www.atptour.com/en/scores/archive/delray-beach/499/2021/results', 'https://www.atptour.com/en/scores/archive/antalya/9426/2021/results', 'https://www.atptour.com/en/scores/archive/auckland/301/2021/results', 'https://www.atptour.com/en/scores/archive/melbourne/8998/2021/results', 'https://www.atptour.com/en/scores/archive/melbourne/9428/2021/results', 'https://www.atptour.com/en/scores/archive/pune/891/2021/results', 'https://www.atptour.com/en/scores/archive/atp-cup/8888/2021/results', 'https://www.atptour.com/en/scores/archive/australian-open/580/2021/results', 'https://www.atptour.com/en/scores/archive/new-york/424/2021/results', 'https://www.atptour.com/en/scores/archive/rio-de-janeiro/6932/2021/results', 'https://www.atptour.com/en/scores/archive/singapore/9460/2021/results', 'https://www.atptour.com/en/scores/archive/cordoba/9158/2021/results', 'https://www.atptour.com/en/scores/archive/montpellier/375/2021/results', 'https://www.atptour.com/en/scores/archive/rotterdam/407/2021/results', 'https://www.atptour.com/en/scores/archive/buenos-aires/506/2021/results', 'https://www.atptour.com/en/scores/archive/doha/451/2021/results', 'https://www.atptour.com/en/scores/archive/marseille/496/2021/results', 'https://www.atptour.com/en/scores/archive/santiago/8996/2021/results', 'https://www.atptour.com/en/scores/archive/dubai/495/2021/results', 'https://www.atptour.com/en/scores/archive/acapulco/807/2021/results', 'https://www.atptour.com/en/scores/archive/miami/403/2021/results', 'https://www.atptour.com/en/scores/archive/marrakech/360/2021/results', 'https://www.atptour.com/en/scores/archive/cagliari/9481/2021/results', 'https://www.atptour.com/en/scores/archive/marbella/9462/2021/results', 'https://www.atptour.com/en/scores/archive/houston/717/2021/results', 'https://www.atptour.com/en/scores/archive/monte-carlo/410/2021/results', 'https://www.atptour.com/en/scores/archive/barcelona/425/2021/results', 
'https://www.atptour.com/en/scores/archive/belgrade/5053/2021/results', 'https://www.atptour.com/en/scores/archive/estoril/7290/2021/results', 'https://www.atptour.com/en/scores/archive/munich/308/2021/results', 'https://www.atptour.com/en/scores/archive/madrid/1536/2021/results', 'https://www.atptour.com/en/scores/archive/rome/416/2021/results', 'https://www.atptour.com/en/scores/archive/geneva/322/2021/results', 'https://www.atptour.com/en/scores/archive/lyon/7694/2021/results', 'https://www.atptour.com/en/scores/archive/parma/9510/2021/results', 'https://www.atptour.com/en/scores/archive/belgrade/9512/2021/results', 'https://www.atptour.com/en/scores/archive/roland-garros/520/2021/results', 'https://www.atptour.com/en/scores/archive/s-hertogenbosch/440/2021/results', 'https://www.atptour.com/en/scores/archive/stuttgart/321/2021/results', 'https://www.atptour.com/en/scores/archive/halle/500/2021/results', 'https://www.atptour.com/en/scores/archive/london/311/2021/results', 'https://www.atptour.com/en/scores/archive/mallorca/8994/2021/results', 'https://www.atptour.com/en/scores/archive/eastbourne/741/2021/results', 'https://www.atptour.com/en/scores/archive/wimbledon/540/2021/results', 'https://www.atptour.com/en/scores/archive/hamburg/414/2021/results', 'https://www.atptour.com/en/scores/archive/newport/315/2021/results', 'https://www.atptour.com/en/scores/archive/bastad/316/2021/results', 'https://www.atptour.com/en/scores/archive/los-cabos/7480/2021/results', 'https://www.atptour.com/en/scores/archive/gstaad/314/2021/results', 'https://www.atptour.com/en/scores/archive/umag/439/2021/results', 'https://www.atptour.com/en/scores/archive/tokyo/96/2021/results', 'https://www.atptour.com/en/scores/archive/atlanta/6116/2021/results', 'https://www.atptour.com/en/scores/archive/kitzbuhel/319/2021/results', 'https://www.atptour.com/en/scores/archive/washington/418/2021/results', 'https://www.atptour.com/en/scores/archive/toronto/421/2021/results', 
'https://www.atptour.com/en/scores/archive/cincinnati/422/2021/results', 'https://www.atptour.com/en/scores/archive/winston-salem/6242/2021/results', 'https://www.atptour.com/en/scores/archive/us-open/560/2021/results', 'https://www.atptour.com/en/scores/archive/nur-sultan/9410/2021/results', 'https://www.atptour.com/en/scores/archive/metz/341/2021/results', 'https://www.atptour.com/en/scores/archive/laver-cup/9210/2021/results', 'https://www.atptour.com/en/scores/archive/san-diego/9569/2021/results', 'https://www.atptour.com/en/scores/archive/sofia/7434/2021/results', 'https://www.atptour.com/en/scores/archive/chengdu/7581/2021/results', 'https://www.atptour.com/en/scores/archive/zhuhai/9164/2021/results', 'https://www.atptour.com/en/scores/archive/shanghai/5014/2021/results', 'https://www.atptour.com/en/scores/archive/beijing/747/2021/results', 'https://www.atptour.com/en/scores/archive/tokyo/329/2021/results', 'https://www.atptour.com/en/scores/archive/indian-wells/404/2021/results', 'https://www.atptour.com/en/scores/archive/moscow/438/2021/results', 'https://www.atptour.com/en/scores/archive/antwerp/7485/2021/results', 'https://www.atptour.com/en/scores/archive/vienna/337/2021/results', 'https://www.atptour.com/en/scores/archive/st-petersburg/568/2021/results', 'https://www.atptour.com/en/scores/archive/basel/328/2021/results', 'https://www.atptour.com/en/scores/archive/paris/352/2021/results', 'https://www.atptour.com/en/scores/archive/stockholm/429/2021/results', 'https://www.atptour.com/en/scores/archive/intesa-sanpaolo-next-gen-atp-finals/7696/2021/results', 'https://www.atptour.com/en/scores/archive/nitto-atp-finals/605/2021/results']
    
  • Note: You have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
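
To make the idea behind the explicit wait more concrete, here is a minimal hand-rolled polling loop. It is not Selenium's own implementation of WebDriverWait, just a sketch of the same principle: keep retrying the lookup until it returns something or a timeout expires. The names wait_for_nonempty and fake_fetch are hypothetical, and fake_fetch only simulates a page whose links appear after a couple of attempts.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def wait_for_nonempty(fetch: Callable[[], List[T]],
                      timeout: float = 20.0,
                      poll: float = 0.5) -> List[T]:
    """Call fetch() repeatedly until it returns a non-empty list.

    Raises TimeoutError if nothing is found before the timeout expires.
    """
    deadline = time.monotonic() + timeout
    while True:
        found = fetch()
        if found:
            return found
        if time.monotonic() >= deadline:
            raise TimeoutError(f"nothing found within {timeout} seconds")
        time.sleep(poll)


# Simulated lookup: empty on the first two calls, one link afterwards.
calls = {"count": 0}


def fake_fetch() -> List[str]:
    calls["count"] += 1
    if calls["count"] < 3:
        return []
    return ["https://www.atptour.com/en/scores/archive/doha/451/2021/results"]


links = wait_for_nonempty(fake_fetch, timeout=5.0, poll=0.01)
print(links)
```

With a real driver, the fetch argument could be something like lambda: driver.find_elements(By.XPATH, "//a[normalize-space()='Results']"). In practice WebDriverWait with an expected condition, as in the snippets above, is the better choice; the loop is only meant to show why a single immediate lookup can come back empty.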
    
0
On

Here's the updated code to add into your base code:

from selenium.webdriver.common.by import By
tournaments_2021_url = "https://www.atptour.com/en/scores/results-archive?year=2021"
# `driver` is the webdriver.Chrome instance created in your base code
driver.get(tournaments_2021_url)
tournaments_2021_url_list = driver.find_elements(By.XPATH, "//a[@class='button-border']")
print("\nTournament URLs:\n")
for row in tournaments_2021_url_list:
    print(row.get_attribute("href"))
print("\nNumber of rows:")
print(len(tournaments_2021_url_list))

Here's the output after running everything:

Tournament URLs:

https://www.atptour.com/en/scores/archive/delray-beach/499/2021/results
https://www.atptour.com/en/scores/archive/antalya/9426/2021/results
https://www.atptour.com/en/scores/archive/auckland/301/2021/results
https://www.atptour.com/en/scores/archive/melbourne/8998/2021/results
https://www.atptour.com/en/scores/archive/melbourne/9428/2021/results
https://www.atptour.com/en/scores/archive/pune/891/2021/results
https://www.atptour.com/en/scores/archive/atp-cup/8888/2021/results
https://www.atptour.com/en/scores/archive/australian-open/580/2021/results
https://www.atptour.com/en/scores/archive/new-york/424/2021/results
https://www.atptour.com/en/scores/archive/rio-de-janeiro/6932/2021/results
https://www.atptour.com/en/scores/archive/singapore/9460/2021/results
https://www.atptour.com/en/scores/archive/cordoba/9158/2021/results
https://www.atptour.com/en/scores/archive/montpellier/375/2021/results
https://www.atptour.com/en/scores/archive/rotterdam/407/2021/results
https://www.atptour.com/en/scores/archive/buenos-aires/506/2021/results
https://www.atptour.com/en/scores/archive/doha/451/2021/results
https://www.atptour.com/en/scores/archive/marseille/496/2021/results
https://www.atptour.com/en/scores/archive/santiago/8996/2021/results
https://www.atptour.com/en/scores/archive/dubai/495/2021/results
https://www.atptour.com/en/scores/archive/acapulco/807/2021/results
https://www.atptour.com/en/scores/archive/miami/403/2021/results
https://www.atptour.com/en/scores/archive/marrakech/360/2021/results
https://www.atptour.com/en/scores/archive/cagliari/9481/2021/results
https://www.atptour.com/en/scores/archive/marbella/9462/2021/results
https://www.atptour.com/en/scores/archive/houston/717/2021/results
https://www.atptour.com/en/scores/archive/monte-carlo/410/2021/results
https://www.atptour.com/en/scores/archive/barcelona/425/2021/results
https://www.atptour.com/en/scores/archive/belgrade/5053/2021/results
https://www.atptour.com/en/scores/archive/estoril/7290/2021/results
https://www.atptour.com/en/scores/archive/munich/308/2021/results
https://www.atptour.com/en/scores/archive/madrid/1536/2021/results
https://www.atptour.com/en/scores/archive/rome/416/2021/results
https://www.atptour.com/en/scores/archive/geneva/322/2021/results
https://www.atptour.com/en/scores/archive/lyon/7694/2021/results
https://www.atptour.com/en/scores/archive/parma/9510/2021/results
https://www.atptour.com/en/scores/archive/belgrade/9512/2021/results
https://www.atptour.com/en/scores/archive/roland-garros/520/2021/results
https://www.atptour.com/en/scores/archive/s-hertogenbosch/440/2021/results
https://www.atptour.com/en/scores/archive/stuttgart/321/2021/results
https://www.atptour.com/en/scores/archive/halle/500/2021/results
https://www.atptour.com/en/scores/archive/london/311/2021/results
https://www.atptour.com/en/scores/archive/mallorca/8994/2021/results
https://www.atptour.com/en/scores/archive/eastbourne/741/2021/results
https://www.atptour.com/en/scores/archive/wimbledon/540/2021/results
https://www.atptour.com/en/scores/archive/hamburg/414/2021/results
https://www.atptour.com/en/scores/archive/newport/315/2021/results
https://www.atptour.com/en/scores/archive/bastad/316/2021/results
https://www.atptour.com/en/scores/archive/los-cabos/7480/2021/results
https://www.atptour.com/en/scores/archive/gstaad/314/2021/results
https://www.atptour.com/en/scores/archive/umag/439/2021/results
https://www.atptour.com/en/scores/archive/tokyo/96/2021/results
https://www.atptour.com/en/scores/archive/atlanta/6116/2021/results
https://www.atptour.com/en/scores/archive/kitzbuhel/319/2021/results
https://www.atptour.com/en/scores/archive/washington/418/2021/results
https://www.atptour.com/en/scores/archive/toronto/421/2021/results
https://www.atptour.com/en/scores/archive/cincinnati/422/2021/results
https://www.atptour.com/en/scores/archive/winston-salem/6242/2021/results
https://www.atptour.com/en/scores/archive/us-open/560/2021/results
https://www.atptour.com/en/scores/archive/nur-sultan/9410/2021/results
https://www.atptour.com/en/scores/archive/metz/341/2021/results
https://www.atptour.com/en/scores/archive/laver-cup/9210/2021/results
https://www.atptour.com/en/scores/archive/san-diego/9569/2021/results
https://www.atptour.com/en/scores/archive/sofia/7434/2021/results
https://www.atptour.com/en/scores/archive/chengdu/7581/2021/results
https://www.atptour.com/en/scores/archive/zhuhai/9164/2021/results
https://www.atptour.com/en/scores/archive/shanghai/5014/2021/results
https://www.atptour.com/en/scores/archive/beijing/747/2021/results
https://www.atptour.com/en/scores/archive/tokyo/329/2021/results
https://www.atptour.com/en/scores/archive/indian-wells/404/2021/results
https://www.atptour.com/en/scores/archive/moscow/438/2021/results
https://www.atptour.com/en/scores/archive/antwerp/7485/2021/results
https://www.atptour.com/en/scores/archive/vienna/337/2021/results
https://www.atptour.com/en/scores/archive/st-petersburg/568/2021/results
https://www.atptour.com/en/scores/archive/basel/328/2021/results
https://www.atptour.com/en/scores/archive/paris/352/2021/results
https://www.atptour.com/en/scores/archive/stockholm/429/2021/results
https://www.atptour.com/en/scores/archive/intesa-sanpaolo-next-gen-atp-finals/7696/2021/results
https://www.atptour.com/en/scores/archive/nitto-atp-finals/605/2021/results

Number of rows:
78

This also has the added benefit of avoiding deprecation warnings. driver.find_elements_by_xpath is deprecated in Selenium 4 (and was removed entirely in Selenium 4.3.0), so calling it displays a warning message when running pytest. The newer driver.find_elements(By.XPATH, xpath) avoids that, at the cost of one extra import line: from selenium.webdriver.common.by import By.
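
If you want to check whether a piece of code still triggers such warnings outside of pytest, Python's standard warnings module can capture them. The sketch below uses a hypothetical stand-in function, legacy_find, in place of a real deprecated Selenium call, so it runs without a browser; collect_deprecations is likewise a made-up helper name.

```python
import warnings


def legacy_find(xpath):
    """Hypothetical stand-in for a deprecated helper such as find_elements_by_xpath."""
    warnings.warn(
        "legacy_find is deprecated; use find_elements(By.XPATH, ...) instead",
        DeprecationWarning,
        stacklevel=2,
    )
    return []


def collect_deprecations(func, *args):
    """Run func(*args) and return the messages of any DeprecationWarning it emitted."""
    with warnings.catch_warnings(record=True) as caught:
        # Make sure no warning is suppressed by an existing filter.
        warnings.simplefilter("always")
        func(*args)
    return [str(w.message) for w in caught
            if issubclass(w.category, DeprecationWarning)]


messages = collect_deprecations(legacy_find, "//a[@class='button-border']")
print(messages)
```

An empty list would mean the call raised no deprecation warnings.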