I can't click the XPath address after 2 iterations


I am using Python with the Selenium library for web automation, specifically to extract data from Amazon. My current code successfully navigates to the search bar, submits the query, and lists all the products available on the results page. However, I am running into an inconsistency with a loop that is supposed to open each product on the page via its XPath: it sometimes executes as intended but occasionally fails to do so. Any guidance on troubleshooting and resolving this issue would be greatly appreciated.

from selenium import webdriver
from time import sleep
from selenium.webdriver.common.by import By
import time
import datetime
import requests
from bs4 import BeautifulSoup
import pandas as pd

path='C://chromedriver.exe'
catalog=["mouse"]


columns_for_amazon=[catalog[0],"Brand Name", "Product Name","Price","Star","NumberOfReviews","Comments","Date_Collected","Time_Collected"]
#exist_pf=pd.read_csv("C:\\Users\\USER\\Desktop\\software engineering\\proje klasörü\\main.csv")
new_df=pd.DataFrame(columns=columns_for_amazon)

options = webdriver.ChromeOptions()
options.add_experimental_option("useAutomationExtension", False)
options.add_experimental_option("excludeSwitches",["enable-automation"])

browser=webdriver.Chrome(options=options)
browser.maximize_window()
browser.get("https://www.amazon.com/")

#this is for Amazon's security check
"""
sleep(5)
enter_button=browser.find_element(By.XPATH,'/html/body/div/div[1]/div[3]/div/div/form/div[2]/div/span/span/button')
enter_button.click()
"""

#get input elements
input_search=browser.find_element(By.XPATH,'//*[@id="twotabsearchtextbox"]')
search_button=browser.find_element(By.XPATH,'//*[@id="nav-search-submit-button"]')

#send input to webpage
input_search.send_keys(catalog[0])
search_button.click()


for i in range(2,10):
    #*****************************************************************
    #this is the problem I am encountering
    try:
        log=browser.find_element(By.XPATH,f'//*[@id="search"]/div[1]/div[1]/div/span[1]/div[1]/div[{i}]/div/div/div/div/span/div/div/div/div[2]/div/div/div[1]/h2/a/span')
        print(i)
    except:
        print(i,"numbered iteration is NOT finished")
        continue 
    sleep(4)
    #*****************************************************************    
    log.click()
    #scrape all valuable information from the product page
    page=requests.get(browser.current_url)
    soup1=BeautifulSoup(page.content,"html.parser") #takes all the HTML of the page
    soup2=BeautifulSoup(soup1.prettify(),"html.parser")
    try:
        title = soup2.find(id='bylineInfo').get_text()
    except:
        continue 
    title=title.strip()

    product=soup2.find(id='productTitle').get_text()
    product=product.strip()


    star=soup2.find(class_='a-popover-trigger a-declarative').get_text()
    star=star.strip()[0:3]

    price=soup2.find(class_='a-offscreen').get_text()
    price=price.strip()

    no_reviews=soup2.find(id='acrCustomerReviewText').get_text()
    no_reviews=no_reviews.strip()

    comments=soup2.find(class_='a-section review-views celwidget').get_text()
    comments_list=list()
    for i in comments.split("\n"):
        comments_list.append(i.strip())
        if comments_list[-1]=="":
            comments_list.pop()

    dtime= datetime.datetime.now()
    log_date= dtime.strftime("%x")
    log_time=dtime.strftime("%X")
    log_list=[catalog[0],title,product,price,star,no_reviews,comments_list,log_date,log_time]

    length=len(new_df)
    new_df.loc[length]=log_list

    print(i,"numbered iteration is finished")
    browser.back()


new_df.to_csv('C:\\Users\\USER\\Desktop\\software engineering\\proje klasörü\\main_mouse_otomatize.csv')

This is my output

[.... ERROR:cert_issuer_source_aia.cc(35)] Error parsing cert retrieved from AIA (as DER):
ERROR: Couldn't read tbsCertificate as SEQUENCE
ERROR: Failed parsing Certificate

 numbered iteration is finished
3
 numbered iteration is finished
4 numbered iteration is NOT finished
5 numbered iteration is NOT finished
6 numbered iteration is NOT finished
7 numbered iteration is NOT finished
8 numbered iteration is NOT finished
9 numbered iteration is NOT finished
1

There is 1 answer below

ODMT (BEST ANSWER)

It sounds like you're encountering an issue with clicking an XPath address in your code after two iterations. This could be due to various reasons, such as the XPath address changing dynamically, an issue with the loop or iteration logic, or the element becoming unavailable or unclickable after the first two iterations.

To troubleshoot this issue, you can try the following steps:

Inspect the XPath: Make sure that the XPath address you're using is correctly targeting the element you want to click. Sometimes, dynamic changes in the webpage structure can cause XPath addresses to become invalid after a few iterations.

Check Element Availability: Verify that the element you're trying to click is available and clickable after each iteration. Use appropriate waits or conditions to ensure that the element is fully loaded and ready to be interacted with.

Debugging Loop Logic: Review your loop or iteration logic to ensure that it's correctly iterating over the desired elements and not skipping or repeating any.

Dynamic Changes: If the webpage content is dynamic, consider using more robust methods for locating elements, such as CSS selectors or alternative attributes, instead of XPath.
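For example, here is a minimal sketch that collects the result links with a single attribute-based CSS selector instead of an index-based XPath. It reuses the browser object from your script, and the data-component-type="s-search-result" attribute is an assumption about Amazon's current result markup, so verify it against the page source:

from selenium.webdriver.common.by import By

# One selector that matches every result tile on the search page.
# data-component-type="s-search-result" is assumed from Amazon's current
# markup; adjust it if your page structure differs.
result_links = browser.find_elements(
    By.CSS_SELECTOR, 'div[data-component-type="s-search-result"] h2 a'
)
product_urls = [link.get_attribute("href") for link in result_links]
print(len(product_urls), "results found")

Collecting the URLs first and visiting them with browser.get() afterwards also avoids the stale-element problems that come from clicking a result and navigating back.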

Inspect Console Errors: Check the browser console for any errors or warnings that might provide insights into why the element is not clickable after two iterations.
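If you want to read those console messages from Selenium itself, here is a sketch assuming Chrome and the goog:loggingPrefs capability:

from selenium import webdriver

options = webdriver.ChromeOptions()
# Ask Chrome to record console messages so Selenium can retrieve them.
options.set_capability("goog:loggingPrefs", {"browser": "ALL"})
browser = webdriver.Chrome(options=options)
browser.get("https://www.amazon.com/")

# Dump any console errors/warnings collected so far.
for entry in browser.get_log("browser"):
    print(entry["level"], entry["message"])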

Use Explicit Waits: Implement explicit waits to ensure that the element is clickable before attempting to click it, especially if the page content is dynamically loaded.
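For instance, here is a minimal sketch of the question's loop rewritten with an explicit wait; it reuses the browser object from your script and keeps the same (fragile) XPath only to show where the wait goes:

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

wait = WebDriverWait(browser, 10)  # wait up to 10 seconds per lookup

for i in range(2, 10):
    # Wait until the i-th result link is actually clickable instead of
    # relying on a fixed sleep(4).
    xpath = (f'//*[@id="search"]/div[1]/div[1]/div/span[1]/div[1]/div[{i}]'
             '/div/div/div/div/span/div/div/div/div[2]/div/div/div[1]/h2/a/span')
    try:
        log = wait.until(EC.element_to_be_clickable((By.XPATH, xpath)))
    except Exception:
        print(i, "numbered iteration is NOT finished")
        continue
    log.click()
    # ... scrape the product page, then browser.back() as in the original loop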

By carefully reviewing these aspects of your code, you should be able to identify and resolve the issue with clicking the XPath address after two iterations.