Selenium returns an empty list crawling one specific domain


I'm trying to automate keyword-rank tracking using Selenium and Python. I use an online tool called "I search from" to get a localized Google results page and then scrape it to find the current position of my domain. The code below works fine for other domains and returns a list of elements, but for this one it returns an empty list. I don't understand why, perhaps I'm not targeting an element correctly, but I can't find the solution myself. I've tried targeting the elements by id, XPath, and class name, and nothing works. I'd be really grateful for any tip that pushes me toward a solution.

Best regards!

    from selenium import webdriver

    driver = webdriver.Firefox(executable_path=r"C:\Drivers\geckodriver.exe")

    keywords = ['random keyword']

    # Pre-set the Google consent cookie so the consent banner does not block the results page
    driver.get("https://www.google.com")
    newGoogleCookie = {'name': 'CONSENT', 'value': 'YES+PL.pl+V12+B', 'path': '/', 'domain': '.google.com', 'secure': False, 'httpOnly': False, 'expiry': 2145916800, 'sameSite': 'None'}
    driver.add_cookie(newGoogleCookie)

    driver.get('https://www.isearchfrom.com')

    # Fill in the search settings on isearchfrom.com
    CountryInput = driver.find_element_by_id('countrytags')
    LanguageInput = driver.find_element_by_id('languagetags')
    deviceDropdown = driver.find_element_by_id('deviceselect')

    CountryInput.send_keys("Poland")
    LanguageInput.send_keys("Polish")
    deviceDropdown.send_keys('Android phone')

    searchedKeywordInput = driver.find_element_by_id('searchinput')
    for keyword in keywords:
        searchedKeywordInput.send_keys(keyword)
        SearchButton = driver.find_element_by_id('searchbutton').click()
        results = driver.find_elements_by_xpath('//*[@id="dimg_1"]')
        print(results)  # prints an empty list
There are 2 answers below.

Best answer:

The locator you are using, //*[@id="dimg_1"], matches images in the search results. Not every results page contains images: when you search for "random keyword" the results have no images, so the results list has length 0.

You need a more generic locator, one that works even when the search results contain no images. Perhaps you could use something like //div[@class="kCrYT"]/a/h3/div, which gets you all WebElements for the search-result headers.
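Once you have those result elements, computing your domain's position is plain Python. A minimal sketch, assuming you collect the result links' href attributes into a list first (the helper name and the sample URLs are illustrative, not from the original post):

```python
from urllib.parse import urlparse

def domain_rank(result_urls, domain):
    """Return the 1-based position of the first result whose host
    matches `domain` (or a subdomain of it), or None if absent."""
    for position, url in enumerate(result_urls, start=1):
        host = urlparse(url).netloc.lower()
        if host == domain or host.endswith('.' + domain):
            return position
    return None

# With Selenium, the list would come from something like:
#   [a.get_attribute('href') for a in
#    driver.find_elements_by_xpath('//div[@class="kCrYT"]/a')]
urls = [
    'https://www.wikipedia.org/wiki/Example',
    'https://example.com/page',
    'https://blog.example.com/post',
]
print(domain_rank(urls, 'example.com'))  # -> 2
```

Keeping the rank logic separate from the scraping makes it easy to test without a browser.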

Update: the issue in that case is that you are not switching to the new window/tab that opens. Please try the following code instead:

    driver.get('http://www.isearchfrom.com')

    CountryInput = driver.find_element_by_id('countrytags')
    LanguageInput = driver.find_element_by_id('languagetags')
    deviceDropdown = driver.find_element_by_id('deviceselect')

    CountryInput.send_keys("Poland")
    LanguageInput.send_keys("Polish")
    deviceDropdown.send_keys('Android phone')

    searchedKeywordInput = driver.find_element_by_id('searchinput')
    parentGUID = driver.window_handles[0]  # remember the isearchfrom.com tab
    for keyword in keywords:
        searchedKeywordInput.send_keys(keyword)
        SearchButton = driver.find_element_by_id('searchbutton').click()
        window_after = driver.window_handles[1]  # the freshly opened results tab
        driver.switch_to.window(window_after)    # switch_to_window() is deprecated
        results = driver.find_elements_by_xpath('//div[@class="kCrYT"]/a/h3/div')
        print(len(results))
        print(results)

Second answer:

So apparently you were right: I was scraping the first opened tab the whole time. I fixed it by adding this line to the loop that runs the scraping for each keyword: driver.switch_to.window(driver.window_handles[-1])
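The bookkeeping behind that one-liner can be sketched without a browser: window_handles is just an ordered list, and the most recently opened tab is its last entry. The helper below is an illustration of that indexing, not Selenium API:

```python
def newest_handle(window_handles):
    """Mirror driver.switch_to.window(driver.window_handles[-1]):
    the most recently opened tab is the last handle in the list."""
    if not window_handles:
        raise ValueError("no open windows")
    return window_handles[-1]

# One tab open, then a search opens a results tab:
handles = ['parent-tab']
print(newest_handle(handles))  # -> 'parent-tab'
handles.append('results-tab')
print(newest_handle(handles))  # -> 'results-tab', the one to scrape
```

Indexing with [-1] instead of a hard-coded [1] also keeps working if extra tabs accumulate across loop iterations.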

Thanks a lot

*This comment has been edited; I previously stated that this was not the case, but it was.