A few days ago, my code for this website worked perfectly:
url_TKLLPG_1 = 'https://truyenhdx.com/truyen/than-kiem-luu-lac-phap-gioi/chap/10323547-chuong-19/'
I used this to refresh the page automatically at 2-second intervals, then find the Next button and click it:
import time

from selenium.common.exceptions import WebDriverException
from selenium.webdriver.common.by import By

NEXT_BUTTON_XPATH = '//*[@id="html"]/body/div[5]/div/div/div[8]/div[3]/a'


def open_and_refresh_websiteVL(driver, url, interval_seconds, number_of_chapter=0):
    try:
        driver.get(url)
        time.sleep(8)  # give the page time to load
    except ConnectionRefusedError:
        print(f'ConnectionRefusedError caught at {number_of_chapter}, url: {url}')
        driver.quit()
        # Retry the same chapter with a fresh driver
        new_driver = get_chrome_driver()
        open_and_refresh_websiteVL(new_driver, url, interval_seconds, number_of_chapter)
        return
    except WebDriverException:
        driver.quit()
        return

    # Refresh the page four times per pass, three passes in total
    for _ in range(3):
        for _ in range(4):
            time.sleep(interval_seconds)
            driver.refresh()
        time.sleep(interval_seconds)

    # Stop once enough chapters have been visited
    if number_of_chapter > 80:
        driver.quit()
        return

    # Click the Next button if it exists, then continue with the next chapter
    next_buttons = driver.find_elements(By.XPATH, NEXT_BUTTON_XPATH)
    if next_buttons:
        next_buttons[0].click()
        nexturl = driver.current_url
        open_and_refresh_websiteVL(driver, nexturl, interval_seconds, number_of_chapter + 1)
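The function above calls itself once per chapter, so the same flow can also be written as a plain loop. Below is a rough sketch of that idea, not the code I actually ran. The name `walk_chapters` and the parameters `max_chapters`, `refreshes`, and `sleep` are made up for illustration, and the locator is passed as the string `"xpath"` (the value of Selenium's `By.XPATH`) so the snippet has no imports beyond the standard library. Unlike my function, it does not call `driver.get()` again after clicking Next.

```python
import time

XPATH = "xpath"  # same string value as selenium's By.XPATH
NEXT_BUTTON = '//*[@id="html"]/body/div[5]/div/div/div[8]/div[3]/a'


def walk_chapters(driver, url, interval_seconds,
                  max_chapters=80, refreshes=12, sleep=time.sleep):
    """Open url, refresh it, click Next, and repeat.

    Returns the number of chapters visited. All parameter names here are
    hypothetical; `sleep` is injectable only so the loop can be tested quickly.
    """
    driver.get(url)
    visited = 0
    while visited < max_chapters:
        visited += 1
        # Refresh the current chapter a fixed number of times
        for _ in range(refreshes):
            sleep(interval_seconds)
            driver.refresh()
        buttons = driver.find_elements(XPATH, NEXT_BUTTON)
        if not buttons:
            break  # no Next button: last chapter, or the page did not load
        buttons[0].click()
    return visited
```

With a real driver this would be called as `walk_chapters(get_chrome_driver(), url_TKLLPG_1, 2)`.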
And here are my Chrome options (based on several solutions I found online):
from fake_useragent import UserAgent  # assuming the fake-useragent package
from selenium import webdriver
from selenium.webdriver.chrome.service import Service as ChromeService
from webdriver_manager.chrome import ChromeDriverManager


def get_chrome_driver():
    # Pick a random user agent string for this session
    ua = UserAgent()
    user_agent = ua.random
    print(user_agent)
    chrome_options = webdriver.ChromeOptions()
    # Disable the AutomationControlled Blink feature
    chrome_options.add_argument("--disable-blink-features=AutomationControlled")
    # Exclude the enable-automation switch
    chrome_options.add_experimental_option("excludeSwitches", ["enable-automation"])
    # Turn off useAutomationExtension
    chrome_options.add_experimental_option("useAutomationExtension", False)
    chrome_options.add_argument(f'user-agent={user_agent}')
    driver = webdriver.Chrome(options=chrome_options, service=ChromeService(ChromeDriverManager().install()))
    return driver
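To check whether these options have any visible effect, the values a page script can read could be pulled back through `execute_script`. This is only a small sketch: the helper name `automation_fingerprint` is made up, and it covers just two of the many signals a site might inspect.

```python
def automation_fingerprint(driver):
    """Return two values that a page script could read to spot automation.

    `driver` is any Selenium WebDriver; only execute_script() is used.
    """
    return {
        "navigator.webdriver": driver.execute_script("return navigator.webdriver"),
        "userAgent": driver.execute_script("return navigator.userAgent"),
    }
```

For example, `print(automation_fingerprint(get_chrome_driver()))` after opening any page. If `navigator.webdriver` still comes back as `True`, the options are probably not hiding as much as I assumed.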
This is how I called the function:
refresh_interval = 2
driver = get_chrome_driver()
open_and_refresh_websiteVL(driver, url_TKLLPG_1, refresh_interval)
But now, even a simple driver.get(url) no longer works. The page loads for about one second and then automatically goes back to the data:, URL. No matter how many times I try to open the URL above, the browser keeps returning to data:,. The strange thing is that this happens only on that specific website: my code for other websites still works as intended, and I can still open the site normally in my own browser, just not through Selenium. Am I doing something wrong, or has the website detected that I am using Selenium?
I tried random user agents and added the relevant chrome_options. I even tried a plain driver.get(url) without any Chrome options, but the browser still returns to data:, after driver.get(url).
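To make the behaviour easier to reproduce, here is a small sketch that samples `driver.current_url` for a few seconds after `get()` and records each change. The helper name `trace_navigation` and its parameters are made up for illustration, and it assumes nothing about the driver beyond `get()` and `current_url`.

```python
import time


def trace_navigation(driver, url, seconds=5.0, step=0.5, sleep=time.sleep):
    """Load url, then sample driver.current_url for a while.

    Returns the distinct URLs in the order they were observed, so a
    bounce back to 'data:,' shows up as the last entry.
    """
    seen = []

    def record():
        current = driver.current_url
        # Only store a URL when it differs from the previous sample
        if not seen or seen[-1] != current:
            seen.append(current)

    record()  # URL before navigation
    driver.get(url)
    for _ in range(max(1, round(seconds / step))):
        record()
        sleep(step)
    return seen
```

Calling `print(trace_navigation(get_chrome_driver(), url_TKLLPG_1))` should show whether the browser reaches the chapter URL at all before it ends up on data:, again.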