How to make a remote webdriver run longer than 15 minutes with Selenium in Python?

My script currently runs correctly and scrapes all the data I need. My goal is to have it run for hours, scraping a single page and using the webdriver to refresh it every minute. However, this only works for the first 15 minutes.

I run this on a remote AWS EC2 instance with:

java -jar selenium-server-standalone-3.141.59.jar -port 4444 -sessionTimeout 57868143 &
python3 /home/ec2-user/scraper/football_live.py;

to start the selenium server (which runs longer than 15 minutes) and then the script.

Inside my script I have:

data, n_games = football_data(driver)
insert_data(cur, conn, data)
time.sleep(60)
driver.refresh()

inside a while loop that will run for a long period of time.
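For reference, the loop described above can be sketched roughly as follows. This is only an illustration: `poll_page`, `scrape`, `store`, and `max_iterations` are hypothetical names that do not appear in my script, and the elapsed-time message is added purely to show when a failure occurs relative to the 15-minute mark.

```python
import time


def poll_page(driver, scrape, store, interval=60, max_iterations=None,
              sleep=time.sleep):
    """Scrape, store, wait, refresh; repeat until max_iterations (None = forever).

    `scrape` and `store` are hypothetical callables standing in for
    football_data(driver) and insert_data(cur, conn, data).
    """
    started = time.monotonic()
    done = 0
    while max_iterations is None or done < max_iterations:
        try:
            store(scrape(driver))
            sleep(interval)
            driver.refresh()
        except Exception as exc:
            # Report how long the loop ran before failing, then re-raise.
            elapsed = time.monotonic() - started
            print(f"iteration {done} failed after {elapsed:.0f}s: {exc!r}")
            raise
        done += 1
    return done
```

With the names from my script this would be called along the lines of `poll_page(driver, lambda d: football_data(d)[0], lambda data: insert_data(cur, conn, data))`.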

This is my webdriver code:

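import time
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities

chrome_options = Options()
chrome_options.add_argument('--headless')
chrome_options.add_argument('--no-sandbox')
time.sleep(5)  # give the Selenium server time to start before connecting
driver = webdriver.Remote("http://localhost:4444/wd/hub", options=chrome_options, desired_capabilities=DesiredCapabilities.CHROME)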
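As a side note, the fixed `time.sleep(5)` before `webdriver.Remote(...)` could be replaced by polling the server until it answers. The sketch below is an assumption-laden illustration, not something from my script: `wait_for_server` is a hypothetical helper, and it assumes the standalone server responds on a status URL such as `http://localhost:4444/wd/hub/status`. It treats any successful HTTP response as "up" and does not inspect the response body.

```python
import time
import urllib.error
import urllib.request


def wait_for_server(status_url, timeout=30.0, poll=0.5,
                    fetch=urllib.request.urlopen, sleep=time.sleep):
    """Return True once status_url answers, False if timeout elapses first."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            fetch(status_url).close()  # any successful response counts as "up"
            return True
        except (urllib.error.URLError, ConnectionError):
            if time.monotonic() >= deadline:
                return False
            sleep(poll)  # server not reachable yet; try again shortly
```

This would be called once, before creating the driver, for example `wait_for_server("http://localhost:4444/wd/hub/status")`.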

Here is the only thing I have found that is close to what I am trying to do but it is not all that helpful.

As a last resort, I am also considering running the script in 15-minute loops if there is no way to extend the duration of the webdriver session through Selenium.
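The last-resort idea would look something like the sketch below: replace the driver with a fresh session before the 15-minute mark instead of keeping one session alive. All names here (`run_in_sessions`, `make_driver`, `step`) are hypothetical, the 14-minute default is an arbitrary margin, and this is a workaround that does not explain why the session stops in the first place.

```python
import time


def run_in_sessions(make_driver, step, total_seconds, session_seconds=14 * 60,
                    interval=60, clock=time.monotonic, sleep=time.sleep):
    """Run step(driver) repeatedly, replacing the driver every session_seconds.

    `make_driver` is a hypothetical factory that creates a new remote
    session and loads the target page; `step` scrapes and stores one sample.
    Returns the number of sessions that were opened.
    """
    sessions = 0
    end = clock() + total_seconds
    while clock() < end:
        driver = make_driver()
        sessions += 1
        session_end = min(clock() + session_seconds, end)
        try:
            while clock() < session_end:
                step(driver)
                sleep(interval)
                driver.refresh()
        finally:
            driver.quit()  # always release the remote session
    return sessions
```

Here `make_driver` would wrap the `webdriver.Remote(...)` call from above plus a `driver.get(...)` for the page being scraped, since each new session starts on a blank page.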
