Python's threading.enumerate function returns unexpected results


I am using multithreading to run multiple Selenium drivers at the same time, with one thread per driver. With 70 driver instances (ThreadPoolExecutor(max_workers=70)), I expect len(threading.enumerate()) to return 70 + 1 (including MainThread). Is that right? After the process has been running for some time, len(threading.enumerate()) returns a larger number than expected (80 or 90 threads). I am using seleniumbase / undetected-chromedriver. Could these libraries be creating extra threads?

import concurrent.futures
import threading
import time
from seleniumbase import Driver


# Pseudocode
def run_driver(driver, url):
    try:
        driver.get(url=url)
        print(url, len(threading.enumerate()))  # Wrong result
        time.sleep(10)
    except Exception as exc:
        driver.close()
        driver.quit()


urls = ['https://google.com' for i in range(70)]
with concurrent.futures.ThreadPoolExecutor(max_workers=len(urls)) as executor:
    for url in urls:
        driver = Driver(uc=True, headed=True)
        executor.submit(run_driver, driver, url)

There is 1 best solution below

Michael Mintz On

Put driver.quit() in a finally block; otherwise your script only quits the driver when an exception is raised. I updated the code accordingly and added sys.argv.append("-n") so that thread-locking is handled correctly in SeleniumBase UC Mode when patching the driver, etc.

import concurrent.futures
import threading
import time
from seleniumbase import Driver
import sys
sys.argv.append("-n")  # Tell SeleniumBase to do thread-locking as needed


def run_driver(url):
    driver = Driver(uc=True)
    try:
        driver.get(url=url)
        print(url, len(threading.enumerate()))
        time.sleep(3)
    finally:
        driver.quit()


urls = ['https://seleniumbase.io/demo_page' for i in range(8)]
with concurrent.futures.ThreadPoolExecutor(max_workers=len(urls)) as executor:
    for url in urls:
        executor.submit(run_driver, url)
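
As for the count itself: threading.enumerate() returns every Thread object that is currently alive, not only the pool's workers, so any thread started inside a worker is counted as well. Here is a minimal sketch, using only the standard library, of how to group live threads by name to see where the extra ones come from. The "helper" thread is a hypothetical stand-in for a thread that a library might start internally; whether seleniumbase or undetected-chromedriver actually does so is an assumption to check against your own output. The "driver" prefix and the worker and snapshot functions are names made up for this illustration.

```python
import collections
import concurrent.futures
import threading

WORKERS = 3
started = threading.Barrier(WORKERS + 1)  # all workers plus the main thread
release = threading.Event()


def worker():
    # Hypothetical helper thread, standing in for one that a library
    # might start internally while the worker is running.
    helper = threading.Thread(target=release.wait, name="helper", daemon=True)
    helper.start()
    started.wait()   # tell the main thread that worker and helper are alive
    release.wait()   # stay alive until the snapshot has been taken
    helper.join()


def snapshot():
    """Count live threads, grouped by the part of the name before '_'."""
    counts = collections.Counter()
    for thread in threading.enumerate():
        counts[thread.name.split("_")[0]] += 1
    return dict(counts)


with concurrent.futures.ThreadPoolExecutor(
    max_workers=WORKERS, thread_name_prefix="driver"
) as executor:
    for _ in range(WORKERS):
        executor.submit(worker)
    started.wait()     # wait until every worker has started its helper
    live = snapshot()  # taken while workers and helpers are still running
    release.set()      # let everything finish

print(live)
```

Passing thread_name_prefix to ThreadPoolExecutor makes the pool's own threads easy to tell apart from the rest. In the real script, printing [t.name for t in threading.enumerate()] instead of only the length should show whether the extra threads belong to the pool or were started elsewhere.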