python seleniumbase threading not working

from concurrent.futures import ThreadPoolExecutor

from seleniumbase import SB

def process_request(proxy, email, available_proxies):
    with SB(uc=True, headless=False, proxy=proxy) as sb:
        sb.open("https://google.com/")

        try:
            is_loaded = sb.assert_element('div[class="css-y8g9ca"]', timeout=4)
        except Exception as e:
            print(e)
            print(f"Bad proxy: {proxy}")

            # Remove the bad proxy from the list
            sb.quit()
            available_proxies.remove(proxy)

            if available_proxies:
                # Fetch a new proxy and retry
                new_proxy = get_new_proxy(available_proxies)
                print(f"Retrying with a new proxy: {new_proxy}")
                process_request(new_proxy, email, available_proxies)
            else:
                print("No more proxies available.")
            return

        sb.type('input[id="username"]', email)
        sb.click('button:contains("Next")')

        try:
            valid = sb.assert_element('div[class="css-i76c7y"]', timeout=5)
            
            if valid:
                print(f"Valid email {email} with proxy {proxy}")
        except Exception as e:
            print(e)
            print(f"Invalid email {email}")
            sb.quit()
        sb.quit()

def get_new_proxy(available_proxies):
    # Implement logic to fetch a new proxy from the list of available proxies
    # For simplicity, let's assume you have a list of proxies stored in a file
    if available_proxies:
        return available_proxies[0]
    else:
        return None

if __name__ == "__main__":
    # Read-only access is enough here; the with-blocks also close the files
    with open("proxies.txt") as f:
        proxies = [line.rstrip() for line in f]
    with open("emails.txt") as f:
        emails = [line.rstrip() for line in f]

    # Create a copy of the proxies list to track available proxies
    available_proxies = list(proxies)

    # Adjust the number of threads based on your requirements
    num_threads = 3

    with ThreadPoolExecutor(max_workers=num_threads) as executor:
        # Use zip to iterate over both proxies and emails in parallel
        futures = [executor.submit(process_request, proxy, email, available_proxies) for proxy, email in zip(proxies, emails)]

        # Wait for all threads to complete
        for future in futures:
            future.result()

The problem is that threading works with uc=False but not with uc=True. I've tried several approaches: I tried using multiple user profiles, and I tried multithreading with the SeleniumBase driver.

The problem with that solution is that it doesn't allow arguments and it only runs one process. I have also tried multiprocessing, but that doesn't work either.

The browsers do start, but the page never loads, I think because the sessions interfere with each other.


There is 1 best solution below


SeleniumBase handles proxies with auth using the solution from https://stackoverflow.com/a/35293284/7058266: it creates a zip file containing the proxy credentials, and then loads that zip file into Chrome as an extension. The default setting assumes that only a single proxy is used, so if the zip file already exists it gets overwritten, which saves space. If you need multiple simultaneous proxies, there's an arg that you need to set: multi_proxy=True, which creates a uniquely named zip file for each test that uses a proxy.

Here's a sample script that uses multi_proxy=True:

from parameterized import parameterized
from seleniumbase import BaseCase
BaseCase.main(__name__, __file__, "-n3")

class ProxyTests(BaseCase):
    @parameterized.expand(
        [
            ["user1:pass1@host1:port1"],
            ["user2:pass2@host2:port2"],
            ["user3:pass3@host3:port3"],
        ]
    )
    def test_multiple_proxies(self, proxy_string):
        self.get_new_driver(
            undetectable=True, proxy=proxy_string, multi_proxy=True
        )
        self.driver.get("https://browserleaks.com/webrtc")
        self.sleep(30)

Also note that pytest multithreading (via pytest-xdist) is the only officially supported way of doing multithreading with SeleniumBase, as the built-in thread-locks are specially designed for multiple threads via pytest. Theoretically, you could do multithreading without using pytest-xdist if you made sure that you only launched one Chrome browser at a time (but once launched, you could have several instances running at once).