Loading ChromeDriver Extensions or Proxy ChromeDriver and Elixir


I am writing a web scraper that I am trying to proxy, but can't quite figure out how to do it in Elixir.

I am using Hound running on top of a headless ChromeDriver. I purchased some proxy IPs through https://luminati.io, and they offer both a Chrome extension and a username/password-based proxy server.

The scraper is built around a GenServer that represents a user scraping the web. The app has no front end; it accepts commands sent to it through a bot I built on Telegram, so when a user sends the login command, for instance, it triggers the login function of the GenServer.

At that point the GenServer will change the ChromeDriver session using Hound.change_session_to/2 and then log the user in.

This works great, but now I want to send every request through the proxy server, authenticating with a username and password. Hound allows chromeOptions to be set when changing the session:

ua = "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/37.0.2062.120 Safari/537.36"
change_session_to(String.to_atom(account.username), %{browserName: "chrome", chromeOptions: %{"args" => ["--user-agent=#{ua}", "--proxy-server=http://user:[email protected]:22225"]}})
navigate_to "https://www.website.com/"

Another thing I have tried is loading Luminati's Chrome extension to proxy the traffic, but I can't get the extension to load for each session. I downloaded the packed CRX Chrome extension and placed it in my priv folder. When the session starts, the user agent is applied just fine, but the extension never loads. I am not running headless when I try to load the extension.

ua = "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/37.0.2062.120 Safari/537.36"
priv_dir = :code.priv_dir(:boost_buddy)
change_session_to(String.to_atom(account.username), %{
  browserName: "chrome",
  chromeOptions: %{
    "extensions" => ['#{priv_dir}/luminati/3.2_1'],
    "args" => ["--user-agent=#{ua}", "--proxy-server=http://user:[email protected]:22225"]
  }
})
navigate_to "https://www.website.com/"
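
A note on the "extensions" key in the snippet above: ChromeDriver's capabilities documentation describes each entry as a base64-encoded packed extension (.crx), not a filesystem path, while an unpacked extension directory is normally passed with the --load-extension=<dir> argument instead. The following is a minimal sketch of the first option, assuming a packed .crx file exists on disk. The module name ExtensionCaps and the file name in the usage comment are hypothetical and not part of Hound.

```elixir
defmodule ExtensionCaps do
  # Hypothetical helper (not part of Hound): builds a chromeOptions map
  # whose "extensions" entry carries the packed .crx as base64 text.

  # Reads a packed extension from disk and base64-encodes its bytes.
  def encode_crx(path) do
    path
    |> File.read!()
    |> Base.encode64()
  end

  # Returns a chromeOptions map with one encoded extension and the given args.
  def chrome_options(crx_path, args) when is_list(args) do
    %{"extensions" => [encode_crx(crx_path)], "args" => args}
  end
end

# Usage (assumed file name "luminati.crx" inside the app's priv directory):
#
#   crx = Path.join(to_string(:code.priv_dir(:boost_buddy)), "luminati.crx")
#   opts = ExtensionCaps.chrome_options(crx, ["--user-agent=#{ua}"])
```

Note that :code.priv_dir/1 returns a charlist, which is why the usage comment converts it with to_string/1 before joining paths.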

Does anyone have experience using ChromeDriver with Elixir? With Ruby and Java, setting up the extension is typically no problem.


There is 1 solution below


https://github.com/GoogleChrome/puppeteer/issues/659

-1 because this was the top result for googling "chrome headless extension"

Regarding sending each request through the proxy, I think you need to either interface with ChromeDriver yourself (hijacking Hound) or skip Hound and drive Chrome directly or through a Selenium Grid.

I think the issue stems from the fact that Hound will start one single Chrome instance, and the proxy settings are defined when that instance launches. All further requests then go through that proxy.

So, to get different proxy connections for different sessions, you need either a way to set them through navigational steps (visiting a proxy website that then serves as a hard proxy) or different browser instances altogether. I might be wrong, though, and perhaps there's an easier way of proxying the requests.
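
If you go the route of separate browser instances, the capabilities for each one could be built per account along these lines. This is only a sketch: the module name ProxyCaps, the host proxy.example.com, and the account name are placeholders, and the map shape simply mirrors the one used in the question. As far as I know, Chrome also does not honor credentials embedded in the --proxy-server URL, so the sketch leaves them out and assumes authentication is handled some other way (for example, IP whitelisting on the proxy side).

```elixir
defmodule ProxyCaps do
  # Hypothetical helper (not part of Hound): builds one capabilities map
  # per account so that each browser instance gets its own proxy.

  def build(user_agent, proxy_host, proxy_port) do
    %{
      browserName: "chrome",
      chromeOptions: %{
        "args" => [
          "--user-agent=#{user_agent}",
          # No user:password here; see the assumption stated above.
          "--proxy-server=http://#{proxy_host}:#{proxy_port}"
        ]
      }
    }
  end
end

# Usage, mirroring the call from the question (placeholder values;
# check how your Hound version expects capabilities to be passed):
#
#   caps = ProxyCaps.build(ua, "proxy.example.com", 22225)
#   change_session_to(:some_account, caps)
```

Keeping the map construction in a plain function makes it easy to check what each session will be started with before any browser is launched.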