How can I download a file from a site that is protected with a username and password, using the Selenium framework?

I am trying to download a file from a site that is protected with a username and password, using Selenium.

First, I got the href attribute from the download link:

    WebElement downloadLinkElement = htmlElement.findElement(By.xpath(<xpath_value>));
    String url = downloadLinkElement.getAttribute("href");

Second, I got the "AUTHSESSION" cookie from the Selenium web driver:

    org.openqa.selenium.Cookie cookie = webDriver.manage().getCookieNamed("AUTHSESSION");

Then I built a Linux "wget" command line, using the Apache Commons Exec library:

    CommandLine cmdLine = new CommandLine("wget");
    cmdLine.addArgument("--cookies=on");
    cmdLine.addArgument("--header");
    cmdLine.addArgument("Cookie: AUTHSESSION=" + cookie.getValue());
    cmdLine.addArgument("-O");
    cmdLine.addArgument("/home/name/Downloads/file.ftl");
    cmdLine.addArgument(url);
    cmdLine.addArgument("--no-check-certificate");

Finally, I executed the command and captured its output:

    DefaultExecutor executor = new DefaultExecutor();
    ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
    PumpStreamHandler streamHandler = new PumpStreamHandler(byteArrayOutputStream);
    executor.setStreamHandler(streamHandler);
    try {
        executor.execute(cmdLine);
    } catch (ExecuteException ee) {
        ee.printStackTrace();
        System.out.println(byteArrayOutputStream.toString());
    }

After the execution, a file is downloaded to the specified path, but it is not the one I want: it is an HTML file containing the login page of the site I'm trying to download from.

The execution output contains the following:

    WARNING: cannot verify <ip_address>'s certificate, issued by <company_details>
      Self-signed certificate encountered.
    WARNING: no certificate subject alternative name matches
        requested host name ‘<ip_address>’.
    HTTP request sent, awaiting response... 302

Importantly, if I run the following command in a Linux terminal, the file is downloaded successfully:

    wget --cookies on --header "Cookie: AUTHSESSION=<cookie_value>" -O "<download_path>" "<url>" --no-check-certificate

What am I missing?
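
One thing that may be worth ruling out (an assumption, not something confirmed in this thread): the one-argument addArgument in Commons Exec applies its own quoting to arguments that contain spaces, so the "Cookie: ..." header may not reach wget in the same form as it does from the terminal. The sketch below builds the same wget invocation with the JDK's ProcessBuilder, where each list element is handed to the process as exactly one argument. The class and method names (WgetDownload, buildCommand, run) are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.List;

class WgetDownload {

    // Builds the wget argument list. Each element reaches wget as one
    // argument, so the header value needs no surrounding quotes.
    static List<String> buildCommand(String cookieValue, String outputPath, String url) {
        return List.of(
                "wget",
                "--cookies=on",
                "--no-check-certificate",
                "--header", "Cookie: AUTHSESSION=" + cookieValue,
                "-O", outputPath,
                url);
    }

    // Runs the command and returns the exit code plus wget's combined
    // stdout/stderr, so the output is available even when wget exits with 0.
    static String run(List<String> command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command)
                .redirectErrorStream(true)
                .start();
        String output = new String(process.getInputStream().readAllBytes(), StandardCharsets.UTF_8);
        int exitCode = process.waitFor();
        return "exit code " + exitCode + "\n" + output;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical usage: cookie value, output path and URL as program arguments.
        if (args.length == 3) {
            System.out.println(run(buildCommand(args[0], args[1], args[2])));
        }
    }
}
```

With Commons Exec itself, the two-argument form addArgument(argument, false) disables its quoting; whether that changes the outcome here is untested.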

Best answer

So, I changed the approach a bit.

Before initializing the Firefox web driver, I first create a FirefoxOptions object, like so:

    FirefoxOptions firefoxOptions = new FirefoxOptions();
    firefoxOptions.addPreference("browser.helperApps.neverAsk.saveToDisk", "text/plain");

After that, I pass this object to the Firefox driver constructor:

    WebDriver driver = new FirefoxDriver(firefoxOptions);

Now, when you click the download link, the file is saved to disk without the browser asking any questions, as long as its MIME type matches the one listed in the preference (here, text/plain).
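
Because the browser saves the file in the background, the calling code usually has to wait for the download to finish before reading it. Below is a minimal sketch using only the JDK. It assumes Firefox keeps a temporary "*.part" file next to the target while the download is in progress, which is worth verifying for your Firefox version. DownloadWaiter and its method names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import java.time.Instant;

class DownloadWaiter {

    // Polls until `file` exists and no "*.part" file remains in its directory,
    // or until the timeout elapses. Returns true if the file looks complete.
    static boolean waitForFile(Path file, Duration timeout) throws IOException, InterruptedException {
        Path dir = file.toAbsolutePath().getParent();
        Instant deadline = Instant.now().plus(timeout);
        while (true) {
            if (Files.isRegularFile(file) && !hasPartialFiles(dir)) {
                return true;
            }
            if (Instant.now().isAfter(deadline)) {
                return false;
            }
            Thread.sleep(200); // polling interval, chosen arbitrarily
        }
    }

    // True if the directory still contains any in-progress "*.part" file.
    static boolean hasPartialFiles(Path dir) throws IOException {
        try (DirectoryStream<Path> parts = Files.newDirectoryStream(dir, "*.part")) {
            return parts.iterator().hasNext();
        }
    }
}
```

A call such as DownloadWaiter.waitForFile(Path.of("/home/name/Downloads/file.ftl"), Duration.ofSeconds(30)) would go right after the click on the download link, assuming Firefox is saving to that directory.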