I am trying to write code that will let me download a .xls file from a secured https website that requires a login. This is very difficult for me, as I have no experience with web coding; all my R experience comes from econometric work with readily available datasets.
I followed this thread to help write some code, but I think I'm running into trouble because the example uses http, and I need https.
This is my code:
install.packages("RCurl")
library(RCurl)
curl = getCurlHandle()
curlSetOpt(cookiejar = 'cookies.txt', followlocation = TRUE, autoreferer = TRUE, curl = curl)
html <- getURL('https://jump.valueline.com/login.aspx', curl = curl)
viewstate <- as.character(sub('.*id="_VIEWSTATE" value="([0-9a-zA-Z+/=]*).*', '\\1', html))
params <- list(
'ct100$ContentPlaceHolder$LoginControl$txtUserID' = 'MY USERNAME',
'ct100$ContentPlaceHolder$LoginControl$txtUserPw' = 'MY PASSWORD',
'ct100$ContentPlaceHolder$LoginControl$btnLogin' = 'Sign In',
'_VIEWSTATE' = viewstate)
html <- postForm('https://jump.valueline.com/login.aspx', .params = params, curl = curl)
When I run the line that starts with "html <- getURL(...", I get:
> html <- getURL('https://jump.valueline.com/login.aspx', curl = curl)
Error in function (type, msg, asError = TRUE) :
SSL certificate problem: unable to get local issuer certificate
Is there a workaround for this? How can I access the local issuer certificate?
I read that adding '.opts = list(ssl.verifypeer = FALSE)' to the curlSetOpt call would remedy this. When I add that, getURL runs, but then the postForm line gives me:
> html <- postForm('https://jump.valueline.com/login.aspx', .params = params, curl = curl)
Error: Internal Server Error
Besides that, does this code look like it will work for the website I am trying to access? I went into the inspector and changed all the params to match my webpage, but since I'm not well versed in web coding, I'm not 100% sure I caught the right things (particularly the VIEWSTATE). Also, is there a better, more efficient way to approach this?
Automating this process would be huge for me, so your help is greatly appreciated.
This should work for you. The error you're getting is probably because libcurl doesn't know where to find a CA certificate bundle to verify the site's SSL certificate against.
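As a minimal sketch of that idea, assuming the CA bundle that ships with RCurl (CurlSSL/cacert.pem inside the installed package) is recent enough to validate this site's certificate, you can point libcurl at it with the cainfo option instead of switching verification off. The helper name fetch_page is hypothetical and only there for illustration.

```r
library(RCurl)

# Locate the CA bundle bundled with RCurl (assumption: your installed
# version still ships CurlSSL/cacert.pem; system.file() returns "" if not).
ca_bundle <- system.file("CurlSSL", "cacert.pem", package = "RCurl")
if (!nzchar(ca_bundle)) {
  stop("RCurl's bundled cacert.pem was not found; supply a path to a CA bundle.")
}

# Same handle options as in the question, plus cainfo so that libcurl
# can verify the server certificate.
curl <- getCurlHandle()
curlSetOpt(
  cookiejar      = "cookies.txt",
  followlocation = TRUE,
  autoreferer    = TRUE,
  cainfo         = ca_bundle,
  curl           = curl
)

# Hypothetical helper: fetch a page with the configured handle.
fetch_page <- function(url, handle = curl) {
  getURL(url, curl = handle)
}

# Usage (performs a network request):
# html <- fetch_page("https://jump.valueline.com/login.aspx")
```

Because cainfo is set on the handle itself, the later postForm call made with curl = curl should pick up the same setting. This only addresses the certificate error; it does not by itself explain the Internal Server Error from postForm.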