SAS- Auto Login to a secure website , open links in it and fetch data

2.2k Views Asked by At

I am a SAS programmer whose trying to automate a report for my client who creates it by fetching some data from a secure website,manually noting it down on our SAS database and then reports/validates on them. My task is to reduce the manual effort.

I tried using FILENAME URL option in Sas code to login to the page but it ends up giving me error : "SSL Error : Missing CA trust list"

So basically the task is to pass a webpage in SAS, have the code auto - login to the website through login credentials , open up links on the webpage and fetch some amounts/date from the opened link pages.

Any helpful pages to start from for me would also be appreciated.

Thanks

4

There are 4 best solutions below

0
On

I've used successfully CURL tool with SAS (due some limitations in SAS 9.1)

The idea is to call curl via systask command from SAS as an external utility. Then curl does the magic - retrieve or post/upload some content etc.

7
On

As Vasja says CURL is the way to do it. In the example below I've used a named pipe instead of calling curl through systask. Here's an example of the code (though it's not for the faint of heart):

**
** GLOLBALS
*;
%let curl_executable = d:\xxx\curl.exe;
%let maxchars        = 3000;
%let userpass        = User:Pass;


data _null_;
  ** 
  ** USE THIS VERSION IF YOU JUST HAVE A STRAIGHT URL TO RETRIEVE
  *;
  call symput ('curl_cmd', "&curl_executable -k -u &userpass %nrstr(https://www.mysite.com/index.php" );
run;


** 
** USE THIS VERSION IF YOU NEED TO PASS IN PARAMETERS TO THE URL
**
** NOTE THAT THE CARROT (^) IS REQUIRED TO MASK THE AMPERSAND WHEN 
** BEING PROCESSED THROUGHT THE COMMAND LINE PIPE. THE LOG WILL STILL COMPLAIN ABOUT
** THE MACRO VARIABLE TODT NOT BEING RESOLVED EVEN THOUGH IT IS.
*;
/*%let from=2013-01-01;*/
/*%let to  =2013-01-01;*/
/*data _null_;*/
/*  call symput ('curl_cmd', "&curl_executable -k -u &userpass %nrstr(https://www.mysite.com/index.php?FromDT=)&from^%nrstr(&)ToDT=&to" ); */
/*run;*/


**
** CONNECT TO SITE AND RETURN HTML RESULT TO A DATASET
*;
filename curl pipe "&curl_cmd" lrecl=32767;
data tmp;
  length xml $&maxchars;

  infile curl truncover end=eof;
  input @1 xml $&maxchars..;

  /*
  ** SAFETY CHECK 1 - TEST FOR TRUNCATION. LINE SHOULD NOT BE > 32K.  
  ** THIS CAN BE FIXED BY PIPING THROUGH SED AND INSERTING LINE BREAKS IF NECESSARY.
  */
  if lengthn(xml) ge &maxchars then do;      
    put "ERROR: FAILED FOR &from BECAUSE XML WAS > &maxchars CHARS";
    put "OBSERVATION: " _n_;
    put xml;
    stop;
  end;

run;
filename curl clear;

Once you have this working and the HTML saved into your dataset you then need to parse out the results but that's a whole different question.

0
On

It is worth noting in newer versions of SAS, there is PROC HTTP that allows you to better handle requests (GET and POST) and responses from web servers. This is useful if the URL returns something that is not a flat file or easily read through a DATA Step.

There are explicit instructions on HTTPS here http://support.sas.com/documentation/cdl/en/proc/64787/HTML/default/viewer.htm#p19qqqyq9te9u3n1oq8fbdta7dqt.htm

4
On

The correct way to login to a secure site is given in the documentation :

filename foo url 'https://www.b.com/file1.html'
     user='jones' prompt;

If you're having trouble with that with the error above, it's possible you have a malconfigured SAS installation or OS. See this reference, in particular the SSLCERT options in section 2, for more specific details useful for your site.