SAS - Unzip all .gz files in a folder


I am trying to unzip all the .gz files in a folder and read them into a dataset using a FILENAME statement, but I cannot get it to work.

I have .gz files named as follows in a folder:

EQY_US_ALL_TRADE_20210701
EQY_US_ALL_TRADE_20210702
EQY_US_ALL_TRADE_20210705
EQY_US_ALL_TRADE_20210706
EQY_US_ALL_TRADE_20210707
.....
.....
EQY_US_ALL_TRADE_20210729
EQY_US_ALL_TRADE_20210730
and so on.

Note that the folder does NOT contain a file for each of the 31 days; there are files only for business days. See my code below:

/* Change working directory to where all the files are located */
data _null_;
  rc = dlgcdir("C:\EQY_US_ALL_TRADE_202107");
  put rc=;
run;

/* Use a FILENAME statement to unzip all files and read them through fileref "f1" */

filename f1 zip EQY_US_ALL_TRADE_202107* gzip lrecl=500; 

/* This code worked when I gave the actual name of one file, e.g. "EQY_US_ALL_TRADE_20210702", but it does not work when I use the wildcard to run through all of them */

Answer:

You can read the file names in the folder with the DREAD function, and then use a dynamic INFILE statement with the FILEVAR= option to name the gzip file that INPUT reads from.

Example:

All gzipped files are presumed to contain data only, with no header row. The compressed files are all in a single folder and have the .gz file extension.

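data want(keep=filename a b c);
  length folderef $8 filename $256 fullname $512;

  /* A blank fileref variable makes FILENAME generate a fileref for the folder */
  rc = filename (folderef, 'c:\temp\trade_data');

  did = dopen(folderef);

  /* Loop over every member of the folder */
  do _n_ = 1 to dnum(did);

    filename = dread(did,_n_);
    /* Skip anything that is not a .gz file */
    if lowcase(scan(filename,-1,'.')) ne 'gz' then continue;

    fullname = catx('/', pathname(folderef), filename);

    /* Read the current file until INPUT reaches end of file, then jump to nextfile */
    do while(1);
      infile archive zip filevar=fullname gzip dlm=',' eof=nextfile;
      input a b c;

      output;
    end;

nextfile:
  end;

  rc = dclose(did);         /* close the directory */
  rc = filename(folderef);  /* release the generated fileref */

  stop;
run;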
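
Before reading any data, it may help to confirm that the directory scan finds the files you expect. The following is a minimal sketch of only the DOPEN/DREAD part, not a tested solution. It assumes the folder from the question and that the files actually carry a .gz extension (for example EQY_US_ALL_TRADE_20210701.gz). The dataset name gz_files and the variable trade_date are hypothetical names introduced for illustration.

```sas
/* Sketch: list the .gz files in the folder and derive a trade date from each name */
data gz_files(keep=filename trade_date);
  length folderef $8 filename $256;
  format trade_date yymmdd10.;

  /* Assumed folder, taken from the question */
  rc = filename(folderef, 'C:\EQY_US_ALL_TRADE_202107');
  did = dopen(folderef);
  if did = 0 then do;
    put 'ERROR: could not open the folder.';
    stop;
  end;

  do i = 1 to dnum(did);
    filename = dread(did, i);
    if lowcase(scan(filename, -1, '.')) ne 'gz' then continue;

    /* Assumed name pattern: EQY_US_ALL_TRADE_20210701.gz -> 20210701 */
    trade_date = input(scan(scan(filename, 1, '.'), -1, '_'), ?? yymmdd8.);
    output;
  end;

  rc = dclose(did);
  rc = filename(folderef);
  stop;
run;
```

If gz_files comes back empty, the extension test or the folder path is the likely cause, and the same condition would also make the reading step above skip every file.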