SAS Call execute output changes every time I run it

623 Views Asked by At

I’m running a macro with a call execute in a data step. What I want from my data step is the following:

Take a table, add a new column for every existing column (via macro) and finally add a new column that is the sum of two others. I guess it is also possible without a macro, but I want it exactly this way cause I’m new to SAS and want to understand the logic of macros and call execute.

So let’s say I have the following table:

data values;
input a1 a2 b1 b2;
datalines;
1 0 3 10
0 5 6 11
7 8 9 0
;
run;

and this macro:

%macro loop1(myDataset);  
proc contents data=&myDataset. out=Col_Names (keep=Name) noprint;
run;
proc sql noprint; 
select count(Name) into :length from Col_Names;
quit;                                                 
     %do j = 1 %to &length; 
            data &myDataset.;
            set &myDataset.; 
            n&j=0;
            run;                       
     %end;
%mend;

then the following data step creates different output the first three times I run it: (After every run I re-run the original data step with the datalines of course)

data values;
set values;
if _n_=1 then call execute('%loop1(values);');
test=sum(a1,a2);
run;

The first run results in an error:

WARNING: Apparent symbolic reference LENGTH not resolved. ERROR: A character operand was found in the %EVAL function or %IF condition where a numeric operand is required. The condition was: &length ERROR: The %TO value of the %DO J loop is invalid. ERROR: The macro LOOP1 will stop executing.

The second run results in exactly what I want:

The columns a1, a2, b2, b2, test, n1, n2, n3, n4

And from the third run on, the output stays:

The columns a1, a2, b2, b2, test, n1, n2, n3, n4, n5

With the undesired n5 in it.

What should I change to always get the output from the second run?

3

There are 3 best solutions below

6
On BEST ANSWER

When calling macros using call execute it is recommended to wrap them in %nrstr() as per the following usage note:

http://support.sas.com/kb/23/134.html

This prevents premature macro execution - or at least, forces it to wait for any dependent macro variables to be ready, namely the length variable in your INTO: clause.

data values;
  set values;
  if _n_=1 then call execute('%nrstr(%loop1(values);)');
  test=sum(a1,a2);
run;

In order to get the results you desire, you also need to exclude the 'test' variable in your SQL procedure, as follows:

proc sql noprint; 
select count(Name) into :length from Col_Names
  where upcase(name) ne 'TEST';
2
On

Why it fails:

Your code does not at all add variables ´n1´ and so forth to the ´values´ dataset at the moment you think it does. It schedules your macro for after your last data step, and that is not what you want.

Try for instance adding an assignment to your last data step

data values;
    set values;
    *&new_columns;
    if _n_=1 then call execute('%loop1(values);');

    test=sum(a1,a2);
    n2=9874;
run;

and you will see it has no effect because your value for ´n2´ is overwritten when your macro runs;

What you can do:

With select <something> into :<variable> separated by <separator>, you can create a macro variable that contains the assignments.

proc contents data=values noprint out=Col_Names(keep=varnum);
run;
proc sql noprint;
    select 'n'|| strip(put(varnum, 8.)) ||'=0'
    into :new_columns separated by ';'
    from col_names;
quit;

You can use This variable in your data step:

data values;
    set values;
    &new_columns;

    test=sum(a1,a2);
    n2=9874; ** Now this has effect **;
run;
0
On

The reason it fails is that SAS is running the macro when your data step runs and pushing the code that the macro generates onto the stack. It then runs the actual generated PROC and DATA steps after the data step finishes.

So when your macro runs it generates the PROC SQL step, but the step does not execute yet since your data step is still running. The macro then runs the %DO loop and will either generate an error since the macro variable LENGTH doesn't exist or use the value for the macro variable that existed before your data step started.

To prevent this wrap the macro call in %NRSTR() so that the macro call itself is pushed onto the stack to run after the data step stops.

call execute('%nrstr(%loop1)(values);');

Your actual example does not need CALL EXECUTE at all. Just call the macro after the data step runs.

data values;
  set values;
  test=sum(a1,a2);
run;
%loop1(values);

Or if you don't want to make a flag variable for the new TEST variable then run the macro BEFORE the data step.