Transpose a correlation matrix into one long vector in SAS

928 Views Asked by At

I'm trying to turn a correlation matrix into one long column vector such that I have the following structure

data want;
input _name1_$ _name2_$ _corr_;
datalines;
var1 var2 0.54
;
run;

I have the following code, which outputs name1 and corr; however, I'm struggling to get name2!

DATA TEMP_1
    (DROP=I J);
    ARRAY VAR[*] VAR1-VAR10;
    DO I = 1 TO 10;
        DO J = 1 TO 10;
            VAR(J) = RANUNI(0);
        END;
        OUTPUT;
    END;
RUN;

PROC CORR
    DATA=TEMP_1
    OUT=TEMP_CORR
        (WHERE=(_NAME_ NE " ")
         DROP=_TYPE_)
    ;
RUN;

PROC SORT DATA=TEMP_CORR; BY _NAME_; RUN;

PROC TRANSPOSE
    DATA=TEMP_CORR
    OUT=TEMP_CORR_T
    ;
    BY _NAME_;
RUN;

Help is appreciated

2

There are 2 best solutions below

1
On BEST ANSWER

You're close. You're running into a weird issue with the name variable because that becomes a variable out of PROC TRANSPOSE as well. If you rename it, you get what you want. I also list the variables explicitly and add some RENAME data set options to get what you likely want.

PROC TRANSPOSE
DATA=TEMP_CORR (rename=_name_ = Name1)
OUT=TEMP_CORR_T (rename = (_name_ = Name2 col1=corr))
;
by name1;
var var1-var10;
RUN;

Edit: If you don’t want duplicates you can add a WHERE to the OUT dataset.

PROC TRANSPOSE
DATA=TEMP_CORR (rename=_name_ = Name1)
OUT=TEMP_CORR_T (rename = (_name_ = Name2 col1=corr) where = name1 > name2)
;
by name1;
var var1-var10;
RUN;
0
On

Just an ARRAY with VNAME() function. To just output the upper triangle set lower bound of DO loop to _N_.

data want ;
  length _name1_ _name2_ $32 _corr_ 8 ;
  keep _name1_ _name2_ _corr_;
  set corr;
  where _type_ = 'CORR';
  array x _numeric_;
  _name1_=_name_;
  do i=_n_ to dim(x);
    _name2_ = vname(x(i));
    _corr_ = x(i);
    output;
  end;
run;