PROC FREQ in SAS gives too wide columns

457 Views Asked by At

I did a simple proc freq in SAS:

PROC FREQ DATA=test;
     a * b;
RUN;

This raised the error: insufficient page size to print frequency table

From ERROR: Insufficient page size to print frequency table in SAS PROC FREQ I learned that the error is fixed by enlarging the page size:

option pagesize=max;

But then my table still looked strange with super high white spaces in column b:

Frequency |
Percent   |
Row Pct   | value 1  | value 2  |
Col Pct   |          |          |
          |          |          |
            ...etc...  ...etc...
          |          |          |
----------+----------+----------+
a         |    12    |    3     |

What solved my problem was adding a format to the proc freq that truncated variable b.

PROC FREQ DATA=test;
     FORMAT B $7.;
     a * b;
RUN;

now my result looks like this and I'm happy enough:

Frequency |
Percent   |
Row Pct   |
Col Pct   | value 1  | value 2  |
----------+----------+----------+
a         |    12    |    3     |

I'm left a bit bewilderd, because nowhere in the code did I apply a format to b before, just a lenght statement. Other variables that had their lengths fixed did not have this problem. I did switch from an excel sourcefile to oracle-exadata as source. Is it possible that Oracle pushes variable formats to SAS?

1

There are 1 best solutions below

0
Tom On

SAS has a nasty habit of attaching formats to character variables pulled from external databases, including PROC IMPORT from an EXCEL file. So if a character variable has a storage length of 200 then SAS will also attach the $200. format to the variable.

When you combine two dataets that both contain the same variable the length will be set by the first version of the variable seen. But the format attached will be set by the first non-empty format seen. So you could combine a dataset where A has length $10 and no format attached with another dataset where A has the format $200. attached and the result will a variable with an actual length of 10 but the $200. format attached.

You can use the format statement where you list variable names but no format specification to remove them. You could do it in the PROC step.

PROC FREQ DATA=test;
  tables  a * b;
  format _character_ ;
RUN;

Or do it in a data step or use PROC DATASETS to modify the formats attached to the variable in an existing dataset.