I am trying to run proc glm with multiple variables. When I run the code, it processes for a few minutes, then SAS Studio crashes and logs me out.
Where did I make a mistake?
proc glm data=diet;
class age sex smokstat bmi vituse cal fat fiber alcohol chol betadiet retdiet;
model betaplsama=age sex smokstat bmi vituse cal fat fiber alcohol chol betadiet retdiet;
model retplasma=age sex smokstat bmi vituse cal fat fiber alcohol chol betadiet retdiet;
means age sex smokstat bmi vituse cal fat fiber alcohol chol betadiet retdiet / schefee;
run;
I need to get an analysis of variance. This is the example code that my professor provided:
data twoway;
input class1 $ class2 $ y;
datalines;
;
proc glm data=twoway;
class class1 class2;
model y=class1 class2 class1*class2; /*(model y=class1|class2)*/
means class1 class2 class1*class2 / scheffe; /*(means class1|class2)*/
run;
You've specified every single variable as a class (that is, categorical) variable. Given the name of the dataset and what you're trying to predict, it is unlikely that variables like BMI, Age, Calories and Fat are categories. You're basically telling SAS to treat each distinct value as its own level and one-hot encode it. This blows up the design matrix and probably makes it astronomically large. Even if the procedure did complete, your regression would likely fail, because a GLM cannot be estimated uniquely when there are more parameters than observations (p > n).
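One way to check which variables are really categorical is to count how many distinct values each one takes. A minimal sketch, assuming the variable names from the question and using the NLEVELS option of proc freq:

```sas
/* Count the distinct levels of each variable.
   NOPRINT suppresses the (potentially huge) frequency tables
   but still prints the "Number of Variable Levels" summary. */
proc freq data=diet nlevels;
    tables age sex smokstat bmi vituse cal fat fiber alcohol chol betadiet retdiet / noprint;
run;
```

Variables with only a handful of levels are candidates for the class statement. Variables with dozens or hundreds of levels are almost certainly continuous and should be left out of it.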
Identify which of your variables are categorical, then add only those to the class statement. You do not need to do this for numeric variables. For example, a class statement listing origin, make, and cylinders tells SAS to treat those three as categorical variables and automatically one-hot encode them, while weight, wheelbase, and msrp are treated as numeric by default.
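A sketch of what such an example could look like, assuming the sashelp.cars sample dataset that ships with SAS. The response variable (horsepower) is my own choice for illustration and is not taken from the original answer:

```sas
/* Only the categorical predictors go in CLASS.
   The response (horsepower) is an assumption for illustration. */
proc glm data=sashelp.cars;
    class origin make cylinders;
    model horsepower = origin make cylinders weight wheelbase msrp;
run;
quit;
```

Applied to the question's data, the same idea might look like the following. Which variables are categorical is an assumption here (sex, smokstat, and vituse), so check it against your data. As far as I know, proc glm accepts only one model statement per invocation, so both responses go on the left-hand side, and the means statement only accepts class effects. Note also that the question spells the option schefee and the first response betaplsama; the sketch assumes scheffe and betaplasma are the intended spellings.

```sas
/* Assumption: sex, smokstat, and vituse are the only categorical variables. */
proc glm data=diet;
    class sex smokstat vituse;
    /* One MODEL statement; both responses on the left-hand side. */
    model betaplasma retplasma = age sex smokstat bmi vituse cal fat fiber
                                 alcohol chol betadiet retdiet;
    /* MEANS is limited to class effects that appear in the model. */
    means sex smokstat vituse / scheffe;
run;
quit;
```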