I'm trying to apply a multivariate Cox regression analysis in R to my dataset, following this tutorial.
In particular, I am trying to apply the coxph() function:
install.packages(c("survival", "survminer"));
library("survival");
library("survminer");
data("lung");
res.cox <- coxph(Surv(time, status) ~ age + sex + ph.ecog, data = lung)
summary(res.cox)
As you can see, in this case the names of the features (age + sex + ph.ecog) have been inserted manually in the formula.
In my case, however, I have thousands of features, so I cannot type their names manually. I need a way to insert them automatically. I tried to do this with the example above, without success. Here's what I tried:
featureNames <- paste(colnames(lung), collapse = " + ")
res.cox <- coxph(Surv(time, status) ~ featureNames, data = lung)
And I got this error message:
Error in model.frame.default(formula = Surv(time, status) ~ featureNames, :
variable lengths differ (found for 'featureNames')
Can someone help me? Thanks!
I'm using R version 3.6.3 on a PC running Linux Ubuntu 18.04.5 LTS.
Use reformulate(). First, set up a default formula:
Let's say you know the features beforehand:
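For illustration, here is a minimal sketch of the reformulate() approach, assuming the known features are the three from the question (age, sex, ph.ecog). The original attempt fails because featureNames is a single character string, so the formula treats it as one variable of length 1 rather than as a list of terms. The names features and fml are placeholders chosen for this example. The response is passed as a quoted call rather than a string, since that should behave the same across R versions.

```r
library(survival)

# Assumed for illustration: the predictors are known in advance.
# With thousands of columns, something like
#   features <- setdiff(colnames(lung), c("time", "status"))
# would collect every non-outcome column instead.
features <- c("age", "sex", "ph.ecog")

# reformulate() turns a character vector of term labels into a formula.
# quote() keeps Surv(time, status) as an unevaluated call for the left-hand side.
fml <- reformulate(features, response = quote(Surv(time, status)))

# The resulting formula object can be passed to coxph() directly.
res.cox <- coxph(fml, data = lung)
summary(res.cox)
```

Printing fml before fitting is a quick way to check that the formula looks the way you expect.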