Should I include the group column in the data to use ctree in R?

I have data like the following:

structure(list(`h:23705` = c(7.16421907753984, 7.18756733158759, 
6.71825354529678, 7.06582535720175), `h:9076` = c(3.63561443591981, 
8.80110411390239, 3.42736295167031, 6.82567063382749), `h:6430` = c(11.6493510134213, 
10.9021882366427, 11.5419097543467, 11.0172875218627), `h:23286` = c(5.95273659011779, 
7.02279022342779, 7.4670764443101, 8.00950334000673), `h:1910` = c(9.21020787636299, 
8.23315954971811, 9.02514643319415, 7.37180226446582), `h:100506658` = c(6.17503716113031, 
5.33320510677162, 5.3854863067867, 4.85548061452834), `h:7278` = c(6.68267555640564, 
6.72703399259055, 6.49979264556028, 6.69736758893795), `h:8321` = c(7.48693243798371, 
9.20705588080534, 8.29611998959537, 8.62260573459778), `h:23127` = c(9.22906929860114, 
8.38044548795266, 8.6686217703452, 7.00315009754005), `h:3480` = c(6.26913873735372, 
7.18693171118497, 7.02427323899392, 7.35918725178567), group = c("c", 
"d", "c", "d")), row.names = c("G1", "G2", "G3", 
"G4"), class = "data.frame")

I am trying to use the ctree() function from the party package in R, with the following code:

model <- ctree(group ~ ., data = train_data)

It throws the following error:

Error in trafo(data = data, numeric_trafo = numeric_trafo, factor_trafo = factor_trafo,  : 
  data class “character” is not supported
In addition: Warning message:
In storage.mode(RET@predict_trafo) <- "double" : NAs introduced by coercion

I am not sure what the source of this problem is.

There is 1 solution below:

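The group variable needs to be a factor, not a character variable: ctree() does not support a response (or predictor) of class "character", which is what the error message reports. It is recommended to declare this explicitly in the training data, e.g.,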

train_data$group <- factor(train_data$group)

(or alternatively using transform() or mutate() etc.).
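
For illustration, here is a minimal sketch of the whole fix on a cut-down version of the data from the question. It keeps only two of the predictors, and the column names h_23705 and h_9076 are simplified stand-ins for the original `h:23705` and `h:9076` (an assumption made purely to keep the example short). The ctree() call is guarded so the snippet still runs if the party package is not installed.

```r
# Cut-down version of the question's data: two predictors plus the response.
# Column names are simplified stand-ins for `h:23705` and `h:9076`.
train_data <- data.frame(
  h_23705 = c(7.16421907753984, 7.18756733158759, 6.71825354529678, 7.06582535720175),
  h_9076  = c(3.63561443591981, 8.80110411390239, 3.42736295167031, 6.82567063382749),
  group   = c("c", "d", "c", "d"),
  row.names = c("G1", "G2", "G3", "G4"),
  stringsAsFactors = FALSE
)

# Inspect the column classes first: group is still a character vector here.
sapply(train_data, class)

# Declare the response as a factor explicitly.
train_data$group <- factor(train_data$group)

# Confirm the conversion and the resulting class labels.
is.factor(train_data$group)
levels(train_data$group)

# Fit the tree only if party is available.
if (requireNamespace("party", quietly = TRUE)) {
  model <- party::ctree(group ~ ., data = train_data)
  print(model)
}
```

Checking sapply(train_data, class) before fitting is a quick way to spot any other character columns, which would need the same conversion.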