I am using the rpart library in R to create a series of classification trees with different choices of maximum tree depth. My goal is to compare the error rates of trees constructed using these different depths.
Right now, my code is separated into two parts. In the first part, I use a for loop to grow the trees to a number of different depths:
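for (i in 1:12) {
  nam <- paste("tree", i, sep = "_")
  # grow a tree limited to depth i, with pruning and node-size limits effectively disabled
  assign(nam, rpart(y ~ ., data = A, method = "class",
                    control = rpart.control(maxdepth = i, cp = 0, minbucket = 1, minsplit = 2)))
}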
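In the second part, I have to calculate the test error rate for each tree separately, repeating code like this for every depth (shown here for tree_1, where B is the test data with the response in its first column):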
yhat_test_1 <- predict(tree_1, newdata = B, type = "class")
test_error_1 <- mean(yhat_test_1 != B[, 1])
This requires a huge volume of code if I want to compare the error rates of a large number of trees. Is there a way to simplify this process so I can test the error rates of a large number of different trees at once?
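One possible direction is a minimal sketch, not a definitive solution: keep the fitted trees in a named list rather than creating separate variables with assign(), then compute every test error in a single sapply() call. To make it runnable, the sketch builds stand-in A and B from the built-in iris data, with the response renamed to y and placed in column 1. That data setup, and the names dat, idx, depths, trees and test_error, are assumptions for illustration. The rpart() settings are copied from the loop above.

```r
library(rpart)

# Stand-in data (assumption): iris with the response renamed to y and moved to
# column 1, mirroring the layout implied by B[, 1] in the question.
set.seed(1)
dat <- data.frame(y = iris$Species, iris[, 1:4])
idx <- sample(nrow(dat), 100)
A <- dat[idx, ]   # training data
B <- dat[-idx, ]  # test data

depths <- 1:12

# Fit one tree per depth and store them in a list instead of using assign()
trees <- lapply(depths, function(d) {
  rpart(y ~ ., data = A, method = "class",
        control = rpart.control(maxdepth = d, cp = 0, minbucket = 1, minsplit = 2))
})
names(trees) <- paste("tree", depths, sep = "_")

# Test misclassification rate of every tree in one pass
test_error <- sapply(trees, function(fit) {
  yhat <- predict(fit, newdata = B, type = "class")
  mean(yhat != B[, 1])
})

# One row per depth, convenient for comparing or plotting
data.frame(maxdepth = depths, test_error = test_error)
```

With this layout an individual tree is still available as trees[["tree_3"]] or trees[[3]], and comparing more depths only requires changing depths.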