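I fit the same random forest on iris with mlr3, randomForest, and tidymodels, setting the same seed each time. Why are the mlr3 results different from the other two? Thanks for the help!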
mlr3
library(mlr3)
library(mlr3verse)
library(mlr3learners)
library(mlr3extralearners) # classif.randomForest is registered by mlr3extralearners
library(randomForest)
library(tidyverse)
library(tidymodels)
tasks = as_task_classif(iris, target = 'Species')
learners = lrn("classif.randomForest", predict_type = "prob", importance = "gini")
set.seed(123, kind = "Mersenne-Twister")
mlr3_result = learners$train(tasks)
mlr3_result$model
Call:
 randomForest(formula = formula, data = data, classwt = classwt, cutoff = cutoff, importance = TRUE)
               Type of random forest: classification
                     Number of trees: 500
No. of variables tried at each split: 2
        OOB estimate of error rate: 4%
Confusion matrix:
           setosa versicolor virginica class.error
setosa         50          0         0        0.00
versicolor      0         47         3        0.06
virginica       0          3        47        0.06
mlr3_result$model$importance
                  setosa  versicolor   virginica MeanDecreaseAccuracy MeanDecreaseGini
Petal.Length 0.335345171 0.307701242 0.294076163          0.310261341        43.088229
Petal.Width  0.330789167 0.311838647 0.273864031          0.303464596        44.258135
Sepal.Length 0.037133413 0.020715911 0.042752581          0.034013485         9.715093
Sepal.Width  0.008714192 0.004354224 0.008025792          0.006962512         2.238209
randomForest
set.seed(123, kind = "Mersenne-Twister")
randomForest_result <- randomForest(iris[, 1:4],
                                    iris$Species,
                                    importance = TRUE)
randomForest_result
Call:
 randomForest(x = iris[1:4], y = iris$Species, importance = TRUE)
               Type of random forest: classification
                     Number of trees: 500
No. of variables tried at each split: 2
        OOB estimate of error rate: 4.67%
Confusion matrix:
           setosa versicolor virginica class.error
setosa         50          0         0        0.00
versicolor      0         47         3        0.06
virginica       0          4        46        0.08
randomForest_result[["importance"]]
                  setosa  versicolor   virginica MeanDecreaseAccuracy MeanDecreaseGini
Sepal.Length 0.036131008 0.023774906 0.038354330          0.033578769         9.798189
Sepal.Width  0.008306837 0.001895114 0.007878582          0.006106488         2.236535
Petal.Length 0.328732260 0.308242048 0.293766472          0.307467294        43.093269
Petal.Width  0.337599283 0.315079112 0.267375505          0.303321451        44.042372
tidymodels
library(tidyverse)
library(tidymodels)
set.seed(123, kind = "Mersenne-Twister")
tidy_results = rand_forest(mode = "classification") %>%
  set_engine("randomForest", importance = T) %>%
  fit(Species ~ ., data = iris)
tidy_results
Call:
 randomForest(x = maybe_data_frame(x), y = y, importance = ~T)
               Type of random forest: classification
                     Number of trees: 500
No. of variables tried at each split: 2
        OOB estimate of error rate: 4.67%
Confusion matrix:
           setosa versicolor virginica class.error
setosa         50          0         0        0.00
versicolor      0         47         3        0.06
virginica       0          4        46        0.08
tidy_results[["fit"]][["importance"]]
                  setosa  versicolor   virginica MeanDecreaseAccuracy MeanDecreaseGini
Sepal.Length 0.036131008 0.023774906 0.038354330          0.033578769         9.798189
Sepal.Width  0.008306837 0.001895114 0.007878582          0.006106488         2.236535
Petal.Length 0.328732260 0.308242048 0.293766472          0.307467294        43.093269
Petal.Width  0.337599283 0.315079112 0.267375505          0.303321451        44.042372
tidymodels and randomForest give identical results, but mlr3 does not match randomForest, even though I set the same seed before each fit. Did I make a mistake in my code? I'm confused...
Package versions: randomForest 4.7.1.1, mlr3 0.17.0, tidymodels 1.1.1
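
One difference visible in the outputs above is the row order of the importance tables: mlr3 lists the Petal columns first, while randomForest and tidymodels list the Sepal columns first. The mlr3 call also goes through the formula interface rather than x/y. Here is a minimal sketch of a check that might narrow this down, assuming (not confirmed) that the features reach randomForest in the order shown in the mlr3 importance table. The names cols_orig, cols_alt, rf_orig and rf_alt are introduced only for this check.

```r
library(randomForest)

# Feature order as in iris, and the order seen in the mlr3 importance table
# (assumption: this is the order mlr3 hands to randomForest).
cols_orig <- c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")
cols_alt  <- c("Petal.Length", "Petal.Width", "Sepal.Length", "Sepal.Width")

# Same seed, same data, only the column order differs
set.seed(123, kind = "Mersenne-Twister")
rf_orig <- randomForest(iris[, cols_orig], iris$Species, importance = TRUE)

set.seed(123, kind = "Mersenne-Twister")
rf_alt <- randomForest(iris[, cols_alt], iris$Species, importance = TRUE)

# Align rows by feature name before comparing the importance matrices
same_importance <- isTRUE(all.equal(rf_orig$importance[cols_orig, ],
                                    rf_alt$importance[cols_orig, ]))
print(same_importance)

# Compare the OOB confusion matrices side by side
print(rf_orig$confusion)
print(rf_alt$confusion)
```

If the two fits differ here, feature order would be one plausible reason for the mlr3 mismatch. If they agree, the cause is likely elsewhere, for example in the formula interface or in the extra arguments mlr3 passes.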