Comparison of variable importance plots in the randomForest and gbm packages

I have a question about comparing the variable importance plots produced by the randomForest and gbm packages.

I am comparing a random forest classifier and a GBM classifier on a binary classification problem.

I implemented the RF with the randomForest package (by Liaw) and the GBM with the gbm package (by Greenwell; the distribution specified is "bernoulli").

Unfortunately, I find the description in the documentation rather vague and do not fully understand it.

My core question is: which methods are used when we plot the variable importance via varImpPlot for the random forest and via summary for the GBM, and are the results comparable?

For the random forest, according to the documentation, there are two measures: one based on permuting the OOB data, and one based on the total decrease in node impurities from splitting on the variable.
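
To make the two measures concrete, here is a minimal sketch on simulated data (the data frame `d` and its columns are made up for illustration and are not part of my actual analysis). As far as I understand the documentation, `importance(..., type = 1)` returns the permutation-based mean decrease in accuracy and `type = 2` returns the mean decrease in node impurity (Gini index for classification).

```r
library(randomForest)

set.seed(1)
n <- 500
# Hypothetical simulated data: x1 and x2 drive the outcome, x3 is pure noise
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
d$y <- factor(rbinom(n, 1, plogis(2 * d$x1 - d$x2)))

# importance = TRUE is required, otherwise the permutation measure is not computed
rf <- randomForest(y ~ ., data = d, ntree = 300, importance = TRUE)

imp_perm <- importance(rf, type = 1)  # mean decrease in accuracy (OOB permutation)
imp_gini <- importance(rf, type = 2)  # mean decrease in node impurity (Gini)

print(imp_perm)
print(imp_gini)

# With importance = TRUE, varImpPlot draws both measures side by side
varImpPlot(rf)
```

So which of the two measures varImpPlot shows seems to depend on whether the forest was fitted with `importance = TRUE`.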

For the GBM, summary reports the relative influence. Here the documentation states: "... For other loss functions [in my case bernoulli] this returns the reduction attributable to each variable in sum of squared error in predicting the gradient on each iteration."
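
For reference, this is roughly how the relative influence would be obtained; again a sketch on simulated data with made-up names, and the tuning values (`n.trees`, `interaction.depth`, `shrinkage`) are arbitrary assumptions.

```r
library(gbm)

set.seed(1)
n <- 500
# Hypothetical simulated data: x1 and x2 drive the outcome, x3 is pure noise
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
# distribution = "bernoulli" expects a numeric 0/1 response, not a factor
d$y <- rbinom(n, 1, plogis(2 * d$x1 - d$x2))

fit <- gbm(y ~ ., data = d, distribution = "bernoulli",
           n.trees = 200, interaction.depth = 2, shrinkage = 0.05)

# plotit = FALSE returns the table without drawing the bar chart;
# by default the influences are normalized to sum to 100
rel_inf <- summary(fit, n.trees = 200, plotit = FALSE)
print(rel_inf)
```

If I read it correctly, this is an impurity-style measure accumulated over the splits of all trees, so it looks closer in spirit to the random forest's node impurity measure than to its permutation measure. The documentation of summary also lists an experimental `permutation.test.gbm` method, which I have not tried.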

Is there a way to consistently compare the variable importance for RF and GBM?
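
One idea I could imagine is to ignore both built-in measures and compute the same model-agnostic permutation importance for both models on held-out data. The following is only a sketch of that idea under my own assumptions: simulated data, accuracy at a 0.5 threshold as the metric, and a hand-written helper `perm_importance` that does not come from either package.

```r
library(randomForest)
library(gbm)

set.seed(1)
n <- 1000
# Hypothetical simulated data: x1 and x2 drive the outcome, x3 is pure noise
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
d$y <- rbinom(n, 1, plogis(2 * d$x1 - d$x2))
train <- d[1:700, ]
test  <- d[701:n, ]
vars  <- c("x1", "x2", "x3")

rf <- randomForest(x = train[, vars], y = factor(train$y), ntree = 300)
gb <- gbm(y ~ ., data = train, distribution = "bernoulli",
          n.trees = 200, interaction.depth = 2, shrinkage = 0.05)

# Both wrappers return P(y = 1), so the two models are scored identically
p_rf <- function(newdata) predict(rf, newdata[, vars], type = "prob")[, "1"]
p_gb <- function(newdata) predict(gb, newdata, n.trees = 200, type = "response")

accuracy <- function(p, y) mean((p > 0.5) == (y == 1))

# Mean drop in held-out accuracy after shuffling one predictor at a time
perm_importance <- function(pred_fun, data, y, vars, n_rep = 20) {
  base <- accuracy(pred_fun(data), y)
  sapply(vars, function(v) {
    drops <- replicate(n_rep, {
      shuffled <- data
      shuffled[[v]] <- sample(shuffled[[v]])  # break the link between v and y
      base - accuracy(pred_fun(shuffled), y)
    })
    mean(drops)
  })
}

imp <- cbind(rf  = perm_importance(p_rf, test, test$y, vars),
             gbm = perm_importance(p_gb, test, test$y, vars))
print(round(imp, 3))
```

Because both columns are measured in the same units (loss of test-set accuracy), they could at least be put side by side, but I am not sure whether this is considered a sound way to compare the two models.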

I haven't coded anything yet; this is a rather theoretical question.
