I used eli5 to apply the permutation procedure for feature importance. In the documentation, there is some explanation and a small example but it is not clear.
I am using a sklearn SVC model for a classification problem.
My question is: Are these weights the change (decrease/increase) of the accuracy when the specific feature is shuffled OR is it the SVC weights of these features?
In this medium article, the author states that these values show the reduction in model performance by the reshuffle of that feature. But not sure if that's indeed the case.
Small example:
from sklearn import datasets
import eli5
from eli5.sklearn import PermutationImportance
from sklearn.svm import SVC, SVR
# import some data to play with
iris = datasets.load_iris()
X = iris.data[:, :2]
y = iris.target
clf = SVC(kernel='linear')
perms = PermutationImportance(clf, n_iter=1000, cv=10, scoring='accuracy').fit(X, y)
print(perms.feature_importances_)
print(perms.feature_importances_std_)
[0.38117333 0.16214 ]
[0.1349115 0.11182505]
eli5.show_weights(perms)

I did some deep research. After going through the source code here is what I believe for the case where
cvis used and is notprefitorNone. I use a K-Folds scheme for my application. I also use a SVC model thus,scoreis the accuracy in this case.By looking at the
fitmethod of thePermutationImportanceobject, the_cv_scores_importancesare computed (https://github.com/TeamHG-Memex/eli5/blob/master/eli5/sklearn/permutation_importance.py#L202). The specified cross-validation scheme is used and thebase_scores, feature_importancesare returned using the test data (function:_get_score_importancesinside_cv_scores_importances).By looking at
get_score_importancesfunction (https://github.com/TeamHG-Memex/eli5/blob/master/eli5/permutation_importance.py#L55), we can see thatbase_scoreis the score on the non shuffled data andfeature_importances(called differently there as:scores_decreases) are defined as non shuffled score - shuffled score (see https://github.com/TeamHG-Memex/eli5/blob/master/eli5/permutation_importance.py#L93)Finally, the errors (
feature_importances_std_) are the SD of the abovefeature_importances(https://github.com/TeamHG-Memex/eli5/blob/master/eli5/sklearn/permutation_importance.py#L209) and thefeature_importances_is the mean of the abovefeature_importances(non-shuffled score minus (-) shuffled score).