Random Forest and SHAP values with few features for feature selection


I have several datasets, each with 4 features and between 100 and 300 observations, and I would like to use them for classification. The target variable has 3 possible labels. I have trained a random forest, and since interpreting and understanding the result and the feature selection step matter more than the result itself, I have also calculated SHAP values.
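For concreteness, a minimal sketch of this step, using synthetic data as a stand-in for one of the small 4-feature, 3-class datasets:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
import shap

# Stand-in for one of the ~100-300 row, 4-feature, 3-class datasets
X, y = make_classification(n_samples=200, n_features=4, n_informative=3,
                           n_redundant=0, n_classes=3, random_state=0)

rf = RandomForestClassifier(n_estimators=500, random_state=0)
rf.fit(X, y)

explainer = shap.TreeExplainer(rf)
shap_values = explainer.shap_values(X)
# For a 3-class forest the SHAP values are per class: depending on the shap
# version, either a list of three (n_samples, n_features) arrays or one 3-D array.
shap.summary_plot(shap_values, X)
```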

I applied a cluster analysis and identified three clusters in the data. The dataset also has other features, but I performed the cluster analysis using only two numerical features. It is important that only these two features are used because they lead to a result that is easily understood by the users of this analysis. Now I want to figure out why these three classes exist. I have therefore fitted a random forest, using the class obtained from the cluster analysis as the dependent variable and the remaining features as the independent variables. By looking at the predictive ability of the random forest and at the SHAP values, I can explain which variables are important in predicting the class, and thus why the three classes exist (see the sketch below).
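A sketch of that workflow, assuming the data live in a pandas DataFrame `df` with hypothetical column names, and using KMeans as a stand-in for whatever clustering method was actually applied; the SHAP values for the fitted forest are then computed exactly as in the first sketch:

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

cluster_cols = ["feat_a", "feat_b"]  # the two interpretable numerical features (assumed names)
explain_cols = [c for c in df.columns if c not in cluster_cols]

# Step 1: derive the 3 cluster labels from the two chosen features only
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(df[cluster_cols])

# Step 2: fit a random forest that predicts the cluster label from the remaining features,
# and check its predictive ability before interpreting it
rf = RandomForestClassifier(n_estimators=500, random_state=0)
print(cross_val_score(rf, df[explain_cols], labels, cv=5).mean())
rf.fit(df[explain_cols], labels)
```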

Do you think this approach is reasonable, or is the model too simple for such an advanced XAI method? Should I use a different model, or a different approach, to explain the model and to select the most important features?
