How would you go about using SHAP, LIME, or any other model interpretability tool with a TPOT exported pipeline? For example, here is some code for the shap library, but you cannot pass the TPOT pipeline into it. What would you pass in there instead?
explainer = shap.Explainer(model)
shap_values = explainer(X)
Solution 1:
To explain a scikit-learn Pipeline with SHAP (a Pipeline is the model object that a TPOT optimization process produces), you need to point SHAP at the Pipeline's final estimator (the classifier/regressor step), and you need to transform your data with every preceding Pipeline transformer step (e.g. a pre-processor or feature selector) before feeding it to the SHAP explainer.
Solution 2:
Apparently, the scikit-learn Pipeline predict_proba() function does what has just been described in Solution 1 (i.e. it transforms the data and then applies predict_proba with the final estimator). So passing the Pipeline's predict_proba function to a model-agnostic SHAP explainer should also work for you.
Additional Remarks
You can use TreeExplainer, which is much faster than the generic KernelExplainer, if you use a tree-based model. As per the documentation, LightGBM, CatBoost, PySpark and most tree-based scikit-learn models are supported.