In SHAP TreeExplainer, when feature_perturbation='tree_path_dependent' but data is not None, do we have 'interventional' shap?


So when feature_perturbation='tree_path_dependent' the data argument is optional. If we do give a background dataset, do we get the same behaviour as if feature_perturbation='interventional'?

From my minimal example that's what it seems like, at least for expected_value:

import shap
import numpy as np
from sklearn.tree import DecisionTreeRegressor

num_points = 500
num_samples = 100
num_features = 5
rng = np.random.default_rng(seed=1)
X = rng.normal(size=(num_points, num_features))
y = rng.integers(2, size=(num_points,))
X_sample = X[rng.integers(X.shape[0], size=num_samples), :]  # use the seeded rng for reproducibility

dt_model = DecisionTreeRegressor(max_depth=2).fit(X, y)
explainer1 = shap.TreeExplainer(dt_model, feature_perturbation='tree_path_dependent', model_output='raw')
explainer2 = shap.TreeExplainer(dt_model, feature_perturbation='tree_path_dependent', data=X_sample, model_output='raw')       
explainer3 = shap.TreeExplainer(dt_model, feature_perturbation='interventional', data=X_sample, model_output='raw')                                          
print(f'explainer1.expected_value = {explainer1.expected_value}')
print(f'explainer2.expected_value = {explainer2.expected_value}')
print(f'explainer3.expected_value = {explainer3.expected_value}')
Output:

explainer1.expected_value = [0.514]
explainer2.expected_value = 0.5139024767801856
explainer3.expected_value = 0.5139024767801856