AttributeError: 'SimpleImputer' object has no attribute '_validate_data' in PyCaret


I am using PyCaret and get the following error:

AttributeError: 'SimpleImputer' object has no attribute '_validate_data'

It happens when I try to create a basic instance with setup():

# Create a basic PyCaret instance
import pycaret
from pycaret.regression import *

# pycaret_df is the DataFrame holding the features and the 'pts' target
mlb_pycaret = setup(
    data=pycaret_df,
    target='pts',
    train_size=0.8,
    numeric_features=['home', 'first_time_pitcher'],
    session_id=123,
)

All my variables are numeric (I coerced the two boolean ones to numeric). My target variable, pts, is typed as Label, which is the default.

I also installed PyCaret, imported its regression module, re-installed scikit-learn, and imported SimpleImputer with: from sklearn.impute import SimpleImputer

OBP_avg                 Numeric
SLG_avg                 Numeric
SB_avg                  Numeric
RBI_avg                 Numeric
R_avg                   Numeric
home                    Numeric
first_time_pitcher      Numeric
park_ratio_OBP          Numeric
park_ratio_SLG          Numeric
SO_avg_p                Numeric
pts_500_parkadj_p       Numeric
pts_500_parkadj         Numeric
SLG_avg_parkadj         Numeric
OPS_avg_parkadj         Numeric
SLG_avg_parkadj_p       Numeric
OPS_avg_parkadj_p       Numeric
pts_BxP                 Numeric
SLG_BxP                 Numeric
OPS_BxP                 Numeric
whip_SO_BxP             Numeric
whip_SO_B               Numeric
whip_SO_B_parkadj       Numeric
order                   Numeric
ops x pts_500 order15   Numeric
ops x pts_500 parkadj   Numeric
ops23 x pts_500         Numeric
ops x pts_500 orderadj  Numeric
whip_p                  Numeric
whip_SO_p               Numeric
whip_SO_parkadj_p       Numeric
whip_parkadj_p          Numeric
pts                     Label

My traceback is the following:


There are 4 best solutions below

2
On BEST ANSWER

The problem here is with the imputation. Per the PyCaret documentation, imputation_type defaults to 'simple'; in this case, you need to pass imputation_type='iterative' to setup() for it to work.
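As a minimal sketch of that suggestion, the question's setup() call could be repeated with the extra argument. The names SETUP_KWARGS and run_setup are hypothetical helpers introduced here for illustration, and the sketch assumes a PyCaret version whose setup() accepts imputation_type; it has not been checked against the asker's environment.

```python
# Keyword arguments taken from the question, plus the imputation override.
# SETUP_KWARGS and run_setup are illustrative names, not part of PyCaret.
SETUP_KWARGS = {
    "target": "pts",
    "train_size": 0.8,
    "numeric_features": ["home", "first_time_pitcher"],
    "session_id": 123,
    "imputation_type": "iterative",  # the documented default is "simple"
}


def run_setup(df):
    """Call PyCaret's regression setup() on df with the arguments above."""
    # Imported lazily so this module still loads where PyCaret is not installed.
    from pycaret.regression import setup

    return setup(data=df, **SETUP_KWARGS)
```

With the asker's DataFrame, the call would be mlb_pycaret = run_setup(pycaret_df).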

1
On

This is a library incompatibility. Install PyCaret again with: pip install pycaret pandas shap

1
On

Good day all. What helped me was installing pycaret==2.3.10 and scikit-learn==0.23.2 together. These two versions are compatible and everything works fine. I installed scikit-learn using conda, as I could not get the older versions through pip, and I installed PyCaret using pip3. I hope this helps everyone who has struggled to get this working like I did.
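To see which versions are actually installed before and after pinning, something like the following may help. It is only a sketch using the standard library's importlib.metadata; installed_versions, matches_pins, and PINS are hypothetical names, and the pinned versions are simply the ones reported in this answer.

```python
from importlib import metadata

# Versions reported as working together in this answer.
PINS = {"pycaret": "2.3.10", "scikit-learn": "0.23.2"}


def installed_versions(packages=("pycaret", "scikit-learn")):
    """Return {package: version string, or None if it is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def matches_pins(versions, pins=PINS):
    """True only if every pinned package is installed at exactly that version."""
    return all(versions.get(name) == wanted for name, wanted in pins.items())


if __name__ == "__main__":
    found = installed_versions()
    print(found, "matches pins:", matches_pins(found))
```

Run it in the same environment (and kernel) as the notebook, since a mismatch between the notebook's environment and the one pip installed into is a common source of confusion.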

0
On

Here is what worked for me on this error:

Open the scikit-learn imputer source file in your environment (for me: C:\Users\Eric\.conda\envs\AUTOGLUON\lib\site-packages\sklearn\impute\_base.py), go to around line 568, and search for the following line of code:

    if self.strategy == "constant" or self.keep_empty_features:

Perform the following change, then save the file:

Change this:

    if self.strategy == "constant" or self.keep_empty_features:
        valid_statistics = statistics
        valid_statistics_indexes = None

To this:

    if self.strategy == "constant" or (hasattr(self, 'keep_empty_features') and self.keep_empty_features):
        valid_statistics = statistics
        valid_statistics_indexes = None

Then restart the Python kernel for the notebook and run the code again. It should now work, or at least I hope it does for you.
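The reason a guard like this helps can be shown without touching scikit-learn: if an object was created by code that predates the keep_empty_features attribute, reading the attribute directly raises AttributeError, while hasattr (or getattr with a default) falls back safely. The classes below are hypothetical stand-ins, not scikit-learn's implementation.

```python
class OldImputer:
    """Stand-in for an object created before keep_empty_features existed."""

    def __init__(self, strategy="mean"):
        self.strategy = strategy


class NewImputer(OldImputer):
    """Stand-in for an object that does carry the newer attribute."""

    def __init__(self, strategy="mean", keep_empty_features=False):
        super().__init__(strategy)
        self.keep_empty_features = keep_empty_features


def keeps_all_statistics(imputer):
    """Same condition as the patched line, written with getattr and a default."""
    return bool(
        imputer.strategy == "constant"
        or getattr(imputer, "keep_empty_features", False)
    )


old = OldImputer()
try:
    old.keep_empty_features  # the unguarded access fails on the old object
except AttributeError as exc:
    print("unguarded access failed:", exc)

print(keeps_all_statistics(old))
print(keeps_all_statistics(NewImputer(keep_empty_features=True)))
```

getattr(obj, name, default) is equivalent to the hasattr check in the patch, just shorter.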