I am trying to use auto-sklearn on some pandas data, and when I run:
model.fit(X_train, y_train)
this error pops up:
ValueError Traceback (most recent call last)
<ipython-input-10-ed5cd6b32087> in <module>
2 # sklearn.model_selection.train_test_split(X, y, random_state=1)
3
----> 4 model.fit(X_train, y_train)
~/notebook/jupyterenv/lib/python3.6/site-packages/autosklearn/estimators.py in fit(self, X, y, X_test, y_test, feat_type, dataset_name)
660 "".format(
661 target_type,
--> 662 supported_types
663 )
664 )
ValueError: Classification with data of type continuous is not supported. Supported types are ['binary', 'multiclass', 'multilabel-indicator']. You can find more information about scikit-learn data types in: https://scikit-learn.org/stable/modules/multiclass.html
My (X, y) data looks something like this (the headers HOMO/LUMO etc. are descriptors):
HOMO (A) HOMO (AH) LUMO (A) LUMO (AH) charge (AH) Charge metal (A) \
0 -7.8453 -9.6920 -4.2406 -6.9161 1 -0.938
1 -7.7330 -9.6774 -4.0690 -6.9602 1 -0.911
2 -7.6751 -9.6051 -3.9238 -6.8990 1 -0.950
3 -8.1345 -9.8027 -6.3221 -7.5155 1 -0.868
4 -7.9405 -9.4709 -5.7324 -6.9515 1 -0.880
.. ... ... ... ... ... ...
164 -7.5867 -9.7576 -5.1992 -6.8152 1 -0.312
165 -8.3700 -10.1670 -6.6819 -7.8044 1 -0.311
166 -8.3445 -10.0288 -6.6499 -7.5991 1 -0.321
167 -7.9764 -10.0586 -6.3554 -7.5688 1 -0.277
168 -7.9317 -9.9008 -6.3104 -7.3790 1 -0.288
[169 rows x 17 columns] [24.4 23.8 24. 14.2 22.5 18.5 19.4 17.4 22.6 16.3 20.3 13.2 16.5 21.2
24.6 17.3 23.3 22.2 18. 31.1 29.7 30.4 22. 23.2 22.1 27.6 22.9 19.8
18.3 18.5 44.8 39.4 46. 49.9 35. 22.5 32. 22.8 38.1 23.6 23.3 18.4
15.6 11.3 13.3 13.9 16.1 20.8 23. 20.4 8.3 11.3 11.4 15.1 15.4 17.1
18.7 21.1 26.6 23. 20.4 21.6 26.8 9. 11.4 32.7 -1.6 -0.3 -1.3 -0.4
-3.9 1. 5.6 0.5 0. 4.5 6.8 7.8 4.2 1.1 4.2 5.5 0.8 12.
17. 5.8 17. 26.1 27.2 31.9 0.5 1.5 8.5 7.1 25.5 40. -5.7 -6.
12.5 4.4 -5. -1.3 -5. -5. -5. -5. -0.6 -0.6 2. 3.6 3.2 0.1
2.1 4.5 11. 2.7 3.5 -2. 1.2 9.3 2.6 7.1 6.1 3.2 5.1 7.5
1.8 4.3 4.4 0.8 9.9 7.6 7.9 8.9 10. 10.9 11.8 9.9 13.4 13.4
8.8 2.1 6. 7.1 -1.1 0.5 0.3 4.7 6. 6.5 8. 11.6 6.9 8.4
8.7 7.2 6.3 6.4 7.4 12.1 10.4 11.1 12.2 14.3 16.3 8.1 8.5 8.6
9. ]
As the error explains, you are passing a continuous target (your y) to a model that only handles binary, multiclass, or multilabel targets. Which model are you using? Check its documentation to see what it does and does not support.
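To see why the target is reported as continuous, here is a minimal sketch using scikit-learn's type_of_target helper, which returns the same labels that appear in the error message. The arrays are made up for illustration and are not the asker's data.

```python
import numpy as np
from sklearn.utils.multiclass import type_of_target

# Hypothetical targets, invented for this example.
y_continuous = np.array([24.4, 23.8, 24.0, 14.2, -1.6])  # real-valued measurements
y_binary = np.array([0, 1, 1, 0, 1])                      # two distinct class labels
y_multiclass = np.array([0, 1, 2, 1, 0])                  # three distinct class labels

# type_of_target inspects the values and reports how scikit-learn
# would interpret them as a prediction target.
kind_continuous = type_of_target(y_continuous)
kind_binary = type_of_target(y_binary)
kind_multiclass = type_of_target(y_multiclass)

print(kind_continuous, kind_binary, kind_multiclass)
```

Running type_of_target(y_train) on your own data should tell you which of these cases you are in before you call fit.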
-- FOLLOWING THE COMMENT
A continuous variable is one that can take an infinite (or very large) number of values. Your data is a good example: floats with four decimal places are clearly continuous. A binary variable, by contrast, is either '1' or '0', and a categorical variable takes one of a finite set of categories (such as 'January', 'February', ..., 'December', so only 12 possible categories). Many kinds of models handle continuous variables (while some accept ONLY categorical ones), so if you have no constraint on the choice of model, you can switch to one that handles them.
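As a rough sketch of what switching to a model that handles a continuous target looks like, the example below fits a plain scikit-learn LinearRegression on synthetic data. This is a stand-in, not the asker's setup: the data is randomly generated (only the 169 x 17 shape mirrors the question), and LinearRegression was picked purely for simplicity. auto-sklearn also provides a regression estimator, autosklearn.regression.AutoSklearnRegressor, which is probably the closer replacement here, but check its documentation before relying on it.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic data (an assumption for illustration): 169 samples, 17 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(169, 17))
true_coef = rng.normal(size=17)
y = X @ true_coef + rng.normal(scale=0.1, size=169)  # continuous target

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

# A regressor accepts a continuous y, unlike a classifier.
model = LinearRegression()
model.fit(X_train, y_train)

# score() returns the coefficient of determination (R^2) on held-out data.
r2 = model.score(X_test, y_test)
print(f"R^2 on held-out data: {r2:.3f}")
```

The fit call has the same shape as in the question; only the estimator type changes from a classifier to a regressor.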