I am trying to build a neural network for categorical data in Python (3.5).
I have a table with 47 independent variables (X) and a table with one column for the dependent variable (y). This variable is categorical and takes one of three possible values.
Because of this, I encode it with LabelEncoder() so that the variable is now 0, 1, or 2.
Then I split those numbers into three columns with OneHotEncoder and delete the last column. Why: because two columns of 1s and 0s are enough to distinguish 3 possible outcomes.
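The two encoding steps described above can be sketched with plain NumPy. This is only an illustration: the label values are hypothetical, and np.unique and np.eye stand in for LabelEncoder and OneHotEncoder rather than reproducing them.

```python
import numpy as np

# Hypothetical labels standing in for the real dependent variable.
y = np.array(["first", "second", "third", "second", "first"])

# Integer-encode: the distinct labels are sorted, then each label
# is replaced by its index in that sorted list.
classes, y_int = np.unique(y, return_inverse=True)

# One-hot encode: row i has a single 1 in column y_int[i].
y_onehot = np.eye(len(classes))[y_int]

# Dropping the last column still keeps the three classes apart,
# because the dropped class becomes the all-zero row.
y_dropped = y_onehot[:, :-1]
```

Note that after dropping a column the rows no longer all sum to 1, so the target has two columns while a three-unit softmax layer produces three.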
For the neural network I use softmax on the output layer and categorical_crossentropy as the loss function (this should be used for categorical data).
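For reference, here is a minimal NumPy sketch of what softmax and categorical cross-entropy compute. The function names, the eps guard, and the sample numbers are assumptions made for illustration; this is not the Keras implementation.

```python
import numpy as np

def softmax(logits):
    # Subtract the row-wise maximum for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def categorical_crossentropy(y_true, y_prob, eps=1e-12):
    # Mean over samples of -sum_k y_true[k] * log(y_prob[k]).
    # eps is a hypothetical guard against log(0).
    return float(-np.mean(np.sum(y_true * np.log(y_prob + eps), axis=1)))

# Hypothetical raw outputs of a 3-unit output layer for two samples.
logits = np.array([[2.0, 0.5, -1.0],
                   [0.0, 0.0, 0.0]])
# One-hot targets with one column per class.
y_true = np.array([[1.0, 0.0, 0.0],
                   [0.0, 0.0, 1.0]])

probs = softmax(logits)
loss = categorical_crossentropy(y_true, probs)
```

Each row of probs sums to 1, and the loss only looks at the probability assigned to the column marked 1 in y_true, which is why the target needs one column per output unit.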
When I run my code, I get this error:
classification.py in _check_targets(y_true=array([[ 1., 0., 0.],
[ 0., 1., 0.],
...
[ 0., 0., 1.],
[ 0., 1., 0.]]), y_pred=array([2, 2, 2, 2, 2]))
77 if y_type == set(["binary", "multiclass"]):
78 y_type = set(["multiclass"])
79
80 if len(y_type) > 1:
81 raise ValueError("Can't handle mix of {0} and {1}"
---> 82 "".format(type_true, type_pred))
type_true = 'multilabel-indicator'
type_pred = 'binary'
83
84 # We can't have more than one value on y_type => The set is no more needed
85 y_type = y_type.pop()
86
ValueError: Can't handle mix of multilabel-indicator and binary
I don't understand the error: type_true is probably the type of the true data (the real data that I have), and I can see that they are binary.
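One possible reading of the traceback is that the two arrays are in different representations: y_true holds one-hot rows (2-D), while y_pred holds plain class indices (1-D). The sketch below uses shortened, hypothetical values modeled on the traceback and shows one way to bring both to the same representation. It is an assumption about the cause, not a confirmed diagnosis.

```python
import numpy as np

# Values modeled on the traceback (shortened, hypothetical):
# one-hot rows for y_true, plain class indices for y_pred.
y_true = np.array([[1., 0., 0.],
                   [0., 1., 0.],
                   [0., 0., 1.],
                   [0., 1., 0.]])
y_pred = np.array([2, 2, 2, 2])

# The arrays differ in dimensionality: 2-D versus 1-D.
dims = (y_true.ndim, y_pred.ndim)

# Collapsing each one-hot row to the index of its 1 puts both
# arrays in the same class-index representation.
y_true_idx = y_true.argmax(axis=1)

# With matching representations an element-wise comparison works.
accuracy = float(np.mean(y_true_idx == y_pred))
```

Under this reading, 'multilabel-indicator' describes the 2-D array of 0s and 1s, and 'binary' describes a 1-D prediction array that happens to contain at most two distinct values.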
P.S.
If I remove two columns from y instead of one (so that only one column is left) and use a sigmoid output with the binary_crossentropy loss function, I don't get any error. So the data preparation seems to be OK?
P.P.S.
My code looks like this:
# Encoding the dependent variable
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
# y is like [['first'], ['second'], ['third'],...]
labelencoder_y_1 = LabelEncoder()
y[:, 0] = labelencoder_y_1.fit_transform(y[:, 0])
onehotencoder_y = OneHotEncoder(categorical_features = [0])
y = onehotencoder_y.fit_transform(y).toarray()
# Splitting the dataset into the Training set and Test set
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2,
                                                    random_state = 0)
# Feature Scaling
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
# Tuning the ANN
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import GridSearchCV
from keras.models import Sequential
from keras.layers import Dense
def build_classifier(optimizer, units, layers):
    classifier = Sequential()
    # Input layer plus first hidden layer (47 input features)
    classifier.add(Dense(units = units, kernel_initializer = 'uniform', activation = 'relu', input_dim = 47))
    # Additional hidden layers
    for i in range(layers):
        classifier.add(Dense(units = units, kernel_initializer = 'uniform', activation = 'relu'))
    # Output layer: one unit per class
    classifier.add(Dense(units = 3, kernel_initializer = 'uniform', activation = 'softmax'))
    classifier.compile(optimizer = optimizer, loss = 'categorical_crossentropy', metrics = ['accuracy'])
    return classifier
classifier = KerasClassifier(build_fn = build_classifier)
parameters = {'batch_size': [32],
              'epochs': [64],
              'optimizer': ['rmsprop'],
              'units': [16],
              'layers': [2]}
grid_search = GridSearchCV(estimator = classifier,
                           param_grid = parameters,
                           scoring = 'accuracy',
                           cv = 10,
                           n_jobs = 3)
grid_search = grid_search.fit(X_train, y_train)
best_parameters = grid_search.best_params_
best_accuracy = grid_search.best_score_
EDIT: I edited my question to keep all three columns, following a comment from @djk47463.