I am using a simple network for classification, but the output of model.predict is one-hot encoded. I printed the predictions inside the loss function, and there they look correct.
from tensorflow.keras.layers import Input, LSTM, Dropout, Dense
from tensorflow.keras.models import Model

ip = Input(shape=(padding_size, n_features))
x = LSTM(50, activation='relu')(ip)
x = Dropout(0.2)(x)
out = Dense(3, activation='softmax')(x)
model = Model(ip, out)
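For context, the model is compiled with the custom loss defined below; a minimal sketch (the optimizer, epochs, batch size, and the X_train/y_train names are placeholders, not my exact setup):

model.compile(optimizer='adam', loss=odds_loss)
# each row of y_train: 3 one-hot label columns + 3 extra feature columns
model.fit(X_train, y_train, epochs=10, batch_size=32)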
I am using a custom loss function. The first three items of y_true are the one-hot encoding of the label; the remaining three are extra values used inside the function (see the sketch after the function below).
from tensorflow.keras import backend as K

def odds_loss(y_true, y_pred):
    print(y_pred)  # see below
    print(y_true)  # see below
    # first three columns: one-hot encoding of the label
    true_1 = y_true[:, 0:1]
    true_2 = y_true[:, 1:2]
    true_3 = y_true[:, 2:3]
    # last three columns: extra per-sample values
    feature_a = y_true[:, 3:4]
    feature_b = y_true[:, 4:5]
    feature_c = y_true[:, 5:6]
    vector = K.concatenate([
        true_1 * (feature_a - 0.3),
        true_2 * (feature_b - 3.3) + K.log(feature_b - 1),
        true_3 * (feature_c - 2.8)
    ], axis=1)
    return -1 * K.mean(K.sum(vector * y_pred, axis=1))
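To make the y_true layout concrete, here is how such a 6-column target can be assembled; a minimal sketch using the batch values printed below (labels, one_hot, and feats are hypothetical names):

import numpy as np

labels = np.array([0, 1, 1, 2])             # class indices for the batch
one_hot = np.eye(3)[labels]                 # first three columns
feats = np.array([[1.467, 3.252, 5.312],
                  [1.251, 2.851, 4.665],
                  [1.642, 2.319, 5.133],
                  [1.635, 1.996, 3.985]])   # feature_a/b/c columns
y_batch = np.hstack([one_hot, feats])       # shape (4, 6)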
The output of those prints, for a batch of four elements, is:
tf.Tensor(
[[0.33180252 0.3375508  0.3306467 ]
 [0.3335974  0.3334575  0.33294514]
 [0.33349878 0.333613   0.33288828]
 [0.3354906  0.3324541  0.33205533]], shape=(4, 3), dtype=float32)
tf.Tensor(
[[1.    0.    0.    1.467 3.252 5.312]
 [0.    1.    0.    1.251 2.851 4.665]
 [0.    1.    0.    1.642 2.319 5.133]
 [0.    0.    1.    1.635 1.996 3.985]], shape=(4, 6), dtype=float32)
As expected at this stage, it's just random guessing. However, if I run predict on some test data, I only get one-hot encoded arrays, all identical:
y_pred = model.predict(X_test)
print(y_pred)
[[0. 0. 1.]
[0. 0. 1.]
[0. 0. 1.]
[0. 0. 1.]
[0. 0. 1.]
[0. 0. 1.]
[0. 0. 1.]
[0. 0. 1.]
[0. 0. 1.]
[0. 0. 1.]
[0. 0. 1.]
[0. 0. 1.]
[0. 0. 1.]
[0. 0. 1.]
[0. 0. 1.]
[0. 0. 1.]
[0. 0. 1.]
[0. 0. 1.]
[0. 0. 1.]
[0. 0. 1.]]
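To rule out display rounding, the raw probabilities can also be printed at full precision; a minimal sketch (np.set_printoptions only changes how the array is displayed):

import numpy as np

np.set_printoptions(precision=8, suppress=False)
print(y_pred)              # exact zeros/ones, or just rounded for display?
print(y_pred.sum(axis=1))  # each softmax row should sum to ~1.0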
I manually checked the operations inside the loss function, and the returned value is as expected. Why am I not getting the same output as inside the loss function?
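For reference, here is the manual check mentioned above, re-implemented in NumPy on the batch printed earlier (a sketch that mirrors odds_loss term by term):

import numpy as np

y_pred = np.array([[0.33180252, 0.3375508 , 0.3306467 ],
                   [0.3335974 , 0.3334575 , 0.33294514],
                   [0.33349878, 0.333613  , 0.33288828],
                   [0.3354906 , 0.3324541 , 0.33205533]])
y_true = np.array([[1., 0., 0., 1.467, 3.252, 5.312],
                   [0., 1., 0., 1.251, 2.851, 4.665],
                   [0., 1., 0., 1.642, 2.319, 5.133],
                   [0., 0., 1., 1.635, 1.996, 3.985]])

true_1, true_2, true_3 = y_true[:, 0:1], y_true[:, 1:2], y_true[:, 2:3]
feature_a, feature_b, feature_c = y_true[:, 3:4], y_true[:, 4:5], y_true[:, 5:6]
vector = np.concatenate([
    true_1 * (feature_a - 0.3),
    true_2 * (feature_b - 3.3) + np.log(feature_b - 1),
    true_3 * (feature_c - 2.8),
], axis=1)
print(-1 * np.mean(np.sum(vector * y_pred, axis=1)))  # same formula as odds_loss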