I want to feed a NumPy array into a CNN. Each sample contains 2 chess positions: one before a move and one after that move. I want to train the CNN to estimate the evaluation that a conventional chess program assigns to the move. These evaluations are int values.
The shapes of x and y are: x: (2000000, 8, 8, 2), y: (2000000,)
The model code:
#define model
model = Sequential()
#model.add(Dense(1024, activation='relu', input_dim=864))
model.add(Conv2D(128, kernel_size=(3, 3), strides=(1, 1), activation='relu', input_shape=(8,8,2)))
model.add(Conv2D(128, kernel_size=(3, 3), strides=(1, 1), activation='relu'))
model.add(Dense(128, activation='relu', init='uniform'))
model.add(BatchNormalization())
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam',metrics=['mae'])
print(model.summary())
The training is done with:
history = model.fit(x, y, validation_split=0.1, epochs=5, batch_size=20000, verbose=2)
It gives me the following error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-7-619de3f1be1b> in <module>()
171 for i in range(5):
172 print("Fitting begins", x.shape, y.shape)
--> 173 history = model.fit(x, y, validation_split=0.1, epochs=5, batch_size=20000, verbose=2)
174 #score = model.evaluate(x, y, verbose=2)
175 #print(score)
/usr/local/lib/python3.6/site-packages/keras/engine/training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, **kwargs)
950 sample_weight=sample_weight,
951 class_weight=class_weight,
--> 952 batch_size=batch_size)
953 # Prepare validation data.
954 do_validation = False
/usr/local/lib/python3.6/site-packages/keras/engine/training.py in _standardize_user_data(self, x, y, sample_weight, class_weight, check_array_lengths, batch_size)
787 feed_output_shapes,
788 check_batch_axis=False, # Don't enforce the batch size.
--> 789 exception_prefix='target')
790
791 # Generate sample-wise weight values given the `sample_weight` and
/usr/local/lib/python3.6/site-packages/keras/engine/training_utils.py in standardize_input_data(data, names, shapes, check_batch_axis, exception_prefix)
126 ': expected ' + names[i] + ' to have ' +
127 str(len(shape)) + ' dimensions, but got array '
--> 128 'with shape ' + str(data_shape))
129 if not check_batch_axis:
130 data_shape = data_shape[1:]
ValueError: Error when checking target: expected dense_10 to have 4 dimensions, but got array with shape (2000000, 1)
What am I doing wrong? How can I fix this?
OK, I realized that the problem is related to the output shape of the last layer:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_13 (Conv2D) (None, 6, 6, 128) 2432
_________________________________________________________________
conv2d_14 (Conv2D) (None, 4, 4, 128) 147584
_________________________________________________________________
dense_14 (Dense) (None, 4, 4, 128) 16512
_________________________________________________________________
batch_normalization_7 (Batch (None, 4, 4, 128) 512
_________________________________________________________________
dense_15 (Dense) (None, 4, 4, 1) 129
=================================================================
But why is it (None, 4, 4, 1)? Shouldn't it be (None, 1)? It's a single neuron with 1 value!
No, it shouldn't. The Dense layer is applied on the last axis of its input. In this case it is applied to the output of a Conv2D layer, which is a 4D tensor, so the output of the Dense layer is also a 4D tensor. To resolve this, you can first flatten the output of the Conv2D layer using a Flatten layer and then use the Dense layers, like this:
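A minimal sketch of that fix, assuming the same architecture as in the question and standalone `keras` imports (the imports are not shown in the question, so they are a guess). The only structural change is the `Flatten` layer; the legacy `init='uniform'` argument is left out here, so the default initializer is used instead:

```python
from keras.models import Sequential
from keras.layers import Conv2D, Flatten, Dense, BatchNormalization

model = Sequential()
model.add(Conv2D(128, kernel_size=(3, 3), strides=(1, 1), activation='relu', input_shape=(8, 8, 2)))
model.add(Conv2D(128, kernel_size=(3, 3), strides=(1, 1), activation='relu'))
# Collapse the 4D conv output (None, 4, 4, 128) into 2D (None, 2048)
# so the Dense layers see one feature vector per sample.
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(BatchNormalization())
model.add(Dense(1))  # output shape is now (None, 1), matching y
model.compile(loss='mean_squared_error', optimizer='adam', metrics=['mae'])
print(model.output_shape)
```

With the output shape reduced to (None, 1), the target array of shape (2000000,) should pass the shape check in `model.fit`.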