TensorFlow CNN error: InvalidArgumentError (see above for traceback): logits and labels must be same size

I wrote this code by modifying the official TensorFlow tutorial. The network is as follows:

import tensorflow as tf
import numpy as np

def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)

def bias_variable(shape):
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)

def conv2d(x, w):
    return tf.nn.conv2d(x, w, strides=[1, 1, 1, 1], padding='SAME')

def max_pool_2x2(x):
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')

train_feature = np.array(model.dataset[0])
train_label = np.array(model.labelset[0])
print(train_feature.shape)
print(train_label.shape)

x_placeholder = tf.placeholder(tf.float32, shape=[None, train_feature.shape[1]])
y_placeholder = tf.placeholder(tf.float32, shape=[None, 8])
# network structure
w_conv1 = weight_variable([5, 5, 1, 32])
b_conv1 = bias_variable([32])
x_image = tf.reshape(x_placeholder, [-1, 160, 120, 1])
h_conv1 = tf.nn.relu(conv2d(x_image, w_conv1) + b_conv1)
h_pool1 = max_pool_2x2(h_conv1)

w_conv2 = weight_variable([5, 5, 32, 64])
b_conv2 = bias_variable([64])

h_conv2 = tf.nn.relu(conv2d(h_pool1, w_conv2) + b_conv2)
h_pool2 = max_pool_2x2(h_conv2)

w_fc1 = weight_variable([320, 1024])
b_fc1 = bias_variable([1024])

h_pool2_flat = tf.reshape(h_pool2, [-1, 320])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, w_fc1) + b_fc1)

keep_prob = tf.placeholder(tf.float32)
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

w_fc2 = weight_variable([1024, 8])
b_fc2 = bias_variable([8])

y_conv = tf.matmul(h_fc1_drop, w_fc2) + b_fc2

cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(y_conv, y_placeholder))
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
sess = tf.InteractiveSession()
sess.run(tf.global_variables_initializer())
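# np.eye(8)[train_label] one-hot encodes the (10,) integer labels into shape (10, 8)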
batch = (train_feature, np.eye(8)[train_label])
train_step.run(feed_dict={x_placeholder: batch[0], y_placeholder: batch[1], keep_prob: 0.5})

The error is the following:

cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(y_conv, y_placeholder))
File "/Users/lintseju/anaconda/lib/python3.5/site-packages/tensorflow/python/ops/nn_ops.py", line 1449, in softmax_cross_entropy_with_logits
precise_logits, labels, name=name)
File "/Users/lintseju/anaconda/lib/python3.5/site-packages/tensorflow/python/ops/gen_nn_ops.py", line 2265, in _softmax_cross_entropy_with_logits
features=features, labels=labels, name=name)
File "/Users/lintseju/anaconda/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 759, in apply_op
op_def=op_def)
File "/Users/lintseju/anaconda/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 2240, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/Users/lintseju/anaconda/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1128, in __init__
self._traceback = _extract_stack()

InvalidArgumentError (see above for traceback): logits and labels must be same size: logits_size=[2400,8] labels_size=[10,8]
 [[Node: SoftmaxCrossEntropyWithLogits = SoftmaxCrossEntropyWithLogits[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"](Reshape_2, Reshape_3)]]

train_feature is a (10, 19200) NumPy array, and train_label is a (10,) NumPy array. Does anyone know why logits_size=[2400,8]?

1 Answer

Before passing an image (whether it is the network's input or an intermediate output of a convolutional (CONV) layer) to a fully-connected (FC) layer, you have to make sure that (1) you properly reshape the image into a 1D vector and (2) the dimensions of the network's weights agree with that 1D vector. In your case, the switch to FC layers happens after the second pooling layer. While you do make h_pool2_flat's dimension compatible with the weights w_fc1 of the first FC layer, you are not setting the flat dimension correctly: the hard-coded value 320 is wrong here. With a 160x120x1 input, h_pool2 has shape [10, 40, 30, 64] (each 2x2 pooling halves both spatial dimensions), so reshaping its 10 * 40 * 30 * 64 = 768000 elements into rows of length 320 produces 768000 / 320 = 2400 rows, which is exactly the logits_size=[2400,8] in the error.

Hard-coding this dimension is also bad practice in general: the code will keep breaking whenever you modify the size of the input or the convolutional stack of the network (e.g. by adding/removing pooling layers or adjusting the strides of some layers).
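
For reference, you can confirm this by printing the static shapes TensorFlow infers while building the graph (a quick check, not part of the original code):

print(x_image.get_shape())   # (?, 160, 120, 1)
print(h_pool1.get_shape())   # (?, 80, 60, 32), after the first 2x2 max-pool
print(h_pool2.get_shape())   # (?, 40, 30, 64), i.e. 40 * 30 * 64 = 76800 values per image
# Reshaping a batch of 10 such images (768000 values in total) into rows of
# length 320 gives 768000 / 320 = 2400 rows, hence logits_size=[2400,8].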

Instead, you should add some code to compute the flat dimension automatically and use the computed value to set up the dimensions, as in the following example:

# This is where the conversion from 2D/3D feature maps to 1D vectors happens.
# Don't hard-code the flat dimension. Rather, (1) derive it from the tensor's
# static shape by multiplying the feature map's height, width and depth.
h_pool2_shape = h_pool2.get_shape().as_list()  # [None, height, width, depth]
h_pool2_dim = h_pool2_shape[1] * h_pool2_shape[2] * h_pool2_shape[3]
# (2) Use the computed 1D dimension to set the FC1 weight matrix dimensions.
w_fc1 = weight_variable([h_pool2_dim, 1024])
b_fc1 = bias_variable([1024])
# (3) Use the same 1D dimension to correctly reshape the batch matrix.
h_pool2_flat = tf.reshape(h_pool2, [-1, h_pool2_dim])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, w_fc1) + b_fc1)
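
With this change the flat dimension is computed from the graph itself. As a quick sanity check (assuming the same 160x120x1 input as in the question):

print(h_pool2_dim)               # 76800, i.e. 40 * 30 * 64
print(h_pool2_flat.get_shape())  # (?, 76800)
print(h_fc1.get_shape())         # (?, 1024)
# The logits y_conv now come out as [10, 8] for a batch of 10 images,
# matching labels_size=[10, 8].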