How to use a TensorFlow dataset to fit a tf.keras model? Why do I need to batch?


When following the recommended way to create a dataset from numpy arrays via

dataset = tf.data.Dataset.from_tensor_slices((xdata, ydata))

the way to fit the model

model.fit(dataset)

creates the following error

ValueError: Input 0 of layer "sequential" is incompatible with the layer: expected shape=(None, 3, 2), found shape=(3, 2)

The arrays that go into the dataset are created via

xdata = np.zeros((observations, datapoints, features), dtype="float16")
ydata = np.zeros((observations, classes), dtype="float16")
for i in range(observations):
    xdata[i, :, :] = np.random.rand(datapoints, features)
    ydata[i, :] = np.random.rand(classes)

for a model with input layer

model = tf.keras.Sequential()
model.add(tf.keras.layers.Conv1D(64, 3, 3, input_shape = (datapoints,features,)))

with

datapoints=3
features=2

So each input sample is 2-dimensional, with shape (datapoints, features), and xdata and ydata are packed into one dataset.

For executable code, see Kaggle.

To boil the issue down to the simplest toy problem that still produces the error, see Kaggle.

Searching the internet was not very revealing. Unfortunately, all the official examples for fitting pass the inputs and targets as two separate arguments.

For earlier versions of TF, there was the now deprecated .make_one_shot_iterator(). Now it is possible to iterate directly through a dataset. Unfortunately, this does not seem to work for the fit method.

Interestingly, the official Keras documentation for the fit method states

x: Input data. It could be:

A tf.data.Dataset. Should return a tuple of either (inputs, targets) or (inputs, targets, sample_weights).

The same error is raised for tf.keras and for keras.

I found the solution, which is to call batch() on the dataset, here: keras.Model.compute_loss

Now the question: Why is batch() needed in order for the fit call to work?
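A minimal sketch of what seems to be going on, assuming TensorFlow 2.x: from_tensor_slices slices the arrays along their first axis, so each dataset element is a single sample of shape (3, 2), while fit treats every element a dataset yields as a complete batch and does not batch a dataset itself. Calling batch() stacks samples and restores the leading dimension the model expects. The values of observations and classes, the batch size of 4, the float32 dtype, and the Flatten/Dense layers and loss are assumptions added only to make the example run.

```python
import numpy as np
import tensorflow as tf

# datapoints and features come from the question;
# observations and classes are assumed values for illustration.
observations, datapoints, features, classes = 8, 3, 2, 4

xdata = np.random.rand(observations, datapoints, features).astype("float32")
ydata = np.random.rand(observations, classes).astype("float32")

# Slicing along axis 0: every element is ONE sample, with no batch dimension.
dataset = tf.data.Dataset.from_tensor_slices((xdata, ydata))
unbatched_x_shape = dataset.element_spec[0].shape.as_list()
print(unbatched_x_shape)  # [3, 2]

# batch() stacks consecutive samples and adds the leading batch dimension.
batched = dataset.batch(4)  # batch size 4 is an arbitrary choice
batched_x_shape = batched.element_spec[0].shape.as_list()
print(batched_x_shape)  # [None, 3, 2]

# A small model around the Conv1D layer from the question
# (Flatten, Dense and the loss are assumptions so that fit can run).
model = tf.keras.Sequential([
    tf.keras.Input(shape=(datapoints, features)),
    tf.keras.layers.Conv1D(64, 3, 3),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(classes),
])
model.compile(optimizer="adam", loss="mse")

# fit consumes each dataset element as one batch, so the batched dataset works.
history = model.fit(batched, epochs=1, verbose=0)
```

With the unbatched dataset, the model receives tensors of shape (3, 2), which matches the "found shape=(3, 2)" part of the error message.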
