I have a dataset for image captioning. Each image has a different number of captions (sentences): some images have seven captions, while others may have ten or more. I used the following code for dataset creation:
import tensorflow as tf

AUTOTUNE = tf.data.AUTOTUNE

def make_dataset(videos, captions):
    # captions is ragged: each video has a different number of captions
    dataset = tf.data.Dataset.from_tensor_slices(
        (videos, tf.ragged.constant(captions)))
    dataset = dataset.shuffle(BATCH_SIZE * 8)
    dataset = dataset.map(process_input, num_parallel_calls=AUTOTUNE)
    dataset = dataset.batch(BATCH_SIZE).prefetch(AUTOTUNE)
    return dataset
This code works fine only when BATCH_SIZE = 1. When I try BATCH_SIZE = 2 or more, I get the following error:
InvalidArgumentError: Cannot add tensor to the batch: number of elements does not match. Shapes are: [tensor]: [7,20], [batch]: [10,20] [Op:IteratorGetNext]
Is there a way to batch these variable-length elements without padding?
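For concreteness, here is a sketch of the behavior I'm after, using `Dataset.ragged_batch` (added in TF 2.11, if I understand correctly; older versions seem to have `tf.data.experimental.dense_to_ragged_batch` instead) on dummy data. I'm not sure this is the right approach:

```python
import tensorflow as tf

# Dummy tokenized captions: 7 vs 10 captions, each of length 20.
captions = [[[1] * 20] * 7, [[1] * 20] * 10]

dataset = tf.data.Dataset.from_tensor_slices(
    tf.ragged.constant(captions, ragged_rank=1))

# ragged_batch keeps each element's own caption count;
# the batch comes out as a RaggedTensor, with no padding.
for batch in dataset.ragged_batch(2):
    print(batch.shape)  # (2, None, 20)
```

Would that work with `shuffle`/`map`/`prefetch` in my pipeline, or is there a better way?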