I'm trying to implement batch hard triplet loss, as seen in Section 3.2 of https://arxiv.org/pdf/2004.06271.pdf.
I need to import my images so that each batch has exactly K instances of each ID in a particular batch. Therefore, each batch must be a multiple of K.
I have a directory of images too large to fit into memory and therefore I am using ImageDataGenerator.flow_from_directory()
to import the images, but I can't see any parameters for this function to allow the functionality I need.
How can I achieve this batch behaviour using Keras?
You can try merging several data streams together in a controlled manner.
Given you have K instances of
tf.data.Dataset
(does not matter how you instantiate them) that are responsible for supplying training instances of particular IDs, you can concatenate them to get even distribution inside a mini-batch:where the
concat_datasets
is the merge function: