Building a tf.data.Dataset object with images as features and .csv files as labels


I'm working on a deep learning project that tries to automatically detect people's joints in an image, and I am stuck trying to feed the data to my neural network in the correct format. My features (x) are very large images (2000x900) and my labels (y) are .csv files with 15 rows and four columns: the first column contains a string (the name of the joint) and the other three columns contain integers. The .csv files look like this:

Left_knee, vis, x, y
Right_knee, vis, x, y
...

(The x and y here are image coordinates, not features and labels! vis is 0 or 1, indicating whether the joint is visible.)

Each .csv file corresponds to a specific image, and the .csv files and corresponding images have the same name but different paths. Now, I want to create a tf.data.Dataset object where the features are the images, and the labels are Python dictionaries built from the .csv files. So for example, a single label y(i) corresponding to an image x(i) would need to look like this: {'Left_knee': [vis, x, y], 'Right_knee': [vis, x, y], ...}.

My strategy for constructing such a Dataset was to load the images and the labels into separate tf.data.Dataset objects and then fuse them together. For loading the images, I wrote this very barebones (and perhaps inefficient/wrong?) code:

import pathlib

import tensorflow as tf

imgs_path = pathlib.Path('path/to/images')
list_imgs = tf.data.Dataset.list_files(str(imgs_path/'*'))

def imgs_to_dataset(file_path):
    # Read each file's contents as a raw byte string.
    return tf.io.read_file(file_path)

imgs_dataset = list_imgs.map(imgs_to_dataset)
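From what I've read, tf.io.read_file only returns the raw bytes of the file, so I suspect I'll also need a decoding step, maybe something like this (assuming my images are JPEGs):

def imgs_to_dataset(file_path):
    raw = tf.io.read_file(file_path)
    # Decode the raw bytes into a uint8 image tensor.
    return tf.io.decode_jpeg(raw, channels=3)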

To be honest, I'm too new to TensorFlow (and programming in general!) to know how to test this properly and spot potential issues, but at least it doesn't give me any errors.

Now, my problem is how to load the .csv files into a tf.data.Dataset object, and then fuse it with imgs_dataset so that the right labels go to each image. I understand I have to use something like tf.data.experimental.make_csv_dataset, but I'm not quite sure how to set it up so that my y ends up in the format that I want. Is there a way to do this, or am I going the wrong way about it?

I should clarify that I have no attachment to the idea of using a tf.data.Dataset object, but from the little that I know it seems like a very convenient (if you can set it up!) way to feed data to a tf.keras model through .fit(). Also, I want the labels to be structured in that specific way (i.e. as dictionaries) because my network's loss will need to access the different fields of the labels (for instance, the loss will be lower if, for a given image and a given joint, the joint's vis parameter is 0). But maybe there's a more efficient way to structure my labels to achieve this goal?
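Just to illustrate what I mean, here is a rough sketch (in plain Python, with made-up names, so not what I'd actually pass to .fit()) of how such a loss could use the label dictionary, with the vis flag masking out joints that aren't visible:

def joint_loss(y_true, y_pred):
    # Rough illustration: only visible joints (vis == 1) contribute
    # to the squared error on the predicted (x, y) coordinates.
    total = 0.0
    for name, (vis, x, y) in y_true.items():
        pred_x, pred_y = y_pred[name]
        total += vis * ((x - pred_x) ** 2 + (y - pred_y) ** 2)
    return total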

Any help and suggestions will be greatly appreciated! Thanks in advance.

1 Answer

I think tf.data.Dataset is a good approach. You do not need to create two Datasets and fuse them together: you can first load your dataset using e.g. from_tensor_slices, list_files, or from_generator, and then apply map functions, which post-process your images if needed and load and post-process your labels too, i.e. transform them from a pd.DataFrame to a dict. Your map function would then return tuples of images as tensors with their labels as dicts. After map, you can apply shuffling and batching as well.

Example from the docs:

dataset = tf.data.Dataset.range(5)  # ==> [0, 1, 2, 3, 4]
# `map_func` takes a single argument of type `tf.Tensor` with the same
# shape and dtype.
result = dataset.map(lambda x: x + 1)  # ==> [1, 2, 3, 4, 5]

How map should be structured depends highly on the structure and naming of your files. You will probably want to replace the lambda function with your own custom one that reads the image, parses the matching .csv file into a dict, and returns the (image, label) pair.
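For instance, here is a minimal sketch of such a function. It assumes the images are JPEGs, the .csv files live in a parallel labels directory under the same file names, and JOINT_NAMES lists your 15 joint names in a fixed order; all paths and names here are placeholders:

import pathlib

import numpy as np
import pandas as pd
import tensorflow as tf

IMG_DIR = pathlib.Path('path/to/images')   # placeholder paths
CSV_DIR = pathlib.Path('path/to/labels')
JOINT_NAMES = ['Left_knee', 'Right_knee']  # ...plus the other 13 joints

def parse_csv(path):
    # Plain-Python parsing: look up the .csv with the same stem as the image.
    path = pathlib.Path(path.numpy().decode())
    df = pd.read_csv(CSV_DIR / (path.stem + '.csv'), header=None, index_col=0)
    # One [vis, x, y] row per joint, in the fixed JOINT_NAMES order.
    return [df.loc[name].to_numpy(dtype=np.float32) for name in JOINT_NAMES]

def load_example(file_path):
    image = tf.io.decode_jpeg(tf.io.read_file(file_path), channels=3)
    # pandas cannot run inside a TensorFlow graph, so wrap it in py_function.
    values = tf.py_function(parse_csv, [file_path],
                            [tf.float32] * len(JOINT_NAMES))
    label = dict(zip(JOINT_NAMES, values))
    return image, label

dataset = (tf.data.Dataset.list_files(str(IMG_DIR / '*'))
           .map(load_example)
           .shuffle(100)
           .batch(8))

One caveat: tensors coming out of tf.py_function have unknown static shapes, so you may need to call set_shape on the image and label tensors inside load_example before batching.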