Which Kedro dataset should be used when working with images and the Keras ImageDataGenerator? I know there is ImageDataset, but the number of images is too large to fit in memory. All that ImageDataGenerator really needs is the path to a local folder containing the image dataset, laid out like this:
data/
    train/
        dogs/
            dog001.jpg
            dog002.jpg
            ...
        cats/
            cat001.jpg
            cat002.jpg
            ...
    validation/
        dogs/
            dog001.jpg
            dog002.jpg
            ...
        cats/
            cat001.jpg
            cat002.jpg
            ...
It would be possible to use a parameter specifying the data location, but I think the appropriate place to declare where data lives is the Data Catalog. Is there a simple way to specify this data location in the Data Catalog?
How about setting the path in parameters.yml and then passing it as an input to your ImageDataGenerator? Adapt the details to whatever works best for your project. You can also consider setting a global path for all datasets in the conf/base/globals.yml file, for example for your root data folder.
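As a minimal sketch of the parameters approach, assume parameters.yml contains a hypothetical key `image_data_dir` pointing at the `data/` folder from the question. The function names `split_dirs` and `build_generators`, the image size, and the batch size are all illustrative assumptions, not part of Kedro or Keras. The images are streamed from disk by `flow_from_directory`, so nothing has to fit in memory at once.

```python
from pathlib import Path

# Assumed entry in conf/base/parameters.yml (hypothetical key and path):
#   image_data_dir: data/01_raw/images


def split_dirs(image_data_dir: str) -> dict:
    """Resolve the train/ and validation/ folders under the configured root."""
    root = Path(image_data_dir)
    dirs = {"train": root / "train", "validation": root / "validation"}
    missing = [str(p) for p in dirs.values() if not p.is_dir()]
    if missing:
        raise FileNotFoundError(f"Missing image folders: {missing}")
    return dirs


def build_generators(image_data_dir: str, batch_size: int = 32):
    """Hypothetical helper: build Keras generators that read images from disk."""
    # Imported lazily so split_dirs stays usable without TensorFlow installed.
    # The import path is an assumption and may differ across Keras versions.
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    dirs = split_dirs(image_data_dir)
    datagen = ImageDataGenerator(rescale=1.0 / 255)  # scale pixels to [0, 1]
    train = datagen.flow_from_directory(
        str(dirs["train"]),
        target_size=(150, 150),  # illustrative image size
        batch_size=batch_size,
        class_mode="binary",  # two classes: dogs and cats
    )
    validation = datagen.flow_from_directory(
        str(dirs["validation"]),
        target_size=(150, 150),
        batch_size=batch_size,
        class_mode="binary",
    )
    return train, validation
```

A training node could then declare `params:image_data_dir` among its inputs and call `build_generators` on the string it receives. Building the generators inside the node that trains the model avoids having to pass generator objects between nodes.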