I tried to split the cats_vs_dogs dataset using the split argument of tfds.load, but I cannot check whether it worked. When I read num_examples from train_info and val_info, I get the same number for both: 23262. Why?
import tensorflow_datasets as tfds

# Full dataset (cats_vs_dogs only ships a 'train' split)
dataset, info = tfds.load(
    'cats_vs_dogs',
    split='train',
    shuffle_files=True,
    as_supervised=True,
    with_info=True
)
# First 80% of 'train' for training
ds_train, train_info = tfds.load(
    'cats_vs_dogs',
    split='train[:80%]',
    shuffle_files=True,
    as_supervised=True,
    with_info=True
)
# Last 20% of 'train' for validation
ds_val, val_info = tfds.load(
    'cats_vs_dogs',
    split='train[-20%:]',
    shuffle_files=True,
    as_supervised=True,
    with_info=True
)
print(train_info.splits['train'].num_examples)
print(train_info.splits['train'].num_shards)
print(val_info.splits['train'].num_examples)
print(val_info.splits['train'].num_shards)
I got 23262 examples from train_info, info, and val_info!
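The reason is that with_info=True always returns the info object for the whole dataset, no matter which split was requested, so splits['train'] reports the full 23262 every time. To get the number of examples in each slice, we have to ask for train_info.splits['train[:80%]'].num_examples and val_info.splits['train[-20%:]'].num_examples.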
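Here is a minimal sketch of that check. The helper names split_sizes and print_split_sizes are my own, not part of TFDS; the only TFDS features used are tfds.load and indexing info.splits with a slice string. It assumes the dataset can be downloaded or is already on disk.

```python
def split_sizes(info, split_specs):
    """Map each split spec (e.g. 'train[:80%]') to its example count.

    `info` is expected to be the DatasetInfo returned by tfds.load(..., with_info=True).
    """
    return {spec: info.splits[spec].num_examples for spec in split_specs}


def print_split_sizes():
    # Imported here so split_sizes() can be reused without loading TFDS.
    import tensorflow_datasets as tfds

    # The info object is the same whichever split is loaded,
    # so one load is enough to query every slice.
    _, info = tfds.load('cats_vs_dogs', split='train', with_info=True)

    specs = ['train', 'train[:80%]', 'train[-20%:]']
    for spec, count in split_sizes(info, specs).items():
        print(f'{spec}: {count}')
```

Calling print_split_sizes() should print the full count for 'train' and a smaller count for each of the two slices, which confirms the split did what was intended.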
As an alternative for splitting the dataset, tf.keras.preprocessing.image_dataset_from_directory worked better for me.
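For reference, a rough sketch of the image_dataset_from_directory approach. It assumes the images are already extracted into a folder with one subfolder per class (the data_dir path is hypothetical), and the helper names, image size, batch size and seed are illustrative choices of mine. The important part is that both calls use the same validation_split and seed, otherwise the two subsets can overlap. Newer TensorFlow versions also expose the same function as tf.keras.utils.image_dataset_from_directory.

```python
def subset_kwargs(subset, validation_split=0.2, seed=123):
    """Arguments that must be identical in the 'training' and 'validation' calls."""
    if subset not in ('training', 'validation'):
        raise ValueError("subset must be 'training' or 'validation'")
    return {'validation_split': validation_split, 'subset': subset, 'seed': seed}


def load_train_val(data_dir, image_size=(180, 180), batch_size=32):
    # Imported here so subset_kwargs() can be used without TensorFlow installed.
    import tensorflow as tf

    loader = tf.keras.preprocessing.image_dataset_from_directory
    # 80% of the files go to training, the remaining 20% to validation.
    train_ds = loader(data_dir, image_size=image_size, batch_size=batch_size,
                      **subset_kwargs('training'))
    val_ds = loader(data_dir, image_size=image_size, batch_size=batch_size,
                    **subset_kwargs('validation'))
    return train_ds, val_ds
```

When each call runs, Keras reports how many files were found and how many are used for that subset, so the size of each split is visible immediately.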