Warning: The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset


This occurred while I was using tf.data.Dataset:

The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset will be discarded. This can happen if you have an input pipeline similar to dataset.cache().take(k).repeat(). You should use dataset.take(k).cache().repeat() instead.

According to other questions, for example this one, it has to do with where cache() appears in the chain of dataset methods, but I can't work out what to do concretely.
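As a minimal sketch of the reordering the warning message suggests (using tf.data.Dataset.range as a stand-in dataset, not the original iris pipeline):

```python
import tensorflow as tf

ds = tf.data.Dataset.range(10)

# Wrong order: take(3) truncates the read, so the partially cached
# contents are discarded on each pass and the warning fires.
bad = ds.cache().take(3).repeat(2)

# Suggested order: take(3) runs first, so cache() stores exactly the
# three elements that every repeat then fully reads.
good = ds.take(3).cache().repeat(2)

print(list(good.as_numpy_iterator()))  # [0, 1, 2, 0, 1, 2]
```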

Here's how to reproduce the warning:

import tensorflow_datasets as tfds

ds = tfds.load('iris', split='train')
ds = ds.take(100)  # iris has 150 examples, so this stops before the end

for elem in ds:
    pass

Seems like no matter what I do, and no matter where I use cache(), the warning pops up.

1 answer below
I tried to run your code on Google Colab and it ran successfully without giving any warning; I'm using TensorFlow 2.3.

However, you can follow this general rule when using cache():

If the dataset is small enough to fit in memory, you can significantly speed up training by using the dataset's cache() method to cache its contents to RAM. You should generally do this after loading and preprocessing the data, but before shuffling, repeating, batching, and prefetching. This way, each instance is read and preprocessed only once (instead of once per epoch).
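A sketch of that ordering on a toy in-memory dataset (tf.data.Dataset.range standing in for the loaded data; a real pipeline would start from tfds.load or from_tensor_slices):

```python
import tensorflow as tf

# Toy stand-in for the loaded dataset.
ds = tf.data.Dataset.range(100)

ds = ds.map(lambda x: x * 2)        # preprocess first
ds = ds.cache()                     # then cache the preprocessed data
ds = ds.shuffle(buffer_size=100)    # shuffle/batch/prefetch come last
ds = ds.batch(32)
ds = ds.prefetch(tf.data.AUTOTUNE)

for batch in ds.take(1):
    print(batch.shape)  # (32,)
```

Because cache() sits before shuffle(), the expensive map() work is done once, while the shuffle order still changes every epoch.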