Loading Parquet with petastorm to use it like a tf.data.Dataset

I've been trying to use petastorm to load Parquet data straight into TensorFlow, but I don't understand what is going on. I read the Parquet file like this:

from petastorm import make_batch_reader
from petastorm.tf_utils import make_petastorm_dataset
with make_batch_reader(dataset_url_or_urls=filepath) as reader:
    dataset = make_petastorm_dataset(reader)

This returns a tensorflow.python.data.ops.dataset_ops.DatasetV1Adapter. When I then try to iterate over it, I get the following error:

UnknownError: RuntimeError: Trying to read a sample after a reader created by make_reader/make_batch_reader has stopped. This may happen if the make_reader/make_batch_reader context manager has exited but you try to fetch a sample from it anyway Traceback (most recent call last):

How do I fix this? Also, how do I get a tf.data.Dataset instead of a DatasetV1Adapter, which I believe is an old version of tf.data.Dataset?
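
For what it's worth, the error message itself points at the likely cause: the dataset is built inside the with block but only iterated after the block has exited, by which point the reader has been stopped. Below is a minimal standard-library sketch of that mechanism. It does not use petastorm or TensorFlow at all; FakeReader, open_reader and lazy_dataset are hypothetical stand-ins invented for illustration, and the assumption is that a petastorm dataset is similarly lazy and only pulls rows from the reader when iterated.

```python
from contextlib import contextmanager


class FakeReader:
    """Hypothetical stand-in for a reader object; not the petastorm API."""

    def __init__(self, rows):
        self._rows = rows
        self.stopped = False

    def __iter__(self):
        for row in self._rows:
            # The check runs lazily, on every fetch, not at construction time.
            if self.stopped:
                raise RuntimeError("Trying to read a sample after the reader has stopped")
            yield row


@contextmanager
def open_reader(rows):
    reader = FakeReader(rows)
    try:
        yield reader
    finally:
        reader.stopped = True  # leaving the with block stops the reader


def lazy_dataset(reader):
    # Only builds a generator; nothing is read until it is iterated.
    return (row * 2 for row in reader)


# Failing pattern: dataset built inside the block, consumed after it.
with open_reader([1, 2, 3]) as reader:
    dataset = lazy_dataset(reader)
try:
    list(dataset)
    failed = False
except RuntimeError:
    failed = True

# Working pattern: consume the dataset while the reader is still open.
with open_reader([1, 2, 3]) as reader:
    results = list(lazy_dataset(reader))
```

If the same reasoning carries over, moving the loop that iterates over dataset inside the with make_batch_reader(...) block should avoid the RuntimeError, since the reader is still running while samples are fetched.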
