MLOps with TFX: How to ingest data when using Sequence from Keras?

119 Views Asked by Alexander Martins At 25 June 2025 at 02:39

I am using a class called DataGenerator that returns a tuple (data_array, label_array). The code is as follows:

from tensorflow.keras.utils import Sequence

class DataGenerator(Sequence):
    """
    path_data: the path of the csv files
    """
...

This class reads from a list of .csv files (shown in an image in the original post).

Each file contains a column like this:

0.44
0.45
0.42
0.22
0.05
0.05
0.05
0.05
0.11
0.11
0.05
0.05
0.05
0.05
0.05
0.05

However, these files are very large, and each one holds the data for a single instance.

The problem is that I don't understand how to ingest this data with tfx.v1.components.CsvExampleGen so that I can use it inside the TFX pipeline.

Is it possible to ingest the data using TFX, or should I look for an alternative?
Can I use CsvExampleGen to ingest a set of files from a directory?


There are 2 best solutions below

AudioBubble On 27 December 2022 at 10:52

Data ingestion consists of reading data in its raw format and converting it into a binary format suitable for ML (e.g. TFRecord). TFX provides a standard component called ExampleGen, which is responsible for generating training examples from different data sources.

The tfx.v1.components.CsvExampleGen component takes an input_base argument, which expects an external directory containing the CSV files. You can also customize the input and output train/eval split ratio for ExampleGen.
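As a rough illustration of the point above, the sketch below builds a CsvExampleGen that reads every CSV file in a directory and applies a custom train/eval split. The 4:1 hash-bucket ratio and the function names are assumptions made for illustration, not something taken from the question. Note also that CsvExampleGen is documented to expect a header row shared by all files, while the files in the question appear to be a single headerless column, so the hypothetical helper first_rows (plain standard library) is included to inspect the first row of each file before running the pipeline.

```python
import csv
from pathlib import Path


def first_rows(input_dir: str) -> dict:
    """Map each CSV file name in input_dir to its first row.

    Hypothetical helper: lets you check by eye whether every file
    starts with the same header row before handing the directory to TFX.
    """
    rows = {}
    for path in sorted(Path(input_dir).glob("*.csv")):
        with path.open(newline="") as f:
            # next(..., []) returns an empty list for an empty file.
            rows[path.name] = next(csv.reader(f), [])
    return rows


def build_example_gen(input_dir: str):
    """Return a CsvExampleGen that reads the CSV files under input_dir."""
    # Imported lazily so the helper above can be used without TFX installed.
    from tfx import v1 as tfx

    # Illustrative split: 4 hash buckets for train and 1 for eval,
    # i.e. roughly 80% / 20%. Adjust the ratio as needed.
    output_config = tfx.proto.Output(
        split_config=tfx.proto.SplitConfig(
            splits=[
                tfx.proto.SplitConfig.Split(name="train", hash_buckets=4),
                tfx.proto.SplitConfig.Split(name="eval", hash_buckets=1),
            ]
        )
    )
    return tfx.components.CsvExampleGen(
        input_base=input_dir,
        output_config=output_config,
    )
```

With a hypothetical directory such as "data/", you would call first_rows("data/") to confirm the files share a header, then pass build_example_gen("data/") as the first component of the pipeline. Each CSV row becomes one example, so files where a whole column represents a single instance would likely need to be reshaped first.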

Pritam Dodeja On 07 February 2023 at 05:44

Are you saying you have five features, that their shapes are initially (None, 1), and that you need them to end up as a single higher-dimensional feature of shape (None, 1, 5)? In my mind, this is doable with TFX: you would need to concatenate your data in the Transform component along the right axis after reading it with CsvExampleGen. If you could clarify how DataGenerator gets the data, maybe there is a simpler solution.
