What would be best practice for placing pre-processing and augmentation of images in a TFX pipeline?

Asked by physicist on 15 May 2025 at 18:40

I have a semantic segmentation deep learning model which I want to deploy on Kubeflow using TFX.

As I move the standalone DL code into TFX components, I have a few questions:

The input images and masks will be stored in a TFRecord file. Would it be good practice to do the pre-processing (cropping, resizing, combining mask and image into the ground truth) before the TFX pipeline starts, i.e. before ExampleGen?
Alternatively, would it be better practice to store the raw images and masks in the TFRecord file and then do the pre-processing in the Transform component of TFX?
I also have some code for data augmentation during training. Would it be better to apply the augmentations in the Trainer component or in the Transform component of TFX?
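
To make the distinction concrete, here is a minimal sketch in plain TensorFlow of the two kinds of steps being asked about: a deterministic pre-processing function and a random augmentation function that treats image and mask together. It is not TFX component code and not the asker's actual code; the function names `preprocess` and `augment`, the target size, and the input shapes are all assumptions made for illustration.

```python
import tensorflow as tf

TARGET_SIZE = (128, 128)  # hypothetical target (height, width)


def preprocess(image, mask):
    """Deterministic step: the same input always yields the same output."""
    # Bilinear resize for the image, then scale pixel values to [0, 1].
    image = tf.image.resize(tf.cast(image, tf.float32), TARGET_SIZE) / 255.0
    # Nearest-neighbour resize keeps mask values as valid class ids.
    mask = tf.image.resize(mask, TARGET_SIZE, method="nearest")
    return image, mask


def augment(image, mask):
    """Random step: flip image and mask together so they stay aligned."""
    flip = tf.random.uniform([]) > 0.5
    image = tf.cond(flip, lambda: tf.image.flip_left_right(image), lambda: image)
    mask = tf.cond(flip, lambda: tf.image.flip_left_right(mask), lambda: mask)
    return image, mask


# Hypothetical raw example: an RGB image and a single-channel mask.
raw_image = tf.zeros([200, 300, 3], dtype=tf.uint8)
raw_mask = tf.zeros([200, 300, 1], dtype=tf.uint8)

dataset = (
    tf.data.Dataset.from_tensors((raw_image, raw_mask))
    .map(preprocess)  # deterministic, could be computed once
    .map(augment)     # random, re-drawn on every pass over the data
)
```

The practical difference between the two functions is that the output of `preprocess` can be computed once and stored, while `augment` is meant to produce a different result on each pass over the data, which is worth keeping in mind when deciding which component each one belongs in.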

I would highly appreciate any pro tips or cautions to look out for in general!
