Set the name for each ParallelFor iteration in KFP v2 on Vertex AI

I am currently using kfp.dsl.ParallelFor to train 300 models. It looks something like this:

...

models_to_train_op = get_models()

with dsl.ParallelFor(models_to_train_op.outputs["data"], parallelism=100) as item:
  prepare_data_op = prepare_data(item)
  train_model_op = train_model(prepare_data_op.outputs["train_data"])

...

Currently, the iterations in Vertex AI are labeled in a dropdown as for-loop-worker-0, for-loop-worker-1, and so on. For tasks (like prepare_data_op), there's a method called set_display_name. Is there a similar method for setting the name of each iteration? It would help to relate the iterations to their training data, so that the dropdown in the Vertex AI UI is easier to look through.

There is 1 best solution below

I reached out to a contact I have at Google. They suggested passing the same list that is given to ParallelFor to set_display_name, so that each iteration of the loop gets its own name. When the pipeline is compiled, it should set the name for the corresponding iteration.

# Create component that returns a range list
model_list_op = model_list(n_models)

# Parallelize jobs
with dsl.ParallelFor(model_list_op.outputs["model_list"], parallelism=100) as x:
  x.set_display_name(str(model_list_op.outputs["model_list"]))
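Separately from how the dropdown ends up being labeled, it can help if every loop item carries a readable label of its own, so that an iteration can be traced back to its training data from its inputs. Here is a minimal sketch of that idea in plain Python, with no KFP calls. The helper names (make_display_name, build_loop_items) and the "model_id" / "display_name" keys are hypothetical, and nothing above confirms that Vertex AI will show such a label in the iteration dropdown.

```python
import re


def make_display_name(model_id: str, prefix: str = "train") -> str:
    """Turn an arbitrary model identifier into a short, readable label.

    Hypothetical helper: lowercases the id and collapses every run of
    characters other than a-z and 0-9 into a single hyphen.
    """
    slug = re.sub(r"[^a-z0-9]+", "-", model_id.lower()).strip("-")
    return f"{prefix}-{slug}"


def build_loop_items(model_ids):
    """Build one dict per model so each loop item carries its own label.

    Assumption: a list shaped like this is what the component feeding
    ParallelFor would return instead of a bare range or list of ids.
    """
    return [
        {"model_id": model_id, "display_name": make_display_name(model_id)}
        for model_id in model_ids
    ]


items = build_loop_items(["US West/Store 12", "eu_north.store_7"])
for item in items:
    print(item["display_name"])
```

A component like model_list could return a list shaped like this, and the tasks inside the loop would then receive the label together with the model id.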