AWS SageMaker: pass arguments to NotebookJobStep from a Pipeline using the boto3 start_pipeline_execution function


I am trying to programmatically trigger a SageMaker notebook from my local machine using SageMaker resources. I came across NotebookJobStep, one of the SageMaker Pipelines step types. I have successfully created a pipeline and am able to run executions of it, each of which runs an .ipynb notebook. The problem now is that I need to pass arguments from the boto3 start_pipeline_execution function into the notebook. As far as I know, the only way to pass arguments to a notebook in a NotebookJobStep is the parameters attribute, which is set when the NotebookJobStep and the Pipeline are created. For my use case I need to pass arguments from the start_pipeline_execution call through to the NotebookJobStep, so that every pipeline execution can be fed different parameters. Is there any way to do this?

What I have done:

  1. Created a SageMaker pipeline with a NotebookJobStep step
  2. Triggered the SageMaker pipeline from my local machine using the boto3 start_pipeline_execution function

What I want to do:

  1. Create a SageMaker pipeline with a NotebookJobStep step
  2. Trigger the SageMaker pipeline from my local machine using the boto3 start_pipeline_execution function, passing arguments through this function so that the notebook can use them, without having to recreate the pipeline

There is 1 solution below

Ram Vegiraju

At the moment, the simplest way to pass in any inputs is via Pipeline Parameters, which you can use at either the step level or the pipeline level. Values for pipeline-level parameters can be overridden on each call to start_pipeline_execution through its PipelineParameters argument, so I am not sure why you would have to recreate the pipeline as you describe. My recommendation would be to keep the inputs you need in an S3 location and define a parameter for that location, which you can inject at the appropriate level. The same S3 path can also store outputs that must be accessible to the next step, for instance. An example utilizing parameters can be found here: https://aws.amazon.com/blogs/machine-learning/schedule-amazon-sagemaker-notebook-jobs-and-manage-multi-step-notebook-workflows-using-apis/
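To make the parameter route concrete, here is a minimal sketch, not a definitive implementation. It assumes the NotebookJobStep `parameters` dict accepts a pipeline parameter as a value (the SDK's type hints suggest it does, but verify against your SDK version). The parameter name `InputS3Uri`, the notebook key `input_s3_uri`, the pipeline and step names, the notebook filename and the bucket are all hypothetical placeholders.

```python
from typing import Any, Dict, List, Mapping


def to_pipeline_parameters(values: Mapping[str, Any]) -> List[Dict[str, str]]:
    """Convert a plain dict into the Name/Value list that boto3 expects."""
    # start_pipeline_execution takes every parameter value as a string.
    return [{"Name": name, "Value": str(value)} for name, value in values.items()]


def start_execution(client: Any, pipeline_name: str, values: Mapping[str, Any]) -> str:
    """Start one execution with per-run values and return its ARN."""
    response = client.start_pipeline_execution(
        PipelineName=pipeline_name,
        PipelineParameters=to_pipeline_parameters(values),
    )
    return response["PipelineExecutionArn"]


def build_pipeline(role_arn: str, image_uri: str):
    """Define the pipeline once. Needs the `sagemaker` SDK installed."""
    from sagemaker.workflow.notebook_job_step import NotebookJobStep
    from sagemaker.workflow.parameters import ParameterString
    from sagemaker.workflow.pipeline import Pipeline

    # Hypothetical pipeline-level parameter; the default applies when an
    # execution does not override it.
    input_uri = ParameterString(
        name="InputS3Uri", default_value="s3://example-bucket/default/"
    )
    step = NotebookJobStep(
        name="run-notebook",                 # hypothetical step name
        input_notebook="my_notebook.ipynb",  # hypothetical notebook file
        image_uri=image_uri,
        kernel_name="python3",
        role=role_arn,
        # Assumption: a pipeline parameter is allowed as a value here, so the
        # notebook sees whatever value this particular execution was given.
        parameters={"input_s3_uri": input_uri},
    )
    return Pipeline(name="notebook-pipeline", parameters=[input_uri], steps=[step])


# Usage (requires AWS credentials, so left as comments):
#   import boto3
#   build_pipeline(role_arn, image_uri).upsert(role_arn=role_arn)  # run once
#   client = boto3.client("sagemaker")
#   start_execution(client, "notebook-pipeline",
#                   {"InputS3Uri": "s3://example-bucket/run-42/"})
```

With this layout the pipeline is created or updated once, and each later call to `start_execution` only changes the parameter values for that run.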

In case you want simple Python function chaining via a Pipeline, I would use the SageMaker Pipelines step decorator. Here you write vanilla Python functions and apply the step decorator to them, which makes it simpler to pass inputs and outputs in the traditional Python manner without needing Pipeline Parameters (you can still use them with this route as well): https://docs.aws.amazon.com/sagemaker/latest/dg/pipelines-step-decorator.html