We need to perform post-processing on all pipeline runs including evaluating the output of each cell. The intent is to audit the cell outputs directly - not to generate emails or other new artifacts.
A primary use case is to allow both automated and manual in-depth troubleshooting based on the notebook cell contents including examining many individual log entries as well as stacktraces.
Since the notebook could fail in any one of a number of cells, we are looking for an API that allows access to the cell-level structure of the notebook output.
There is a related question here, but it deals with generating an email to send specific results: Easily access notebook output run in Synapse pipeline. Instead, we want the entire notebook output organized by sequential cell number via the Azure API. How can this be done?
Update: A promising lead is the Azure REST API, specifically the output of Activity Runs - Query By Pipeline Run. I am working through how to set up the REST URL to try it out.
You can check the details of your scheduled pipeline's notebook run in the "Pipeline Runs" section under the "Monitor" tab. Click the option that says 'Open Notebook snapshot', which is represented by an icon resembling spectacles.
I have debugged the pipeline and also I have triggered the pipeline:
I have set up a notebook and linked it to the scheduled trigger, allowing me to see the results of each cell in the monitor tab. To view the output, navigate to the "Monitor" application option under the pipeline run, choose the "Logs" tab, and examine the output in the "stdout" menu. The below is the output for each individual cell.
Once you click on the spectacles icon, the notebook snapshot opens and shows the output of each cell.
print() function. For additional information, see the known limitations of Azure Synapse Analytics. As you mentioned, you want to access the cell-level structure of a notebook's output programmatically using the Azure REST API. The steps are as follows:
Step 1: Obtain the pipeline run ID.
Step 2: Create the HTTP GET request.
Step 3: Send a GET request to the REST URL to retrieve the activity runs for the pipeline run.
You can then extract the cell-level output by iterating over the value array and accessing the output field of each activity run.
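The last step, iterating over the value array, can be sketched roughly as follows. This is only an illustration and not a tested end-to-end solution: it assumes the response body has already been fetched and parsed into a Python dict, and that each activity run carries the activityName, status, output and error fields. The helper name extract_activity_outputs and the sample payload are made up for this example; check the exact request and response shapes in the REST API reference for your service.

```python
import json
from typing import Any, Dict, List


def extract_activity_outputs(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Collect the name, status, output and error of every activity run.

    `response` is assumed to be the parsed JSON body of the activity-runs
    query, i.e. a dict with a top-level "value" array (hypothetical helper).
    """
    results = []
    for run in response.get("value", []):
        results.append(
            {
                "activityName": run.get("activityName"),
                "status": run.get("status"),
                "output": run.get("output"),  # may be None if the run produced none
                "error": run.get("error"),
            }
        )
    return results


# Invented sample payload, for illustration only; real responses contain
# more fields and the content of "output" depends on the activity type.
sample_response = {
    "value": [
        {
            "activityName": "Notebook1",
            "status": "Succeeded",
            "output": {"note": "sample output"},
        },
        {
            "activityName": "Notebook2",
            "status": "Failed",
            "error": {"message": "sample error"},
        },
    ]
}

for item in extract_activity_outputs(sample_response):
    print(item["activityName"], item["status"])
    print(json.dumps(item["output"], indent=2))
```

Keeping the parsing separate from the HTTP call makes it easy to run the same function against a saved response file while you are still working out the REST URL and authentication.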