Can we set task-wise parameters using the Databricks Jobs API "run-now"?


I have a job with multiple tasks, e.g. Task1 -> Task2. I am trying to trigger the job through the Jobs API "run-now" endpoint. The task details are below.

Task1 - executes a notebook with some input parameters

Task2 - executes a notebook with some input parameters

So, how can I provide separate parameters for Task1 and Task2 through the "run-now" call?

I have a parameter "lib" that needs a different value per task: 'pandas' for one and 'spark' for the other.

I know that we can use unique parameter names such as Task1_lib and Task2_lib and read them that way.

Current way: json = {"job_id": 3234234, "notebook_params": {"Task1_lib": "a", "Task2_lib": "b"}}
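The prefixed-name workaround described above can be sketched as a small helper that flattens per-task values into the single notebook_params map that "run-now" accepts. The helper name build_run_now_payload and the "<task>_<name>" key convention are illustrative assumptions, not part of the Jobs API.

```python
def build_run_now_payload(job_id, per_task_params):
    """Flatten {task: {name: value}} into one notebook_params map.

    Keys become "<task>_<name>", e.g. "Task1_lib". Values are converted
    to strings because notebook_params is a string-to-string map.
    """
    notebook_params = {}
    for task, params in per_task_params.items():
        for name, value in params.items():
            notebook_params[f"{task}_{name}"] = str(value)
    return {"job_id": job_id, "notebook_params": notebook_params}


# Job ID taken from the question; the task names are hypothetical prefixes.
payload = build_run_now_payload(
    3234234,
    {"Task1": {"lib": "pandas"}, "Task2": {"lib": "spark"}},
)
print(payload)
```

The resulting dictionary would be sent as the JSON body of the "run-now" request, and each notebook would then read its own key, for example with dbutils.widgets.get("Task1_lib").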

Is there a way to send task-wise parameters?


There are 2 best solutions below


It's not supported right now: parameters are defined at the job level. If you have a Databricks representative, you can ask them to pass this request on to the product team that works on Databricks Workflows.
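Since the parameters are shared at the job level, each task's notebook has to pick out its own value. A minimal sketch of that lookup, assuming the "<task>_<name>" naming from the question and assuming the lookup function raises when a key was not passed; resolve_task_param is a hypothetical helper, not a Databricks function.

```python
def resolve_task_param(get_widget, task_key, name, default=None):
    """Return the value of "<task_key>_<name>", else plain "<name>", else default.

    get_widget is any callable that returns the value for a key and raises
    if the key is missing.
    """
    for key in (f"{task_key}_{name}", name):
        try:
            return get_widget(key)
        except Exception:  # lookup failed: this key was not supplied
            continue
    return default


# Offline stand-in for the parameters a run would receive.
job_params = {"Task1_lib": "pandas", "Task2_lib": "spark"}
print(resolve_task_param(job_params.__getitem__, "Task1", "lib"))
print(resolve_task_param(job_params.__getitem__, "Task2", "lib"))
```

Inside a notebook, dbutils.widgets.get could be passed as get_widget in place of the dictionary lookup used here.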


A bit of a hack, but I managed to do it by first updating the tasks using the jobs update API and then running the job.

Here is a rough template of how to do it in Python:

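import requests

# Placeholders: replace with your own values before running.
token = "<personal-access-token>"
workspace = "https://<workspace-host>"  # no trailing slash
job_id = 3234234  # placeholder: numeric ID of the job to update
headers = {"Authorization": f"Bearer {token}"}
url = f"{workspace}/api/2.0/jobs/update"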

# One entry per task; each carries that task's own base_parameters.
tasks = [{
    "task_key": "task_1",
    "notebook_task": {
        "notebook_path": "/path/to/notebook",
        "base_parameters": {
            "param_1": "new param here",
            "param_2": "new param here",
        },
        "source": "WORKSPACE",
    },
    "job_cluster_key": "Job_cluster",
    "timeout_seconds": 0,
},
{
    "task_key": "task_2",
    "notebook_task": {
        "notebook_path": "/path/to/notebook",
        "base_parameters": {
            "param_1": "new param here",
            "param_2": "new param here",
        },
        "source": "WORKSPACE",
    },
    "job_cluster_key": "Job_cluster",
    "timeout_seconds": 0,
}]


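# "new_settings" holds the job fields to overwrite, here the task list.
payload = {
    "job_id": job_id,
    "new_settings": {"tasks": tasks},
}

resp = requests.post(url=url, headers=headers, json=payload)
resp.raise_for_status()  # fail loudly if the update was rejected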
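The template above only performs the update step. A minimal sketch of the full update-then-run sequence might look like the following, assuming the same 2.0 endpoint paths as the template and that the "run-now" response carries a run_id field. The function names and the injectable post argument are hypothetical, added so the sequence can be exercised without a workspace.

```python
def build_update_payload(job_id, tasks):
    """Body for jobs/update: overwrite the job's task definitions."""
    return {"job_id": job_id, "new_settings": {"tasks": tasks}}


def update_then_run(workspace, token, job_id, tasks, post=None, timeout=30):
    """Rewrite the tasks' parameters, then trigger a run of the job.

    Returns the run_id reported by run-now. `post` defaults to
    requests.post; a stand-in can be passed for offline testing.
    """
    if post is None:
        import requests  # imported lazily so the helper can be tested offline
        post = requests.post

    headers = {"Authorization": f"Bearer {token}"}

    # Step 1: send each task definition in full with its new base_parameters.
    update = post(
        f"{workspace}/api/2.0/jobs/update",
        headers=headers,
        json=build_update_payload(job_id, tasks),
        timeout=timeout,
    )
    update.raise_for_status()

    # Step 2: start a run, which picks up the job's updated settings.
    run = post(
        f"{workspace}/api/2.0/jobs/run-now",
        headers=headers,
        json={"job_id": job_id},
        timeout=timeout,
    )
    run.raise_for_status()
    return run.json()["run_id"]
```

Because the update changes the saved job definition rather than a single run, two callers using this approach on the same job at the same time could overwrite each other's parameters, so it is probably best suited to jobs that are triggered one run at a time.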