I'm very new to Google Cloud Platform. I'm trying to schedule a job with Cloud Scheduler that runs a script called "pipleline.py" from my Dataflow console.
I'm unable to understand the link, or to work out what the URL should be when creating a Cloud Scheduler job. Please help me figure out how to go about it.
There is an excellent example here of scheduling Dataflow jobs with Cloud Scheduler. It uses Terraform to create the Cloud Scheduler resource, as you can see here:
If you are not familiar with Terraform, you could just use the gcloud SDK to accomplish the same thing:
The dataflow_message_body.json file contains JSON similar to:
And if you want to do this in the console, just go to your project and create a new Cloud Scheduler job, which has the same fields as described above.
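To make the URL question concrete, here is a minimal Python sketch that builds the target URL and the message body for the Dataflow `templates:launch` REST method. The project ID, region, bucket paths, and job name are placeholders I made up for illustration, not values from the example above, and your template may need its own entries under `parameters`. The Cloud Scheduler job would typically send this as a POST request authenticated with a service account's OAuth token.

```python
import json
from urllib.parse import urlencode


def build_launch_request(project, region, template_path, job_name,
                         parameters, temp_location):
    """Return (url, body) for a Dataflow templates:launch call."""
    base = (
        "https://dataflow.googleapis.com/v1b3/"
        f"projects/{project}/locations/{region}/templates:launch"
    )
    # The template location is passed as the gcsPath query parameter.
    url = f"{base}?{urlencode({'gcsPath': template_path})}"
    body = {
        "jobName": job_name,
        "parameters": parameters,  # template-specific parameters
        "environment": {"tempLocation": temp_location},
    }
    return url, body


# All values below are hypothetical placeholders.
url, body = build_launch_request(
    project="my-project",
    region="us-central1",
    template_path="gs://my-bucket/templates/pipeline",
    job_name="scheduled-pipeline",
    parameters={},
    temp_location="gs://my-bucket/temp",
)

print(url)                         # use as the Cloud Scheduler target URL
print(json.dumps(body, indent=2))  # save as dataflow_message_body.json
```

The printed URL goes into the URL field of the scheduler job, and the printed JSON is what a file like dataflow_message_body.json would hold.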
If you want to know what Google-provided templates look like, you could take a look here. If you want to know how to create your own templates and what format they should have, take a look here. When starting a Dataflow job, you always refer to a location in a bucket, whether it is the Google-provided gs://dataflow-templates bucket or your own bucket.
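Since the template reference is always a bucket plus an object path, a small check like the following can catch a malformed path before it goes into the scheduler job. This is only a sketch: the helper name `split_gcs_path` is hypothetical, and the custom bucket path is a made-up example.

```python
from urllib.parse import urlparse


def split_gcs_path(gcs_path):
    """Split a gs:// path into (bucket, object_name)."""
    parsed = urlparse(gcs_path)
    if parsed.scheme != "gs" or not parsed.netloc:
        raise ValueError(f"not a gs:// path: {gcs_path!r}")
    return parsed.netloc, parsed.path.lstrip("/")


# A Google-provided template and a template in your own (hypothetical) bucket.
print(split_gcs_path("gs://dataflow-templates/latest/Word_Count"))
print(split_gcs_path("gs://my-bucket/templates/pipeline"))
```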