I am trying to run a Google Cloud Batch job using a private Docker image from Docker Hub. The job always fails to pull the image, exiting with code 125, and in the logs I can see:
docker: Error response from daemon: pull access denied for privateorg/image, repository does not exist or may require 'docker login': denied: requested access to the resource is denied.
This is the task spec I'm using, as per the documentation:
{
  "taskSpec": {
    "runnables": [
      {
        "container": {
          "imageUri": "privateorg/image:tag",
          "entrypoint": "/bin/sh",
          "commands": [
            "-c",
            "sleep 1h"
          ],
          "username": "${dockerhub_user}",
          "password": "${dockerhub_pass}"
        }
      }
    ]
  },
  "taskCount": 1,
  "parallelism": 1
}
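One thing that can be ruled out before submitting is a templating problem, where the `${dockerhub_user}` and `${dockerhub_pass}` placeholders reach the API unsubstituted or empty. The following is only a minimal sketch of such a check, assuming the placeholders are filled in by some templating step before the job is created. The function name `render_runnable` and the example values are hypothetical and not part of Batch or any tool I am using.

```python
import json
import re
from string import Template

# The container runnable from the task spec above, with placeholders intact.
RUNNABLE_TEMPLATE = """
{
  "container": {
    "imageUri": "privateorg/image:tag",
    "entrypoint": "/bin/sh",
    "commands": ["-c", "sleep 1h"],
    "username": "${dockerhub_user}",
    "password": "${dockerhub_pass}"
  }
}
"""


def render_runnable(template_text, values):
    """Substitute ${name} placeholders and return the parsed JSON object.

    Hypothetical helper: raises ValueError if a placeholder has no value
    or an empty value, so a bad spec is caught before submission.
    """
    names = set(re.findall(r"\$\{(\w+)\}", template_text))
    missing = sorted(n for n in names if not values.get(n))
    if missing:
        raise ValueError(f"missing or empty placeholders: {missing}")
    # JSON-escape each value so quotes or backslashes in a password
    # do not break the rendered document.
    escaped = {k: json.dumps(v)[1:-1] for k, v in values.items()}
    rendered = Template(template_text).safe_substitute(escaped)
    return json.loads(rendered)
```

Calling `render_runnable(RUNNABLE_TEMPLATE, {"dockerhub_user": "...", "dockerhub_pass": "..."})` and inspecting the returned dictionary shows exactly which `username` and `password` strings would be sent.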
I am not sure whether it is relevant, but the job spec does not include a custom instance template. It only uses an instance policy:
"allocationPolicy": {
  "instances": [
    {
      "installGpuDrivers": true,
      "policy": {
        "machineType": "a2-highgpu-1g",
        "provisioning_model": "STANDARD",
        "accelerators": [
          {
            "type": "nvidia-tesla-a100",
            "count": "1"
          }
        ]
      }
    }
  ],
  "location": {
    "allowedLocations": [
      "zones/me-west1-b",
      "zones/me-west1-c"
    ]
  },
  "serviceAccount": {
    "email": "${batch_service_account_email}"
  }
},
I am aware that I also need to pass options to the container in order to use the GPU, but I have not got that far, since I cannot even run the Docker image.
I have verified that the credentials are correct by creating a job with a script runnable that runs sleep 1h, connecting to the instance, and manually running docker login, docker pull and docker run. Before I ran docker login, there were no Docker credentials on the instance.
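For anyone who wants to reproduce that manual check, it amounts to roughly the sequence below. This is a sketch rather than a transcript of what I typed: the helper names `manual_check_commands` and `run_manual_check` are hypothetical, and it assumes Docker is installed on the VM and that the user running it is allowed to talk to the Docker daemon.

```python
import subprocess


def manual_check_commands(image, username):
    """Return the docker login/pull/run commands as argument lists."""
    return [
        # The password is read from stdin, so it never appears in argv.
        ["docker", "login", "--username", username, "--password-stdin"],
        ["docker", "pull", image],
        # Same entrypoint and command as the container runnable.
        ["docker", "run", "--rm", "--entrypoint", "/bin/sh", image,
         "-c", "sleep 1h"],
    ]


def run_manual_check(image, username, password):
    """Run the three steps in order, stopping at the first failure."""
    login, pull, run = manual_check_commands(image, username)
    subprocess.run(login, input=password, text=True, check=True)
    subprocess.run(pull, check=True)
    subprocess.run(run, check=True)
```

On the VM this would be called as `run_manual_check("privateorg/image:tag", user, password)` with the same credentials that are passed to the job. All three steps succeed when I run them by hand, which is why I believe the credentials themselves are fine.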
I have tried supplying the full image URL, and also using an explicit latest tag.
I have also tried pulling a public image, which works as expected, but even then I cannot see any docker login in the logs.