I have a Kubeflow pipeline with three components running in my Google Cloud Platform account.
I build a Docker image for each component with Cloud Build and push the images to Container Registry. However, when I try to run the pipeline, the Kubeflow interface shows this message:
This step is in Pending state with this message: ImagePullBackOff: Back-off pulling image "gcr.io"
Followed by this error:
This step is in Pending state with this message: ErrImagePull: rpc error: code = Unknown desc = Error response from daemon: pull access denied for gcr.io, repository does not exist or may require 'docker login': denied: requested access to the resource is denied
This is the Cloud Build configuration for the step that runs the pipeline:
# Run Pipeline
- name: 'gcr.io/<PROJECT_ID>/kfp-util:latest'
  entrypoint: 'python3'
  args: ['pipelines/classification-pipeline/pipeline/pipeline.py',
         '--conf_file', 'configurations/classification_pipeline_config.yml',
         '--componentstore', 'pipelines/classification-pipeline/components',
         '--operation', 'run_pipeline']
  dir: 'implementation'
  id: 'Run Pipeline'
  waitFor: ['Create or update Pipeline']
  env:
  - 'KFP_CLIENT_HOST=${_KFP_CLIENT_HOST}'
  - 'KFP_PIPELINE_NAME=${_KFP_PIPELINE_NAME}'
  - 'PROJECT_ID=${_PROJECT_ID}'
images:
- 'gcr.io/${_PROJECT_ID}/preprocess_data:${_TAG}'
- 'gcr.io/${_PROJECT_ID}/train_model:${_TAG}'
- 'gcr.io/${_PROJECT_ID}/test_model:${_TAG}'
This is the build.py file:
import subprocess
import docker
import argparse
import yaml
# Get arguments
parser = argparse.ArgumentParser()
parser.add_argument("--config", help="Build Configuration Path")
args = parser.parse_args()
with open(args.config, "r") as file:
    config = yaml.load(file, Loader=yaml.FullLoader)
client = docker.from_env()
# Create the kfp-util docker container image
project = config['build']['project_id']
client.images.build(tag=str(f"gcr.io/{project}/kfp-util:latest"), path=".")
subprocess.run(f"gcloud docker -- push gcr.io/{project}/kfp-util:latest", shell=True, check=True)
# Set substitutions
substitutions = (
    f"_PROJECT_ID={project}"
)
# Submit the build job
_cmd = f"gcloud builds submit --no-source --config {config['build']['cloudbuild']} --substitutions {substitutions}"
subprocess.run(_cmd, shell=True, check=True)
Any idea why this error is happening?
Thanks in advance!
The problem was the image specified in the component.yaml of each pipeline component.
Instead of the full image path (gcr.io/<project_ID>/preprocess_data:<_tag>), I had only "gcr.io", so Kubernetes tried to pull an image literally named "gcr.io" and was denied.
After I corrected the image path in each component.yaml, the pipeline ran as expected.
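To catch this kind of mistake before submitting a run, a small check along the following lines might help. It is only a sketch: the helper names (is_full_image_ref, component_image) and the regular expression are hypothetical, the check assumes the image lives under implementation.container.image as in a container-based component.yaml, and it deliberately requires an explicit tag or digest, which is stricter than Docker itself. The spec argument is assumed to be the dictionary obtained by loading component.yaml with a YAML parser.

```python
import re

# Hypothetical pattern: registry host, at least two path segments
# (project and image name), then an explicit tag or a sha256 digest.
_FULL_IMAGE_REF = re.compile(
    r"^[a-z0-9.-]+(?::\d+)?"                # registry host, e.g. gcr.io
    r"(?:/[a-z0-9._-]+){2,}"                # /project/image_name
    r"(?::[\w.-]+|@sha256:[0-9a-f]{64})$"   # :tag or @sha256:digest
)


def is_full_image_ref(image: str) -> bool:
    """Return True if image looks like registry/project/name:tag."""
    return bool(_FULL_IMAGE_REF.match(image))


def component_image(spec: dict) -> str:
    """Read the image from an already-parsed component.yaml dictionary."""
    return spec["implementation"]["container"]["image"]


if __name__ == "__main__":
    # Example specs; the project ID and tag are made up for illustration.
    broken = {"implementation": {"container": {"image": "gcr.io"}}}
    fixed = {
        "implementation": {
            "container": {"image": "gcr.io/my-project/preprocess_data:v1"}
        }
    }
    for name, spec in (("broken", broken), ("fixed", fixed)):
        image = component_image(spec)
        status = "ok" if is_full_image_ref(image) else "incomplete image reference"
        print(f"{name}: {image!r} -> {status}")
```

Running a check like this over every component.yaml in the component store, before the 'Run Pipeline' step, would flag an image value of "gcr.io" at build time instead of surfacing later as ImagePullBackOff in the Kubeflow interface.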