Starting a Triton Inference Server Docker container on a Kubernetes cluster

Description: Trying to deploy the Triton Docker image as a container on a Kubernetes cluster.

Triton Information: What version of Triton are you using? -> 22.10

Are you using the Triton container or did you build it yourself? I used the server repo with the following command:

python3 compose.py --backend onnxruntime --backend python --backend tensorflow2 --repoagent checksum --container-version 22.10

Then I created a new Triton image with the following Dockerfile:

FROM tritonserver:latest
RUN apt install python3-pip -y
RUN pip install tensorflow==2.7.0
RUN pip install transformers==2.11.0
RUN pip install tritonclient
RUN pip install tritonclient[all]

and the Dockerfile is built with the following command:

docker build -t customtritonimage -f ./DockerFiles/DockerFile  .

To Reproduce: Directory structure: the parent directory tritonnludeployment contains DockerFiles (a folder holding the Docker files), k8_trial.yaml, and model_repo_triton (all the models live here, in the Triton-supported repository layout with the required files).
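
Roughly, the layout looks like this (the model subfolder names and files below are placeholders, not taken from the question):

tritonnludeployment/
├── DockerFiles/
│   └── DockerFile
├── k8_trial.yaml
└── model_repo_triton/
    ├── <model_name>/
    │   ├── config.pbtxt
    │   └── 1/
    │       └── <model file>
    └── ...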

I am using this 'k8_trial.yaml' file to start the kubectl deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
    name: flower
    labels:
      app: flower
spec:
    replicas: 3
    selector:
      matchLabels:
        app: flower
    template:
      metadata:
        labels:
          app: flower
      spec:
        volumes:
        - name: models
          hostPath:
            # server: 216.48.183.17
            path: /root/Documents/tritonnludeployment
            # readOnly: false
            type: Directory
        containers:
          - name: flower
            ports:
            - containerPort: 8000
              name: http-triton
            - containerPort: 8001
              name: grpc-triton
            - containerPort: 8002
              name: metrics-triton
            image: "customtritonimage:latest"
            imagePullPolicy: Never
            volumeMounts:
            - mountPath: /root/Documents/tritonnludeployment
              name: models
            command: ["/bin/sh", "-c"]
            args: ["cd /models /opt/tritonserver/bin/tritonserver --model-repository=/models/model_repo_triton --allow-gpu-metrics=false --strict-model-config=false"]
            # resources:
            #   requests:
            #     memory: "500Mi"
            #     cpu: "500Mi"
            #   limits:
            #     memory: "900Mi"
            #     cpu: "900Mi"
            #     nvidia.com/gpu: 1

Describe the models (framework, inputs, outputs), ideally include the model configuration file (if using an ensemble include the model configuration file for that as well).

Expected behavior: The kubectl deployment should start, with the Triton container running in the pods.

Which step am I doing wrong?

There is 1 answer below.


And what is the error message you are getting? Some of the issues I noticed:

  • use the expected file name known to Docker, i.e. Dockerfile, not DockerFile
  • make sure the base image exists (tritonserver:latest does not; you probably want one of the official nvcr.io/nvidia/tritonserver images)
  • update the package sources first (RUN apt install ... -> RUN apt update && apt install ...)
  • reduce the number of layers by installing multiple Python packages in a single RUN
  • tritonclient[all] already includes tritonclient
  • don't run containers as root (tritonserver does not require it anyway)
  • make sure the image can be pulled the first time (imagePullPolicy: Never -> IfNotPresent)
  • remove the extra and unnecessary commands from args (such as cd /models)
  • tritonserver can load all subfolders of the repository, so --model-repository=/models is probably better
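
Putting the Dockerfile points together, a minimal sketch could look like the following. The nvcr.io/nvidia/tritonserver:22.10-py3 tag is an assumption (the official 22.10 image from NGC); if you keep the image produced by compose.py, put its tag in FROM instead:

# Base image: official Triton 22.10 image from NGC, or the image built by compose.py
FROM nvcr.io/nvidia/tritonserver:22.10-py3

# Refresh the package index and install everything in a single layer;
# tritonclient[all] already pulls in tritonclient, so it is listed only once
RUN apt update && apt install -y python3-pip && \
    pip install tensorflow==2.7.0 transformers==2.11.0 "tritonclient[all]"

And a sketch of the container section of the Deployment with the args cleaned up. The /models mountPath is an assumption chosen so that the volume mount and the --model-repository flag agree (in the question the volume is mounted at /root/Documents/tritonnludeployment, so /models/model_repo_triton would not exist inside the container):

        containers:
          - name: flower
            image: "customtritonimage:latest"
            imagePullPolicy: IfNotPresent
            ports:
            - containerPort: 8000
              name: http-triton
            - containerPort: 8001
              name: grpc-triton
            - containerPort: 8002
              name: metrics-triton
            volumeMounts:
            # mount the host directory at /models so the repository path below resolves
            - mountPath: /models
              name: models
            command: ["/opt/tritonserver/bin/tritonserver"]
            args: ["--model-repository=/models/model_repo_triton",
                   "--allow-gpu-metrics=false",
                   "--strict-model-config=false"]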