Is it possible to use another model within Nvidia Triton Inference Server model repository with a custom Python model?


I want to use a model in my Triton Inference Server model repository from another custom Python model that I have in the same repository. Is that possible? If so, how can I do it?

I guess it could be done by building a custom Python backend stub, but I was wondering if there is a simpler way.


Yes.

You can construct a pb_utils.InferenceRequest and call its exec() method to invoke another model in the model repository. This is Triton's Business Logic Scripting (BLS) feature of the Python backend.

Here is a code snippet:

# Build a BLS request targeting another model in the model repository.
inference_request = pb_utils.InferenceRequest(
    model_name='model_name',
    requested_output_names=['output0', 'output1'],
    inputs=[pb_utils.Tensor('input0', input0.astype(np.float32))]
)
# Execute the request synchronously and read the outputs from the response.
inference_response = inference_request.exec()
output0 = pb_utils.get_output_tensor_by_name(inference_response, 'output0')
output1 = pb_utils.get_output_tensor_by_name(inference_response, 'output1')
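
In practice you will usually also want to check the response for errors before using the outputs, and convert the returned tensors to NumPy arrays. A small sketch of that, using the same pb_utils BLS API as above:

inference_response = inference_request.exec()

# The response carries any error raised by the target model;
# check it before touching the output tensors.
if inference_response.has_error():
    raise pb_utils.TritonModelException(inference_response.error().message())

# Output tensors can be converted to NumPy arrays for further processing.
output0 = pb_utils.get_output_tensor_by_name(inference_response, 'output0').as_numpy()
output1 = pb_utils.get_output_tensor_by_name(inference_response, 'output1').as_numpy()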

Here is a relatively complete example.

import numpy as np
import triton_python_backend_utils as pb_utils
import utils


class facenet(object):
    def __init__(self):
        self.Facenet_inputs = ['input_1']
        self.Facenet_outputs = ['Bottleneck_BatchNorm']

    def calc_128_vec(self, img):
        # Preprocess the image and send it to the 'facenet' model via BLS.
        face_img = utils.pre_process(img)
        inference_request = pb_utils.InferenceRequest(
            model_name='facenet',
            requested_output_names=[self.Facenet_outputs[0]],
            inputs=[pb_utils.Tensor(self.Facenet_inputs[0], face_img.astype(np.float32))]
        )
        inference_response = inference_request.exec()
        # Convert the output tensor to NumPy, then L2-normalize the 128-d embedding.
        pre = utils.pb_tensor_to_numpy(pb_utils.get_output_tensor_by_name(inference_response, self.Facenet_outputs[0]))
        pre = utils.l2_normalize(np.concatenate(pre))
        pre = np.reshape(pre, [128])

        return pre
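
For context, here is a minimal sketch of how such a helper could be called from the execute() method of the custom Python model itself. The module name, tensor names ('IMAGE', 'EMBEDDING'), and model wiring are hypothetical placeholders, not part of the example above:

import numpy as np
import triton_python_backend_utils as pb_utils

from facenet_helper import facenet  # hypothetical module holding the class above


class TritonPythonModel:
    def initialize(self, args):
        self.facenet = facenet()

    def execute(self, requests):
        responses = []
        for request in requests:
            # Read the image tensor sent to this Python model.
            img = pb_utils.get_input_tensor_by_name(request, 'IMAGE').as_numpy()
            # Call the other model in the repository through the helper.
            embedding = self.facenet.calc_128_vec(img)
            out_tensor = pb_utils.Tensor('EMBEDDING', embedding.astype(np.float32))
            responses.append(pb_utils.InferenceResponse(output_tensors=[out_tensor]))
        return responses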

You can find more details in the Business Logic Scripting section of the Python backend documentation: https://github.com/triton-inference-server/python_backend#business-logic-scripting-beta