Integrating a custom PyTorch backend with Triton + AWS SageMaker


I have a custom Python backend that works well with AWS SageMaker MMS (Multi Model Server) using an S3 model repository. I want to adapt this backend to work with the Triton Python backend. I have an example Dockerfile that runs the Triton server with my requirements.

I also have a model_handler.py file that is based on this example, but I do not understand where to place this file to test its functionality. With classic SageMaker and MMS, for example, I would import the handler in the dockerd-entrypoint.

However, with Triton I do not understand where this file should be imported. I understand I can use PyTriton, but I cannot find any documentation I can make sense of. Can someone point me in the right direction, please?


There is 1 best solution below


For Triton, the custom inference script is expected in the form of a model.py file placed inside a version directory of your model repository. This model.py implements an initialize method (model loading), an execute method, and a finalize method, and it is where you put your pre/post-processing logic. For the Python backend you can install any additional dependencies by packaging a conda environment with conda-pack, and in your config.pbtxt you point to the environment you have defined and created. Example: https://github.com/aws/amazon-sagemaker-examples/tree/main/inference/nlp/realtime/triton/single-model/t5_pytorch_python-backend
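To make this concrete, here is a minimal, hypothetical layout (the names my_model and my_env.tar.gz are placeholders, not taken from the example repo). Triton is pointed at the model repository directory, and model.py lives inside a numbered version directory of the model:

    model_repository/
    └── my_model/                 # hypothetical model name
        ├── config.pbtxt          # model configuration
        ├── my_env.tar.gz         # optional conda-pack'd environment
        └── 1/                    # version directory
            └── model.py          # custom inference script (replaces model_handler.py)

A sketch of what model.py looks like with the Python backend; the tensor names (INPUT__0, OUTPUT__0) and the pass-through logic are placeholders you would replace with your own pre/post-processing and model call:

    import json
    import numpy as np
    import triton_python_backend_utils as pb_utils

    class TritonPythonModel:
        def initialize(self, args):
            # Called once when the model is loaded.
            # args["model_config"] is the config.pbtxt content as a JSON string.
            self.model_config = json.loads(args["model_config"])
            # Load your PyTorch model / weights here.

        def execute(self, requests):
            # Called for every batch of inference requests.
            responses = []
            for request in requests:
                input_tensor = pb_utils.get_input_tensor_by_name(request, "INPUT__0")
                input_array = input_tensor.as_numpy()
                # Pre-processing, model inference, post-processing go here.
                output_array = input_array.astype(np.float32)  # placeholder pass-through
                output_tensor = pb_utils.Tensor("OUTPUT__0", output_array)
                responses.append(pb_utils.InferenceResponse(output_tensors=[output_tensor]))
            return responses

        def finalize(self):
            # Called when the model is unloaded; release resources here.
            pass

And a corresponding config.pbtxt sketch; the tensor names, shapes, and data types are assumptions you would adapt to your model, while EXECUTION_ENV_PATH is the Python-backend parameter that points at the conda-pack tarball:

    name: "my_model"
    backend: "python"
    max_batch_size: 8
    input [
      {
        name: "INPUT__0"
        data_type: TYPE_FP32
        dims: [ -1 ]
      }
    ]
    output [
      {
        name: "OUTPUT__0"
        data_type: TYPE_FP32
        dims: [ -1 ]
      }
    ]
    parameters: {
      key: "EXECUTION_ENV_PATH",
      value: {string_value: "$$TRITON_MODEL_DIRECTORY/my_env.tar.gz"}
    }

With this layout there is nothing to import yourself: Triton discovers model.py by convention when it loads the model from the repository, which is the part that differs from the MMS dockerd-entrypoint approach.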