TensorFlow and libcudart library issue in AWS SageMaker endpoint creation


I am trying to deploy an NLP model to an AWS SageMaker endpoint. The model image is stored in ECR and pushed to the SageMaker instance. The Docker image builds without errors.

When I create the endpoint by following the documentation (https://docs.aws.amazon.com/sagemaker/latest/dg/realtime-endpoints-deploy-models.html, boto3 inference components), it throws the error message below:

tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory

tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
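
The first log line reports that the dynamic loader could not open `libcudart.so.11.0`. A minimal sketch for reproducing that check outside TensorFlow, for example in a shell inside the container, is below. It only diagnoses whether the loader can find the library on its search path (including any directories in `LD_LIBRARY_PATH`); it is not a fix, and the helper name `can_load_shared_library` is hypothetical.

```python
import ctypes


def can_load_shared_library(name: str) -> bool:
    """Return True if the dynamic loader can open `name`, False otherwise.

    Hypothetical helper for illustration: it asks the OS loader to open
    the library by name, which is the kind of lookup that fails in the log.
    """
    try:
        ctypes.CDLL(name)
        return True
    except OSError:
        # Raised when the file is missing from the loader's search path
        # or cannot be opened as a shared object.
        return False


if __name__ == "__main__":
    lib = "libcudart.so.11.0"  # library name taken from the error message
    print(f"{lib} loadable: {can_load_shared_library(lib)}")
```

If this reports that the library is not loadable even after the file was copied into the image, one possibility worth checking is that the copy landed in a directory the loader does not search.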

I tried copying the libcudart package from my local machine into the Docker image and the runtime environment, but it made no difference.

I also added the libcudart package to requirement.txt, again with no result.

What could be the cause of this issue, and how can I resolve it?
