I deployed a large 3D model to AWS SageMaker. Inference takes 2 minutes or more, and I get the following error when calling the predictor from Python:
An error occurred (ModelError) when calling the InvokeEndpoint operation: Received server error (0) from model with message "Your invocation timed out while waiting for a response from container model. Review the latency metrics for each container in Amazon CloudWatch, resolve the issue, and try again."
In CloudWatch I also see some ping timeouts while the container is processing:
2020-10-07T16:02:39.718+02:00 2020/10/07 14:02:39 106#106: *251 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 10.32.0.2, server: , request: "GET /ping HTTP/1.1", upstream: "http://unix:/tmp/gunicorn.sock/ping", host: "model.aws.local:8080"
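For reference, the call looks roughly like this (a minimal sketch; the endpoint name and payload are placeholders):

```python
from sagemaker.predictor import Predictor
from sagemaker.serializers import JSONSerializer
from sagemaker.deserializers import JSONDeserializer

predictor = Predictor(
    endpoint_name="my-3d-model-endpoint",  # placeholder name
    serializer=JSONSerializer(),
    deserializer=JSONDeserializer(),
)
result = predictor.predict({"input": "..."})  # raises ModelError once the invocation times out
```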
How do I increase the invocation timeout?
Or is there a way to make asynchronous invocations to a SageMaker endpoint?
It is currently not possible to increase the invocation timeout; this is an open issue on GitHub. Looking through that issue and similar questions on SO, it seems you may be able to work around the limit by running inference as a batch transform job instead of invoking the real-time endpoint, as in the sketch below.
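A rough sketch of what that could look like with the SageMaker Python SDK; the model name, S3 paths, and instance type are placeholders you would replace with your own:

```python
from sagemaker.transformer import Transformer

# Placeholder names: point these at your own model and S3 locations.
transformer = Transformer(
    model_name="my-3d-model",            # the SageMaker model behind your endpoint
    instance_count=1,
    instance_type="ml.g4dn.xlarge",
    output_path="s3://my-bucket/batch-output/",
)

# Batch transform processes records from S3 offline, so it is not bound
# by the real-time endpoint's invocation timeout.
transformer.transform(
    data="s3://my-bucket/batch-input/",
    content_type="application/json",
    split_type="Line",
    # In newer SDK versions you can also raise the per-record timeout:
    # model_client_config={"InvocationsTimeoutInSeconds": 3600},
)
transformer.wait()  # results are written to output_path when the job finishes
```

The trade-off is that results arrive in S3 rather than in the response, so your client would poll or be notified when the job completes instead of blocking on the call.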
References
Stack Overflow answer on the SageMaker invocation timeout: https://stackoverflow.com/a/55642675/806876
SageMaker Python SDK timeout issue: https://github.com/aws/sagemaker-python-sdk/issues/1119