How to implement a beam search decoder in a SageMaker hosting endpoint?


I've created a SageMaker model for a Seq2Seq neural network, and then started a SageMaker endpoint:

import boto3

# 'sage' is the low-level SageMaker client
sage = boto3.client('sagemaker')

create_endpoint_config_response = sage.create_endpoint_config(
    EndpointConfigName=endpoint_config_name,
    ProductionVariants=[{
        'InstanceType': 'ml.m4.xlarge',
        'InitialInstanceCount': 1,
        'ModelName': model_name,
        'VariantName': 'AllTraffic'}])

create_endpoint_response = sage.create_endpoint(
    EndpointName=endpoint_name,
    EndpointConfigName=endpoint_config_name)

This standard endpoint does not support beam search. What is the best approach for creating a SageMaker endpoint that supports beam search?

1 Answer
Based on your comment, I believe the only solution is to build your own Docker container for inference. That way you can load your already-trained model and run whatever decoding logic you like, including beam search. This example is a good place to start for learning how to use custom Docker containers with SageMaker.
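To make the shape of that container concrete, here is a minimal sketch of the inference server it would run. It assumes a Flask-based serving stack; the model loading and beam_search_decode function are hypothetical placeholders for your own Seq2Seq code, not SageMaker or built-in APIs:

import json
import flask

app = flask.Flask(__name__)
model = None  # load your trained Seq2Seq model here at container startup


def beam_search_decode(model, source_tokens, beam_size=5):
    # Placeholder: swap in your real decoder, which keeps the beam_size best
    # partial hypotheses at every decoding step instead of a single greedy one.
    raise NotImplementedError


@app.route('/ping', methods=['GET'])
def ping():
    # SageMaker calls this route to verify the container is healthy
    return flask.Response(response='\n', status=200, mimetype='application/json')


@app.route('/invocations', methods=['POST'])
def invocations():
    # SageMaker forwards each InvokeEndpoint request body to this route
    payload = json.loads(flask.request.data)
    source_tokens = payload['tokens']
    hypotheses = beam_search_decode(model, source_tokens, beam_size=5)
    return flask.Response(response=json.dumps({'hypotheses': hypotheses}),
                          status=200, mimetype='application/json')

The /invocations and /ping routes are all the hosting contract requires, so everything about how you decode (greedy, beam search, sampling) is entirely up to your own code.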

For your use case, the best approach would be to find the source code for the SageMaker built-in Seq2Seq model (the built-in algorithms are also just Docker images), modify it to your needs, build the modified Docker container, and push it to Amazon ECR, from where SageMaker can pull it.
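Once the modified image is in ECR, registering it as a SageMaker model looks roughly like the following sketch. The account ID, region, repository name, role ARN, and S3 path are placeholders you would replace with your own values:

import boto3

sage = boto3.client('sagemaker')

model_name = 'seq2seq-beam-search'  # placeholder model name
role_arn = 'arn:aws:iam::123456789012:role/SageMakerExecutionRole'  # placeholder role

create_model_response = sage.create_model(
    ModelName=model_name,
    ExecutionRoleArn=role_arn,  # role SageMaker assumes to pull the image and model data
    PrimaryContainer={
        'Image': '123456789012.dkr.ecr.us-east-1.amazonaws.com/my-seq2seq-beam:latest',
        'ModelDataUrl': 's3://my-bucket/seq2seq/model.tar.gz'})

After that, the create_endpoint_config and create_endpoint calls from your question should work unchanged against this model.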

Unfortunately, I don't know whether the source code for those built-in containers is publicly available (I didn't find it on a first search).