Deploy ML Model on Azure ACI (Container) or AKS (Kubernetes)


I am exploring ways to serve my trained ML models in the most cost-effective way.

I currently have 4 different models, where the output of the 1st model forms part of the input to the 2nd model, and so on.

The current user base is very small and the number of required inferences is small and sporadic, i.e. 2-3 times every few hours, and some days with no inferences at all.
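For concreteness, the chaining I have in mind looks roughly like this (the stages below are trivial placeholders standing in for my actual trained models, not real code):

```python
# Placeholder "models": each stage consumes the previous stage's output.
# Real code would load four trained models (e.g. from pickle or ONNX files).
def make_stages():
    return [lambda x, k=k: x + k for k in range(1, 5)]

def run_pipeline(value):
    """Feed the input through all four stages in order."""
    for stage in make_stages():
        value = stage(value)
    return value
```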

First I deployed with ACI, but for some reason the container instance stayed running even when no one was accessing the endpoint. I was under the impression that the instance would stop itself to avoid being billed for unused hours.

Is this something to do with the model being deployed as a real-time endpoint? Would an AKS (Kubernetes) deployment be more suitable, i.e. will it scale down to 0 nodes when the endpoint/model is unused?


2 Answers

Answer 1:

I'll have to double-check, but I'm fairly certain you can't scale down to zero nodes with a model deployed on AKS. Perhaps consider deploying to Azure Functions instead?

https://learn.microsoft.com/en-us/azure/machine-learning/how-to-deploy-functions
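On Functions, the main thing to get right is that the model should be loaded once per worker and cached at module level, since the runtime reuses the process between warm invocations. A minimal sketch of that pattern in plain Python (the model is a stub and the Functions HTTP-trigger plumbing is omitted; see the link above for the actual deployment steps):

```python
import json

_MODEL = None  # cached at module level; survives across warm invocations

def _load_model():
    # Placeholder: real code would deserialize the trained model here.
    return lambda x: x * 2

def handler(body: str) -> str:
    """What the Functions HTTP trigger would call on each request."""
    global _MODEL
    if _MODEL is None:          # pay the load cost only on a cold start
        _MODEL = _load_model()
    value = json.loads(body)["value"]
    return json.dumps({"prediction": _MODEL(value)})
```

With a consumption plan you would then only pay per invocation, which matches the sporadic traffic described in the question.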

Answer 2:

Kubernetes supports scaling to zero only by means of an API call, since the Horizontal Pod Autoscaler only supports scaling down to a minimum of 1 replica. Please find a detailed explanation here:

In Kubernetes, how can I scale a Deployment to zero when idle
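Concretely, the API-driven scale-to-zero mentioned above amounts to commands like the following (the deployment name `ml-model` is illustrative, and these require a running cluster):

```shell
# Scale the model deployment down to zero replicas while idle
kubectl scale deployment ml-model --replicas=0

# Scale back up before serving traffic again
kubectl scale deployment ml-model --replicas=1
```

You would typically trigger these from a scheduler or an idle-detection hook, since nothing in stock Kubernetes does it for you.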

AKS (Kubernetes) is preferred over Azure Container Instances for machine learning model deployment. It's possible to connect AKS to ACI and use Kubernetes to handle orchestration and scaling.

If you are looking for multi-model deployment on Azure Container Instances, please refer to the GitHub link below:

https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/deployment/deploy-multi-model/multi-model-register-and-deploy.ipynb