Unable to run inference on a TensorFlow model hosted by the KFServing component in Kubeflow


Hi, I am using the KFServing v0.5.1 component to host my model. I can download and deploy the model from S3, but I run into an issue when I try to access it.

After deployment, KFServing reported the following endpoint:

http://recommendation-model.kubeflow-user-example-com.example.com

which I was not able to access from either outside or inside the node. After looking around, I changed my istio-ingressgateway Service from NodePort to LoadBalancer and added an sslip.io entry to the config-domain ConfigMap of knative-serving.
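
For reference, this is one way to make that switch (a sketch; the Service name and namespace are those of a standard Kubeflow install):

kubectl -n istio-system patch service istio-ingressgateway \
  -p '{"spec": {"type": "LoadBalancer"}}'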

I followed the knative-dns-config guide, adding:

<IPofMyLB>.sslip.io: ""
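
For context, a minimal sketch of the resulting ConfigMap (assuming <IPofMyLB> is the external IP assigned to the istio-ingressgateway LoadBalancer):

apiVersion: v1
kind: ConfigMap
metadata:
  name: config-domain
  namespace: knative-serving
data:
  <IPofMyLB>.sslip.io: ""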

After that I tried to run inference against the model, but I get no error and no response from the server:

curl -d '{"instances": ["abc"]}' \
  -X POST http://recommendation-model.kubeflow-user-example-com.<IPofMyLB>.sslip.io/v1/models/recommendation-model:predict

When I simply GET the inference endpoint, the output is this:

curl -v -X GET http://recommendation-model.kubeflow-user-example-com.ELBIP.sslip.io
Note: Unnecessary use of -X or --request, GET is already inferred.
*   Trying ELBIP...
* TCP_NODELAY set
* Connected to recommendation-model.kubeflow-user-example-com.ELBIP.sslip.io (ELBIP) port 80 (#0)
> GET / HTTP/1.1
> Host: recommendation-model.kubeflow-user-example-com.ELBIP.sslip.io
> User-Agent: curl/7.64.1
> Accept: */*
>
< HTTP/1.1 302 Found
< content-type: text/html; charset=utf-8
< location: /dex/auth?client_id=kubeflow-oidc-authservice&redirect_uri=%2Flogin%2Foidc&response_type=code&scope=profile+email+groups+openid&state=MTYyMjY0MjU5M3xFd3dBRUV4MVVERkljREpRVUc1SVdXeDFaVkk9fOEnkjCWGNj6WPOgFhv2BUwNSKHsYyBR2kyj9_0geX2f
< date: Wed, 02 Jun 2021 14:03:13 GMT
< content-length: 269
< x-envoy-upstream-service-time: 1
< server: istio-envoy
<
<a href="/dex/auth?client_id=kubeflow-oidc-authservice&amp;redirect_uri=%2Flogin%2Foidc&amp;response_type=code&amp;scope=profile+email+groups+openid&amp;state=MTYyMjY0MjU5M3xFd3dBRUV4MVVERkljREpRVUc1SVdXeDFaVkk9fOEnkjCWGNj6WPOgFhv2BUwNSKHsYyBR2kyj9_0geX2f">Found</a>.

* Connection #0 to host recommendation-model.kubeflow-user-example-com.ELBIP.sslip.io left intact
* Closing connection 0

Model directory structure:

recommendation_model/
└── 1
    ├── assets
    ├── keras_metadata.pb
    ├── saved_model.pb
    └── variables
        ├── variables.data-00000-of-00001
        └── variables.index
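
This layout matches what TensorFlow Serving expects (a model directory containing a numeric version directory). As a local sanity check, assuming TensorFlow is installed, the serving signature can be inspected with:

# Show the inputs/outputs that /v1/models/recommendation-model:predict will expose
saved_model_cli show --dir recommendation_model/1 \
  --tag_set serve --signature_def serving_default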

I'm not sure how to get serving to work, as this is the last part of my pipeline.

1 Answer

When you reach the InferenceService through the LoadBalancer, i.e. the istio-ingressgateway, your request passes through an extra layer of access control that the NodePort path does not have, dictated by the Istio security policy.
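
In a standard Kubeflow install this control is the AuthService, attached to the gateway via an Istio EnvoyFilter; you can list the filters to confirm (a sketch, assuming the istio-system namespace):

kubectl -n istio-system get envoyfilter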

The 302 redirect to /dex/auth in your curl output indicates that your Istio installation is protected by Dex authentication.

The istio-dex guide has examples of how to set the session cookie to authenticate your inference request.
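
As a rough sketch of the authenticated request (the cookie name authservice_session is the one the Kubeflow AuthService sets by default; the value is a placeholder you obtain by logging in through Dex, e.g. in a browser):

# Attach the session cookie obtained from a Dex login to the predict call
SESSION="<value of the authservice_session cookie>"
curl -d '{"instances": ["abc"]}' \
  -H "Cookie: authservice_session=${SESSION}" \
  http://recommendation-model.kubeflow-user-example-com.<IPofMyLB>.sslip.io/v1/models/recommendation-model:predict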