How do I convert a Pandas DataFrame into a valid MLServer V2-encoded inference payload?


I recently found the KServe and MLServer projects, which are open source tools for serving ML models. These are great. What's not so great is that both use an input format for inference requests that is new to me, documented here: https://kserve.github.io/website/modelserving/inference_api/

An input looks like

{
  "id" : "42",
  "inputs" : [
    {
      "name" : "input0",
      "shape" : [ 2, 2 ],
      "datatype" : "UINT32",
      "data" : [ 1, 2, 3, 4 ]
    },
    {
      "name" : "input1",
      "shape" : [ 1 ],
      "datatype" : "BOOL",
      "data" : [ true ]
    }
  ]
}

While I understand this format from the docs, I don't understand how I'm supposed to easily convert a Pandas DataFrame into this format. I've looked online for "Dataframe to MLserve V2 format converter" but I can't find anything.

Does anyone know how I would go about making this conversion? Surely I don't have to write my own converter, right?

Best answer:

The V2 Inference Protocol can be thought of as a lower-level spec. It doesn't try to define how to encode higher-level data types (e.g. a Pandas DataFrame) and leaves this to the inference servers themselves.

Based on this, MLServer introduces its own conventions which, if followed, ensure that the payload gets converted into a higher-level Python data type. These are covered in the Content Types section of the docs.

In particular, for Pandas DataFrames, the simplest way is to use the "codecs" that were introduced in MLServer 1.1.0. These include a set of helpers which let you do something like:

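import pandas as pd

from mlserver.codecs import PandasCodec

# Example DataFrame with three string columns
foo = pd.DataFrame({
    "A": ["a1", "a2", "a3", "a4"],
    "B": ["b1", "b2", "b3", "b4"],
    "C": ["c1", "c2", "c3", "c4"]
})

# Encode the DataFrame as a V2 inference request (one input per column)
v2_request = PandasCodec.encode_request(foo)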

Alternatively, you can craft your own payload by following the rules outlined in the docs (e.g. each column goes into a separate input).
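If you do want to build the payload by hand, a minimal sketch could look like the one below. It uses only pandas and the standard library. The helper name `dataframe_to_v2_payload` is hypothetical, and the dtype mapping, the `[rows, 1]` shape per column, and the `content_type` hints (`"pd"` on the request, `"str"` on string columns) are my reading of the MLServer content type conventions, so check them against the docs for your MLServer version. Missing values and nullable dtypes are not handled.

```python
import json

import pandas as pd


def dataframe_to_v2_payload(df: pd.DataFrame) -> dict:
    """Hypothetical helper: build a V2 request dict, one input per column."""
    inputs = []
    for column in df.columns:
        series = df[column]
        # Simplified dtype mapping (an assumption, not MLServer's own logic).
        if pd.api.types.is_bool_dtype(series):
            datatype, data = "BOOL", [bool(v) for v in series]
        elif pd.api.types.is_integer_dtype(series):
            datatype, data = "INT64", [int(v) for v in series]
        elif pd.api.types.is_float_dtype(series):
            datatype, data = "FP64", [float(v) for v in series]
        else:
            # Everything else is sent as strings under the BYTES datatype.
            datatype, data = "BYTES", [str(v) for v in series]

        entry = {
            "name": str(column),
            # Assumed convention: each column is a [rows, 1] tensor.
            "shape": [len(series), 1],
            "datatype": datatype,
            "data": data,
        }
        if datatype == "BYTES":
            # Assumed hint so the server decodes the values as strings.
            entry["parameters"] = {"content_type": "str"}
        inputs.append(entry)

    # Assumed request-level hint asking the server to rebuild a DataFrame.
    return {"parameters": {"content_type": "pd"}, "inputs": inputs}


frame = pd.DataFrame({"A": ["a1", "a2"], "n": [1, 2], "x": [0.5, 1.5]})
payload = dataframe_to_v2_payload(frame)
print(json.dumps(payload, indent=2))
```

The resulting dict can be serialised with `json.dumps` and used as the body of the inference request. Where MLServer is installed on the client side, `PandasCodec.encode_request` remains the safer choice, since it tracks whatever conventions the server actually expects.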