Convert an instance of xgboost.Booster into a model that implements the scikit-learn API

I am trying to use mlflow to save a model and then load it later to make predictions.

I'm using an xgboost.XGBRegressor model and its sklearn-style methods .predict() and .predict_proba() to make predictions, but it turns out that mlflow doesn't support models that implement the sklearn API. When I load the model later, mlflow returns an instance of xgboost.Booster, which doesn't provide the sklearn-style .predict() or .predict_proba() methods.

Is there a way to convert an xgboost.Booster back into an xgboost.sklearn.XGBRegressor object that implements the sklearn API methods?


Best answer

Have you tried wrapping your model in a custom class, then logging and loading it using mlflow.pyfunc.PythonModel? I put together a simple example, and after loading the model back it correctly shows <class 'xgboost.sklearn.XGBRegressor'> as the type.

Example:

import mlflow
import mlflow.pyfunc
import xgboost as xgb

xg_reg = xgb.XGBRegressor(...)
# Fit xg_reg on your training data before logging it.

class CustomModel(mlflow.pyfunc.PythonModel):
    def __init__(self, xgbRegressor):
        # Keep the sklearn-style regressor itself, so it is serialized as-is
        self.xgbRegressor = xgbRegressor

    def predict(self, context, input_data):
        # Show that the wrapped object is still an XGBRegressor after loading
        print(type(self.xgbRegressor))

        return self.xgbRegressor.predict(input_data)

# Log model to local directory
with mlflow.start_run():
    custom_model = CustomModel(xg_reg)
    mlflow.pyfunc.log_model("custom_model", python_model=custom_model)


# Load model back
from mlflow.pyfunc import load_model
model = load_model("/mlruns/0/../artifacts/custom_model")
model.predict(X_test)

Output:

<class 'xgboost.sklearn.XGBRegressor'>
[ 9.107417 ]
Another answer

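I have an xgboost.core.Booster object, and it can return predictions (probabilities, in my case) as follows: your_Booster_model_object.predict(your_xgboost_dmatrix_dataset). Note that the input has to be an xgboost.DMatrix rather than a plain array or DataFrame.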
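
If you only need an sklearn-style .predict(X) call on top of a loaded Booster, a thin adapter may be enough. The following is a minimal sketch, not part of xgboost or mlflow: the class name BoosterRegressorAdapter and its to_dmatrix parameter are hypothetical names introduced here for illustration. It assumes only that the wrapped object has a .predict() method taking a DMatrix, and it uses xgboost.DMatrix as the default converter.

```python
class BoosterRegressorAdapter:
    """Hypothetical adapter giving a Booster-like object an sklearn-style predict(X)."""

    def __init__(self, booster, to_dmatrix=None):
        # booster: any object whose predict() accepts the converted data,
        # for example an xgboost.Booster returned by mlflow.
        self.booster = booster
        # to_dmatrix: callable turning raw features into the booster's input type.
        # When left as None, xgboost.DMatrix is used (an assumption of this sketch).
        self._to_dmatrix = to_dmatrix

    def predict(self, X):
        convert = self._to_dmatrix
        if convert is None:
            # Imported lazily so the class can be defined without xgboost installed.
            import xgboost as xgb
            convert = xgb.DMatrix
        # Convert the features, then delegate to the wrapped booster.
        return self.booster.predict(convert(X))
```

With a real Booster you would write something like BoosterRegressorAdapter(your_Booster_model_object).predict(X_test). The adapter only covers .predict(); other sklearn methods such as .fit() or .get_params() are not provided, so the PythonModel wrapper from the best answer is the better fit if you need the full XGBRegressor object back.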