I'm training a toy model in Python, saving the model binary to disk, and then loading it on the JVM (Kotlin) and evaluating it. My predictions don't agree between Python and Kotlin. Does anyone know what I'm doing wrong?
import catboost as cb
import pandas as pd
x = pd.DataFrame(data={'a': [1, 3, 4, 99, 12],
                       'b': [0.5, 0, 1.3, 3, 44],
                       'c': [0.5, 0, 1.3, 0.91, 0],
                       'd': ['a', 'b', 'c', 'd', 'e']
                       })
y = pd.DataFrame(data={'y': [1.23, 3.2, 1.0, 1.5, 0.2]})
model = cb.CatBoostRegressor()
model.fit(x, y, cat_features=[3], verbose=False, plot=False)
proba = model.predict([1, 0.5, 0.5, 'c'])
print(proba) # 1.2274747745772228
model.save_model('./very_basic_model.cbm')
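import ai.catboost.CatBoostModel

// Kotlin side: load the saved model and predict on the same row.
// Note: the JVM does not expand "~", so pass a real path here.
val model = CatBoostModel.loadModel("~/path/very_basic_model.cbm")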
val floatFeatures = floatArrayOf(
1.0f,
0.5f,
0.5f
)
val categoricalFeatures: Array<String> = arrayOf("c")
val pred = model.predict(floatFeatures, categoricalFeatures).get(0, 0)
println(pred) // -0.198525224469103
The answer is that the CatBoost Java API wasn't correctly implemented until a fairly recent version. After I updated the Java library to version "0.24", I could confirm parity between the Python and Java predictions.
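
For anyone wanting to check parity themselves, here is a minimal sketch of how the two sets of predictions could be compared. It assumes you have copied the JVM predictions into a Python list by hand; `find_mismatches` and its tolerance defaults are hypothetical names and values chosen for illustration, not part of CatBoost. A tolerance is used rather than exact equality because the Java API takes 32-bit float features, so tiny differences may be possible.

```python
import math


def find_mismatches(python_preds, jvm_preds, rel_tol=1e-6, abs_tol=1e-9):
    """Return (index, python_value, jvm_value) for every pair that disagrees.

    The tolerances are assumptions; tighten or loosen them as needed.
    """
    if len(python_preds) != len(jvm_preds):
        raise ValueError("prediction lists must have the same length")
    return [
        (i, p, j)
        for i, (p, j) in enumerate(zip(python_preds, jvm_preds))
        if not math.isclose(p, j, rel_tol=rel_tol, abs_tol=abs_tol)
    ]


# The two values reported in the question (before upgrading the Java library).
mismatches = find_mismatches([1.2274747745772228], [-0.198525224469103])
print(mismatches)
```

An empty list means the two sides agree within the tolerance; with the values from the question, the single row is reported as a mismatch.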