I need separate ML models to predict values for the input data, so I keep a dictionary that maps each key to its own model.
I also have a generic model that makes the prediction when the key is not in the dictionary.
I trained the models like this:
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
from sklearn2pmml.pipeline import PMMLPipeline

data = {
    'Col1': ['C11', 'C12', 'C11', 'C13'],
    'Col4': [1, 2, 5, 2],
    'Col5': [6, 7, 4, 2],
    'Col6': [1, 2, 3, 4]
}
df = pd.DataFrame(data)

# Dictionary that stores one model per distinct Col1 value.
models = {}
for col1_value in df['Col1'].unique():
    # Train only on the rows that belong to this Col1 value.
    subset = df[df['Col1'] == col1_value]
    X = subset[['Col4', 'Col5']]
    y = subset['Col6']
    model = PMMLPipeline([
        ('polynomial_features', PolynomialFeatures(degree=2)),
        ('linear_regression', LinearRegression())
    ])
    model.fit(X, y)
    models[col1_value] = model

# Generic model, trained on all rows, that predicts from Col4 and Col5.
X = df[['Col4', 'Col5']]
y = df['Col6']
generic_model = PMMLPipeline([
    ('polynomial_features', PolynomialFeatures(degree=2)),
    ('linear_regression', LinearRegression())
])
generic_model.fit(X, y)
For prediction, I check whether the row's Col1 value exists in the dictionary; if it does not, I fall back to the generic model.
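def predict(row):
    # Fall back to the generic model when Col1 has no dedicated model.
    ml_model = models.get(row['Col1'], generic_model)
    # The pipelines were fitted on Col4 and Col5 only, so pass just those columns.
    features = pd.DataFrame([{'Col4': row['Col4'], 'Col5': row['Col5']}])
    return ml_model.predict(features)[0]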
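To make the dispatch behaviour I am after concrete, here is a minimal stdlib-only sketch of the lookup with fallback. The two lambdas and the `generic_model` function are hypothetical stand-ins for the fitted pipelines, not real models, and the arithmetic inside them is made up purely for illustration.

```python
from typing import Any, Callable, Dict, Mapping

# A predictor takes the two feature values (Col4, Col5) and returns a number.
Predictor = Callable[[float, float], float]

# Hypothetical stand-ins for the per-key fitted pipelines.
models: Dict[str, Predictor] = {
    "C11": lambda col4, col5: col4 + col5,
    "C12": lambda col4, col5: col4 * col5,
}


def generic_model(col4: float, col5: float) -> float:
    """Hypothetical stand-in for the generic pipeline."""
    return 0.0


def predict(row: Mapping[str, Any]) -> float:
    # Use the key-specific predictor if present, otherwise the generic one.
    predictor = models.get(row["Col1"], generic_model)
    return predictor(row["Col4"], row["Col5"])
```

Calling `predict({"Col1": "C11", "Col4": 1, "Col5": 6})` goes to the `C11` stand-in, while a row with an unseen key such as `"C99"` goes to the generic one. This is the behaviour I would like the exported artifact to reproduce.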
How do I export this setup (the per-key models plus the generic fallback) to a PMML file that I can load in Django to serve predictions from a REST endpoint?