I am trying to implement an ordinal classifier in a training exercise and am having some trouble. I cannot use a one-vs-all classifier because my classes are ordinal. There is no function for ordinal classifiers, so I found the code below on the internet (source: https://towardsdatascience.com/simple-trick-to-train-an-ordinal-regression-with-any-classifier-6911183d2a3c).
I'm confused about how I am supposed to use it, though. I have a training and a testing data set, but how do I incorporate them? For example, for logistic regression I understand you would have code like this:
model = LogisticRegression()
model.fit(x_train, y_train)
But how do I use this code? And how do I get the probabilities?
Code from the website:
from sklearn.base import clone

class OrdinalClassifier():

    def __init__(self, clf):
        self.clf = clf
        self.clfs = {}

    def fit(self, X, y):
        self.unique_class = np.sort(np.unique(y))
        if self.unique_class.shape[0] > 2:
            for i in range(self.unique_class.shape[0]-1):
                # for each k - 1 ordinal value we fit a binary classification problem
                binary_y = (y > self.unique_class[i]).astype(np.uint8)
                clf = clone(self.clf)
                clf.fit(X, binary_y)
                self.clfs[i] = clf

    def predict_proba(self, X):
        clfs_predict = {k:self.clfs[k].predict_proba(X) for k in self.clfs}
        predicted = []
        for i,y in enumerate(self.unique_class):
            if i == 0:
                # V1 = 1 - Pr(y > V1)
                predicted.append(1 - clfs_predict[y][:,1])
            elif y in clfs_predict:
                # Vi = Pr(y > Vi-1) - Pr(y > Vi)
                predicted.append(clfs_predict[y-1][:,1] - clfs_predict[y][:,1])
            else:
                # Vk = Pr(y > Vk-1)
                predicted.append(clfs_predict[y-1][:,1])
        return np.vstack(predicted).T

    def predict(self, X):
        return np.argmax(self.predict_proba(X), axis=1)
I ran into some errors when running the code, so I made some changes to it.
So, back to your question: you can use the code like this:
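A minimal sketch of a corrected class together with its usage, assuming the errors came from the missing numpy import and from looking up the fitted classifiers by class label instead of by position. The synthetic data and the names X, y, model, y_pred, cm and proba are stand-ins for illustration, not part of the original code; swap in your own x_train, y_train, x_test and y_test.

```python
import numpy as np
from sklearn.base import clone
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split


class OrdinalClassifier:
    """Wraps any binary classifier that exposes predict_proba.

    Assumes at least three ordered classes, as in the original code.
    """

    def __init__(self, clf):
        self.clf = clf
        self.clfs = {}

    def fit(self, X, y):
        y = np.asarray(y)
        self.unique_class = np.sort(np.unique(y))
        self.clfs = {}
        if self.unique_class.shape[0] > 2:
            for i in range(self.unique_class.shape[0] - 1):
                # one binary problem per threshold: is y above the i-th class?
                binary_y = (y > self.unique_class[i]).astype(np.uint8)
                clf = clone(self.clf)
                clf.fit(X, binary_y)
                self.clfs[i] = clf
        return self

    def predict_proba(self, X):
        # clfs_predict[i][:, 1] estimates Pr(y > i-th class)
        clfs_predict = {k: self.clfs[k].predict_proba(X) for k in self.clfs}
        last = len(self.unique_class) - 1
        predicted = []
        for i in range(len(self.unique_class)):
            if i == 0:
                # lowest class: 1 - Pr(y > V1)
                predicted.append(1 - clfs_predict[0][:, 1])
            elif i < last:
                # middle classes: Pr(y > V(i-1)) - Pr(y > Vi)
                predicted.append(clfs_predict[i - 1][:, 1] - clfs_predict[i][:, 1])
            else:
                # highest class: Pr(y > V(k-1))
                predicted.append(clfs_predict[i - 1][:, 1])
        return np.vstack(predicted).T

    def predict(self, X):
        # my change: map the argmax position back to the original label
        return self.unique_class[np.argmax(self.predict_proba(X), axis=1)]


# Stand-in ordinal data with labels 0 < 1 < 2 (replace with your own sets).
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 3))
score = X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.3, size=300)
y = np.digitize(score, bins=[-0.5, 0.5])
x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Same pattern as with a plain LogisticRegression: wrap, fit, predict.
model = OrdinalClassifier(LogisticRegression())
model.fit(x_train, y_train)
y_pred = model.predict(x_test)            # predicted class labels
cm = confusion_matrix(y_test, y_pred)     # rows: true class, columns: predicted
proba = model.predict_proba(x_test)       # one column per class
```

Because the k - 1 binary classifiers are fitted independently, a middle-class probability can occasionally come out slightly negative, although each row still sums to 1.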
The output will be the predicted class labels for the test set, so you can call sklearn's confusion_matrix to check the accuracy etc. You can get the probabilities for each class by calling predict_proba on the test set.
You will get the probabilities of each class as a 2D numpy array of shape m x n, where m is the number of instances and n is the number of classes.