I followed this notebook to train a PU bagging model on my dataset, but I always get 100% precision. At first I thought the mistake was in my dataset, but I also checked the precision in the notebook itself, which uses randomly generated data, and it is always 100% there as well. Is there anything wrong with the implementation, or does a PU bagging model generally give 100% precision? I don't understand the reason behind it.
This is the code I added to the notebook to print the precision:
from sklearn.metrics import confusion_matrix, accuracy_score, precision_score, recall_score, f1_score

# Ground-truth labels and predicted classes for the spy set
y_true = data_spy.y.to_list()
y_pred = data_spy.pred_y_class

# Confusion matrix
conf_matrix = confusion_matrix(y_true, y_pred, labels=clf.classes_)
print(conf_matrix)

# Accuracy, precision, recall, and F1 score (positive class = 1)
accuracy = accuracy_score(y_true, y_pred)
precision = precision_score(y_true, y_pred, pos_label=1)
recall = recall_score(y_true, y_pred, pos_label=1)
f1 = f1_score(y_true, y_pred, pos_label=1)
print(f'Accuracy: {accuracy}')
print(f'Precision: {precision}')
print(f'Recall: {recall}')
print(f'F1 Score: {f1}')
This is the output:
[[  0   0]
 [122 682]]
Accuracy: 0.8482587064676617
Precision: 1.0
Recall: 0.8482587064676617
F1 Score: 0.917900403768506
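To check whether the metric itself behaves this way, I tried a minimal sketch (with made-up labels, not the notebook's data). It shows that `precision_score` returns 1.0 whenever the true labels contain only the positive class, because no prediction can then be a false positive:

```python
from sklearn.metrics import confusion_matrix, precision_score

# Hypothetical labels: every true label is 1, mimicking a spy set
# drawn only from known positives.
y_true = [1] * 10
y_pred = [0, 1, 1, 1, 0, 1, 1, 1, 1, 1]  # the model misses some positives

cm = confusion_matrix(y_true, y_pred, labels=[0, 1])
print(cm)  # first row (true class 0) is all zeros: a false positive is impossible

precision = precision_score(y_true, y_pred, pos_label=1)
print(precision)  # 1.0, since precision = TP / (TP + FP) and FP is always 0
```

The first row of my confusion matrix above is also all zeros, which matches this pattern.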