I need to classify text into emotion labels. I'm using multi-label classification because the same text can contain more than one emotion, but I want some labels to be mutually exclusive (disjoint), such as happy/sad or calm/angry.
Let's imagine that I have this code in Python:
from simpletransformers.classification import (
    MultiLabelClassificationModel, MultiLabelClassificationArgs
)
model_args = MultiLabelClassificationArgs(num_train_epochs=1)
# Create a MultiLabelClassificationModel and pass it the training arguments
model = MultiLabelClassificationModel(
    "roberta", "roberta-base", num_labels=4, args=model_args,
)
with this sample:
train_data = [
    ["AAA", [1, 0, 0, 1]],
    ["BBB", [0, 1, 1, 1]],
    ["CCC", [1, 0, 1, 1]],
]
and I want the first and second labels to be disjoint, so that they are never predicted together. How could I do that?
I suggest putting the effort into post-processing (i.e. after obtaining the prediction logits).
Assuming you have already selected thresholds (for example, from an ROC curve), one possible way is to resolve conflicts between the disjoint labels at that stage.
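A minimal sketch of this kind of post-processing, under a few assumptions: the scores are per-label probabilities (for example, the raw outputs returned alongside the predictions by `model.predict`), labels 0 and 1 are the disjoint pair, and a conflict is resolved by keeping whichever label has the higher score. The function name `enforce_disjoint`, the score values, and the thresholds are all hypothetical.

```python
def enforce_disjoint(scores, thresholds, disjoint_pairs):
    """Threshold per-label scores, then resolve conflicts in disjoint pairs.

    scores:         one row of per-label scores per text
    thresholds:     one threshold per label
    disjoint_pairs: (i, j) index pairs that must not both be predicted
    """
    preds = []
    for row in scores:
        # Step 1: ordinary multi-label thresholding.
        labels = [int(s >= t) for s, t in zip(row, thresholds)]
        # Step 2: if both labels of a disjoint pair fired,
        # drop the one with the lower score (ties keep label i).
        for i, j in disjoint_pairs:
            if labels[i] and labels[j]:
                loser = j if row[i] >= row[j] else i
                labels[loser] = 0
        preds.append(labels)
    return preds


# Hypothetical scores for three texts and four labels.
raw_outputs = [
    [0.9, 0.7, 0.2, 0.8],  # labels 0 and 1 both pass; label 0 scores higher
    [0.6, 0.8, 0.9, 0.1],  # labels 0 and 1 both pass; label 1 scores higher
    [0.3, 0.4, 0.7, 0.6],  # neither label 0 nor 1 passes; nothing to resolve
]
thresholds = [0.5, 0.5, 0.5, 0.5]

predictions = enforce_disjoint(raw_outputs, thresholds, disjoint_pairs=[(0, 1)])
print(predictions)
```

If the labels use different thresholds, comparing the margins (score minus threshold) instead of the raw scores may be a fairer tie-break; which rule is right depends on what a wrong label costs in your application.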
In conclusion, be flexible and focus on your "business" objectives (or other higher-level objectives) once you have the logits from the ML model! :)