I am working on a relation extraction problem using the T5 encoder-decoder model with the prefix 'summary'. I have fine-tuned the model, but I am confused about which evaluation metrics to use for my results.
Are there any statistical measures for this? I have read about ROUGE, but it is not suitable for me because the direction of the triplets is important.
For example: text = "company A is acquired by B", so the prediction should be " A | B | Acquired-Acquiree ".
How can I evaluate these results?
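For relation extraction, I don't think ROUGE would be ideal. At the end of the day, relation extraction is essentially a classification task, so you can use precision, recall, and of course the F1 score. If a predicted triplet only counts as correct when it matches a gold triplet exactly, a reversed direction is scored as an error, which is what you want. There is also Ign F1, which is often used specifically for relation extraction; it is basically the F1 score computed while ignoring relational facts that also appear in the training data.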
To make sure, I checked the relation extraction page on Papers with Code, and sure enough, they use F1 and Ign F1 for their benchmarks, as do most papers in the field.
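
As a rough illustration of how this could look for generated triplets, here is a minimal sketch in plain Python. It assumes the model emits one "head | tail | relation" triplet per line, as in the example from the question; the helper names `parse_triplets` and `micro_prf` and the `ignore` parameter are made up for this sketch and do not come from any library. Triplets are compared as ordered tuples, so a swapped head and tail counts as wrong. The `ignore` set is a simplified stand-in for Ign F1 (it drops the given facts from both predictions and gold before counting) and may differ from the official scoring script of a particular benchmark.

```python
from typing import FrozenSet, Iterable, Set, Tuple

Triplet = Tuple[str, str, str]  # (head, tail, relation), order matters


def parse_triplets(text: str, sep: str = "|") -> Set[Triplet]:
    """Parse lines of 'head | tail | relation' into a set of ordered triplets.

    Assumption: one triplet per line; malformed lines are skipped.
    """
    triplets: Set[Triplet] = set()
    for line in text.splitlines():
        parts = [p.strip().lower() for p in line.split(sep)]
        if len(parts) == 3 and all(parts):
            triplets.add((parts[0], parts[1], parts[2]))
    return triplets


def micro_prf(
    predictions: Iterable[str],
    references: Iterable[str],
    ignore: FrozenSet[Triplet] = frozenset(),
) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall and F1 over exact triplet matches.

    `ignore` (hypothetical parameter) holds triplets seen in the training
    data; passing it gives a simplified Ign-F1-style score.
    """
    tp = n_pred = n_gold = 0
    for pred_text, gold_text in zip(predictions, references):
        pred = parse_triplets(pred_text) - ignore
        gold = parse_triplets(gold_text) - ignore
        tp += len(pred & gold)  # exact match, so direction is respected
        n_pred += len(pred)
        n_gold += len(gold)
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# The second prediction has the direction reversed, so it is not a match.
preds = ["A | B | Acquired-Acquiree", "B | A | Acquired-Acquiree"]
golds = ["A | B | Acquired-Acquiree", "A | B | Acquired-Acquiree"]
print(micro_prf(preds, golds))  # (0.5, 0.5, 0.5)
```

Exact matching is strict: a small difference in how an entity is written makes the whole triplet wrong, so you may want to normalize entity strings more aggressively before comparing.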