I am trying to calculate the accuracy and specificity of an NER model using spaCy's API. The scorer.scores(example) method computes the recall, precision and F1 score for the spans predicted by the model, but it does not expose the underlying counts of true positives (TP), false positives (FP), true negatives (TN) or false negatives (FN).
Below is the code I have written so far, with an example of the data structure I use to pass the expected entities to the model.
Code Being Used to Score the Model:
import spacy
from spacy.scorer import Scorer
from spacy.training.example import Example

scorer = Scorer()
example = []
for obs in example_list:
    print('Input for a prediction:', obs['full_text'])
    # custom_nlp is the custom model I am using to generate docs
    pred = custom_nlp(obs['full_text'])
    print('Predicted based off of input:', pred, '// Entities being reviewed:', obs['entities'])
    # Pair the predicted doc with the gold-standard entity offsets
    temp = Example.from_dict(pred, {'entities': obs['entities']})
    example.append(temp)

# Score the entity spans over all collected examples
scores = scorer.score_spans(example, "ents")
The data structure I am currently using to load the Example class (a list of dictionaries):
example_list[0]
{'full_text': 'I would like to remove my kid Florence from the will. How do I do that?', 'entities': [(30, 38, 'PERSON')]}
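For reference, each entity tuple is (start character, end character, label), with the end offset exclusive. A quick sanity check on the sample above, using nothing but plain string slicing (no spaCy involved), shows which text the offsets select:

```python
# The sample observation from example_list[0], copied verbatim.
text = 'I would like to remove my kid Florence from the will. How do I do that?'
start, end, label = (30, 38, 'PERSON')

# Character offsets are 0-based and the end offset is exclusive.
span_text = text[start:end]
print(label, repr(span_text))
```

If the slice does not return the entity you expect, the offsets are misaligned and the scores computed from them will be misleading.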
The result I get from running print(scores) is as expected: a dictionary with the overall precision, recall and F1 score for entity recognition, plus the same three metrics per entity type.
{'ents_p': 0.8731019522776573,
'ents_r': 0.9179019384264538,
'ents_f': 0.8949416342412452,
'ents_per_type': {'PERSON': {'p': 0.9039145907473309,
'r': 0.9694656488549618,
'f': 0.9355432780847145},
'GPE': {'p': 0.7973856209150327,
'r': 0.9384615384615385,
'f': 0.8621908127208481},
'STREET_ADDRESS': {'p': 0.8308457711442786,
'r': 0.893048128342246,
'f': 0.8608247422680412},
'ORGANIZATION': {'p': 0.9565217391304348,
'r': 0.7415730337078652,
'f': 0.8354430379746837},
'CREDIT_CARD': {'p': 0.9411764705882353, 'r': 1.0, 'f': 0.9696969696969697},
'AGE': {'p': 1.0, 'r': 1.0, 'f': 1.0},
'US_SSN': {'p': 1.0, 'r': 1.0, 'f': 1.0},
'DOMAIN_NAME': {'p': 0.4, 'r': 1.0, 'f': 0.5714285714285715},
'TITLE': {'p': 0.8709677419354839, 'r': 0.84375, 'f': 0.8571428571428571},
'PHONE_NUMBER': {'p': 0.8275862068965517,
'r': 0.8275862068965517,
'f': 0.8275862068965517},
'EMAIL_ADDRESS': {'p': 1.0, 'r': 1.0, 'f': 1.0},
'DATE_TIME': {'p': 1.0, 'r': 1.0, 'f': 1.0},
'NRP': {'p': 1.0, 'r': 1.0, 'f': 1.0},
'IBAN_CODE': {'p': 1.0, 'r': 1.0, 'f': 1.0},
'IP_ADDRESS': {'p': 0.75, 'r': 0.75, 'f': 0.75},
'ZIP_CODE': {'p': 0.8333333333333334,
'r': 0.7142857142857143,
'f': 0.7692307692307692},
'US_DRIVER_LICENSE': {'p': 1.0, 'r': 1.0, 'f': 1.0}}}
How can I extract the TP, FP, TN and FN counts from this function, for example through some form of attribute?
Copied from https://github.com/explosion/spaCy/discussions/12682#discussioncomment-6036758:
I asked the spaCy moderators about this. Reference: https://github.com/explosion/spaCy/discussions/12682
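Independently of the linked discussion, here is a minimal sketch of how the counts could be derived by hand. It is not taken from the discussion and is not spaCy's own implementation. It assumes entities are compared as exact (start, end, label) tuples, which you could build from example.predicted.ents and example.reference.ents via ent.start_char, ent.end_char and ent.label_. Note that TN is not well defined at the span level, because the number of "spans that are not entities" is unbounded, so the sketch falls back to a token-level binary view (entity token vs. non-entity token) for accuracy and specificity. The function names, the extra predicted span and the token count of 18 are all hypothetical and only there for illustration.

```python
def span_counts(pred_spans, gold_spans):
    """Span-level TP, FP, FN for exact (start, end, label) matches."""
    pred, gold = set(pred_spans), set(gold_spans)
    tp = len(pred & gold)   # predicted and annotated
    fp = len(pred - gold)   # predicted but not annotated
    fn = len(gold - pred)   # annotated but not predicted
    return tp, fp, fn


def token_counts(n_tokens, pred_tokens, gold_tokens):
    """Token-level TP, FP, FN, TN, treating each token as entity or not."""
    pred, gold = set(pred_tokens), set(gold_tokens)
    tp = len(pred & gold)
    fp = len(pred - gold)
    fn = len(gold - pred)
    tn = n_tokens - len(pred | gold)  # tokens neither side marks as entity
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    total = tp + fp + fn + tn
    return (tp + tn) / total if total else 0.0


def specificity(fp, tn):
    return tn / (tn + fp) if (tn + fp) else 0.0


# Gold annotation from the question; the second predicted span is a
# hypothetical false positive added only to make the counts non-trivial.
gold_spans = [(30, 38, 'PERSON')]
pred_spans = [(30, 38, 'PERSON'), (48, 52, 'GPE')]
span_result = span_counts(pred_spans, gold_spans)
print('span-level (tp, fp, fn):', span_result)

# Hypothetical token view: 18 tokens in the sentence, token 7 is the
# gold entity, tokens 7 and 10 are predicted as entities.
tp, fp, fn, tn = token_counts(18, {7, 10}, {7})
print('token-level (tp, fp, fn, tn):', (tp, fp, fn, tn))
print('accuracy:', accuracy(tp, fp, fn, tn))
print('specificity:', specificity(fp, tn))
```

To aggregate over a whole evaluation set, sum the counts across all examples first and compute the ratios once at the end, rather than averaging per-example ratios. As far as I know, spaCy's PRFScore class in spacy.scorer also keeps tp, fp and fn counters, but I have not confirmed whether score_spans exposes them in its return value, so treat that as something to verify against your spaCy version.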