How to compute Precision, Recall and F1 score after training SSD


I am new to deep learning and I want to evaluate a model that was trained for a certain number of epochs using the F1 score. I believe this requires calculating precision and recall first.

The model being trained is SSD-300-Tensorflow. Is there code or something similar that can produce these results? I am not using scikit-learn or anything else, as I am not sure whether it is required to calculate the score, but any guidance is appreciated.

There is a file for evaluation in the folder tf_extended called metrics.py, which has code for precision and recall. After training, I have the checkpoints in the logs folder. How can I calculate my metrics? I am using Google Colab due to hardware limitations (GPU issues).


BEST ANSWER

If you are using the Tensorflow Object Detection API, it provides a way to run model evaluation that can be configured for different metrics. A tutorial on how to do this is here.

The COCO evaluation metrics include analogous measures of precision and recall for object detection use cases. A good overview of these metrics is here. The concepts of precision and recall need to be adapted somewhat for object detection scenarios, because you have to define how closely a predicted bounding box needs to match the ground-truth bounding box to be counted as a true positive.
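The usual "closeness" criterion is Intersection over Union (IoU): a detection counts as a true positive when its IoU with a ground-truth box meets some threshold (COCO sweeps thresholds from 0.5 to 0.95). A minimal sketch of the IoU computation, assuming boxes in [x1, y1, x2, y2] format:

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes in [x1, y1, x2, y2] format."""
    # Coordinates of the intersection rectangle
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    # Clamp to zero when the boxes do not overlap
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

# Overlap of 25 out of a union of 175: IoU ≈ 0.143, below a 0.5 threshold
print(iou([0, 0, 10, 10], [5, 5, 15, 15]))
```

With a threshold of 0.5, the detection above would be a false positive even though the boxes partially overlap.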

I am not certain what the analogue of the F1 score would be for object detection scenarios. Typically, I have seen models compared using mAP (mean average precision) as the single evaluation metric.
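For context, average precision (AP) for one class is the area under the precision-recall curve traced out as the confidence threshold is lowered, and mAP averages AP over classes (and, for COCO, over IoU thresholds). A rough sketch of the step-wise area computation, using made-up precision/recall values:

```python
def average_precision(recalls, precisions):
    """Step-wise area under a precision-recall curve:
    AP = sum over i of (r_i - r_{i-1}) * p_i.
    Assumes recalls are sorted in increasing order."""
    ap = 0.0
    prev_r = 0.0
    for r, p in zip(recalls, precisions):
        ap += (r - prev_r) * p
        prev_r = r
    return ap

# Toy curve: recall rises (and precision falls) as the threshold is lowered
print(average_precision([0.2, 0.4, 0.6, 0.8, 1.0],
                        [1.0, 1.0, 0.8, 0.6, 0.5]))  # 0.78
```

Real implementations (e.g. the COCO API) also interpolate the precision curve before integrating; this sketch skips that step for brevity.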

ANSWER

You should first calculate the false positives, false negatives, true positives and true negatives. To obtain these values you must evaluate your model on a test dataset. This link might help.

With these formulas you can calculate precision and recall. Here is some example code:

import numpy as np

y_hat = []
y = []
threshold = 0.5

# Collect predictions and ground-truth labels over the whole test set
for data, label in test_dataset:
    y_hat.extend(model.predict(data))
    y.extend(label.numpy()[:, 1])

y_hat = np.asarray(y_hat)
y = np.asarray(y)
m = len(y)

# Binarize the positive-class probabilities at the chosen threshold
y_hat = np.asarray([1 if i > threshold else 0 for i in y_hat[:, 1]])

true_positive = np.logical_and(y, y_hat).sum()
true_negative = np.logical_and(np.logical_not(y_hat), np.logical_not(y)).sum()
false_positive = np.logical_and(np.logical_not(y), y_hat).sum()
false_negative = np.logical_and(np.logical_not(y_hat), y).sum()

# Sanity check: the four counts must cover every example
total = true_positive + true_negative + false_negative + false_positive
assert total == m

precision = true_positive / (true_positive + false_positive)
recall = true_positive / (true_positive + false_negative)
accuracy = (true_positive + true_negative) / total
f1score = 2 * precision * recall / (precision + recall)
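Although the question mentions not using scikit-learn, it is preinstalled on Google Colab and provides a convenient cross-check of the manual computation above. A small self-contained example on toy labels (the `y_true`/`y_pred` arrays are made up for illustration):

```python
import numpy as np
from sklearn.metrics import precision_recall_fscore_support

# Toy binary labels: 3 true positives, 1 false positive, 1 false negative
y_true = np.array([1, 0, 1, 1, 0, 1])
y_pred = np.array([1, 0, 0, 1, 1, 1])

precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="binary")
print(precision, recall, f1)  # 0.75 0.75 0.75
```

Note that both this and the manual version above are classification metrics; for the detection setting, they should be applied to matched boxes at a fixed IoU threshold rather than to raw image labels.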