I am trying to create a confusion matrix (CM). I have my y_pred and obviously I need my ground truths; for this I am trying to use testdata.classes (this is what they do online, testdata is an instance of ImageDataGenerator).
However, .classes seems to just return a sorted list of all of my class labels rather than a list of labels that corresponds to my predictions. Because of this, I think I get a very inaccurate CM. How can I get the ground truths for my predictions?
Here is an example of what I mean about .classes: the list just goes in order, 0-15. Incidentally, my model is 95% accurate, so I would expect these to line up much better.
I would expect y_pred and dataset.classes to be the same 95% of the time.
This is a common issue. generator.classes should not be used as ground truth labels, because its order is not guaranteed to match the order in which you get predictions (for example, when the generator shuffles its samples). Any metric you compute against it can therefore be wrong.

A general and correct way to do it is to iterate over the generator (assuming it is a subclass of Sequence) and compute predictions batch by batch. Each batch also gives you access to its true labels, so you can accumulate both in lists and then combine them to compute your desired metric.
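The batch-wise loop described above could look roughly like the sketch below. It is not the original snippet: the function name collect_labels_and_predictions is hypothetical, and it assumes the generator yields (x, y) batches with one-hot labels (class_mode="categorical") and that the model outputs per-class scores of shape (batch, n_classes).

```python
import numpy as np


def collect_labels_and_predictions(model, generator):
    """Return (y_true, y_pred) as aligned 1-D arrays of class indices.

    Assumptions (not from the original post): `generator` supports len()
    and integer indexing like a Keras Sequence, yields (x, y) batches with
    one-hot labels, and `model.predict_on_batch` returns per-class scores.
    """
    y_true, y_pred = [], []
    for i in range(len(generator)):  # len() is the number of batches
        x, y = generator[i]  # labels come from the same batch as the inputs
        scores = model.predict_on_batch(x)
        y_true.extend(np.argmax(y, axis=1))  # one-hot -> class index
        y_pred.extend(np.argmax(scores, axis=1))  # scores -> predicted class
    return np.array(y_true), np.array(y_pred)
```

Because labels and predictions are taken from the same batch, they stay aligned even if the generator shuffles. The two arrays can then be passed to something like sklearn.metrics.confusion_matrix(y_true, y_pred). If your generator uses integer labels (class_mode="sparse"), the argmax on y would have to be dropped.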