I am working on the CIFAR-10 dataset and trying to understand how the model predicts an image (i.e. explainability and interpretability of AI). One way to do this is occlusion sensitivity. I found this wonderful code on Occlusion Sensitivity on Kaggle and I am using the same code (but for 3 channels). However, I am finding it difficult to understand the heatmap in the link above. I understand the definition of occlusion sensitivity; it is pretty simple: the idea is to occlude the object in an image by overwriting it with a rectangular grey patch and then to test whether the model can still detect the object. With the code in the link I have three problems:
print('Predicted class: {} (prob: {})'.format(np.argmax(pred), np.amax(out)))
In the above code, np.amax(out) is used for the probability, but since the variable "out" is assigned inside the loop, np.amax(out) returns the maximum value from the last iteration of the for loop, which might or might not correspond to the desired class. Shouldn't this be replaced with np.amax(pred)? For example: if the correct class is 2 and the last iteration predicts class 5, np.amax(out) will return the probability of class 5 and not class 2. Am I missing something here, or is my understanding of the code correct?
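To make the structure I mean explicit, here is a simplified sketch of how I read the loop (not the exact notebook code; `model` is assumed to be a Keras-style classifier whose predict() returns class probabilities, and the patch size of 4 and grey value of 0.5 are my assumptions):

    import numpy as np

    patch_size = 4   # assumption; not necessarily the value used in the notebook

    # prediction on the full, unoccluded (32, 32, 3) image, assumed scaled to [0, 1]
    pred = model.predict(image[np.newaxis, ...], verbose=0)[0]

    for top in range(0, 32, patch_size):
        for left in range(0, 32, patch_size):
            occluded = image.copy()
            # overwrite only this rectangle with a grey value; the rest stays untouched
            occluded[top:top + patch_size, left:left + patch_size, :] = 0.5
            # `out` is overwritten on every iteration, so after the loop it only
            # holds the probabilities for the *last* patch position
            out = model.predict(occluded[np.newaxis, ...], verbose=0)[0]

    # np.argmax(pred) refers to the unoccluded prediction, while np.amax(out)
    # refers to the last occluded image -- this mismatch is my first doubt
    print('Predicted class: {} (prob: {})'.format(np.argmax(pred), np.amax(out)))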
It is mentioned in the link:
heatmap (low value = important field) and predicted-class map
Now, I understand that when the grey patch covers important information in the image, the heatmap shows darker shades, which means that occluding this part of the image drastically decreases the probability of the class (the correct class, if the model has high accuracy) predicted by the model. But I do not understand the predicted-class map. How are other classes predicted in those areas? When there is a patch between 15 and 20 (on the y-axis), the probability of class 2 decreases, but how did the neural network predict class 8 and class 9 instead?
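For clarity, this is how I understand the two maps being filled, one cell per patch position (again only a sketch under the same assumptions as above; `true_class` would be 8 for my ship image):

    n = 32 // patch_size
    heatmap = np.zeros((n, n))               # probability of the true class per patch position
    class_map = np.zeros((n, n), dtype=int)  # class with the highest probability per patch position

    for top in range(0, 32, patch_size):
        for left in range(0, 32, patch_size):
            occluded = image.copy()
            occluded[top:top + patch_size, left:left + patch_size, :] = 0.5
            out = model.predict(occluded[np.newaxis, ...], verbose=0)[0]
            heatmap[top // patch_size, left // patch_size] = out[true_class]   # low value = important field
            class_map[top // patch_size, left // patch_size] = np.argmax(out)  # what the model predicts instead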
Now, I am conflicted about how occlusion sensitivity really works:
- When the image is occluded with the grey patch and given to the model, the model sees the entire image except for the grey patch area and makes its prediction based on what it sees (this is what I think happens, but it doesn't make sense when I look at the predicted-class map), or
- When the image is occluded with the grey patch and given to the model, the model sees only the image under the grey patch area and makes its prediction based on what it sees in that area (this doesn't make sense to me, but with it I can explain the predicted-class map).
Which of the above is correct?
I am using the ship image from the CIFAR-10 dataset (i.e. Class 8)
and the occluded images are
The outputs are:
Correct class: [8]
Predicted class: 8 (prob: 0.9992953538894653) #Here I have replaced np.amax(out) with np.amax(pred) (Doubt 1)
Predictions:
Class id 8: Count 8
Class id 0: Count 37
Class id 3: Count 2
Class id 1: Count 13
Class id 2: Count 4
The heatmap and the predicted-class map:
As you can see, when parts of the image are occluded the model most often predicts class 0 (Count is 37), but when the entire image is given to the model (i.e. no occlusion) it predicts class 8. Why so?
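For reference, the counts above are just a tally of the winning class over all patch positions, roughly like this (a sketch, reusing `class_map` from the sketch above):

    from collections import Counter

    counts = Counter(class_map.flatten().tolist())
    for class_id, count in counts.items():
        print('Class id {}: Count {}'.format(class_id, count))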
Also, what kind of image is better for training the model: a greyscale image or a colour (3-channel RGB) image?
If anyone could help me understand the predicted-class map, or provide me with a good link, that would be perfect!
Note: I have already referred to the following links:
but unfortunately, I still do not really get the core of occlusion sensitivity. I get the intuition (I guess).
Thank you very much!