I'm trying to calculate the mean squared error on an image dataset (CIFAR-10). I have a NumPy array of dimension 5*10000*32*32*3, which is, in words, 5 batches of 10000 images, each with dimensions 32*32*3. These images belong to 10 categories. I have calculated the average image of each class, and now I'm trying to calculate the mean squared error of each of the 50000 images with respect to the 10 average images. Here is the code:
    for i in range(0, 5):
        for j in range(0, 10000):
            min_diff, min_class = float('inf'), 0
            for avg in class_avg:  # class_avg holds 10 (class label, average image) pairs
                temp = mse(avg[1], images[i][j])
                if temp < min_diff:
                    min_diff = temp
                    min_class = avg[0]
            train_pred[i][j] = min_class
Problem: Is there any way to make it faster? Any NumPy magic? Thank you.
You can use `expand_dims` and `tile`. There are many ways of expanding the dimensions of an array; I will use one of them, indexing with something like `[:, None, :]`, which adds a new axis in the middle. Below is an example of how you can combine the two methods to fulfill your task:
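A minimal sketch of the `tile` approach follows. It assumes the class averages have been stacked into one array `avgs` of shape (10, 32, 32, 3) with a matching `labels` array (both names are hypothetical, standing in for the question's `class_avg` pairs), and it uses small random data in place of the real 5*10000 array.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in data (assumption): 2 batches of 4 images instead of 5 x 10000.
images = rng.random((2, 4, 32, 32, 3))
avgs = rng.random((10, 32, 32, 3))   # one average image per class (hypothetical layout)
labels = np.arange(10)               # class label for each row of avgs (hypothetical)

flat = images.reshape(-1, 32 * 32 * 3)   # (N, D): one row per image
avg_flat = avgs.reshape(10, -1)          # (10, D): one row per class average

# [:, None, :] adds a class axis in the middle; tile then copies each image 10 times.
tiled = np.tile(flat[:, None, :], (1, 10, 1))   # (N, 10, D)

# Mean squared error of every image against every class average.
mse_all = ((tiled - avg_flat[None, :, :]) ** 2).mean(axis=2)   # (N, 10)

# Pick the class with the smallest error and restore the (batch, image) layout.
train_pred = labels[mse_all.argmin(axis=1)].reshape(images.shape[:2])
```

Here `train_pred[i][j]` holds the same value the nested loops in the question would assign, provided `mse` there is the plain mean of squared differences.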
UPDATE
The purpose of using `expand_dims` and `tile` is to avoid using a for-loop. However, the `np.tile` operation creates 10 copies of the original array, which costs both memory and time when the array is large. To avoid using `np.tile`, you can try the code below:
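One way to drop `np.tile` is to let NumPy broadcasting line up the two arrays. The sketch below assumes the same hypothetical `avgs` and `labels` arrays as stand-ins for the question's `class_avg`, and the helper name `nearest_class_mse` is made up for illustration. The cast to float is there because CIFAR-10 pixels are commonly stored as `uint8`, where subtraction would wrap around.

```python
import numpy as np

def nearest_class_mse(batch, avgs):
    """Return, for each image in batch, the index of the class average with the lowest MSE.

    batch: (n, 32, 32, 3) images; avgs: (k, 32, 32, 3) class averages.
    """
    # Cast to float so that uint8 inputs do not wrap around on subtraction.
    flat = batch.reshape(len(batch), -1).astype(np.float64)     # (n, D)
    avg_flat = avgs.reshape(len(avgs), -1).astype(np.float64)   # (k, D)
    # (n, 1, D) - (1, k, D) broadcasts to (n, k, D); no explicit copies via np.tile.
    diff = flat[:, None, :] - avg_flat[None, :, :]
    return (diff ** 2).mean(axis=2).argmin(axis=1)              # (n,)

rng = np.random.default_rng(0)
images = rng.random((2, 4, 32, 32, 3))   # stand-in for the 5 x 10000 array
avgs = rng.random((10, 32, 32, 3))       # hypothetical stacked class averages
labels = np.arange(10)                   # hypothetical class labels

# Handling one batch at a time keeps the (n, k, D) temporary bounded.
train_pred = np.stack([labels[nearest_class_mse(b, avgs)] for b in images])
```

Note that the difference array still has shape (n, k, D), so for a full batch of 10000 images it may be worth splitting each batch into smaller chunks before calling the helper.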