cannot reshape array of size 1934 into shape (3,1)


I want to build my own PCA in Python for a dataset of shape (1934, 32), stored as a NumPy array (binary image file). For the PCA I need to calculate the scatter matrix. I have code that works fine on images and on arrays of shape (3, x), but it doesn't work on mine.

I tried changing the np.zeros shape and the reshape arguments to 32 and 1934, but nothing works. Here's a glimpse of the code I'm using right now:

for i in range(X.shape[1]):
    scatter_matrix += (X[:,i].reshape(3,1) - mean_vector).dot((X[:,i].reshape(3,1) - mean_vector).T)
print('Scatter Matrix:\n', scatter_matrix)

The error is "cannot reshape array of size 1934 into shape (3,1)".
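A minimal sketch of why the reshape fails, assuming X really has shape (1934, 32) as described. The random array below is a hypothetical stand-in for the dataset. Each column X[:, i] holds 1934 values, and reshape can only rearrange them into a shape whose dimensions multiply to 1934, so (3, 1) is rejected. Passing -1 lets NumPy infer the length instead of hard-coding it.

```python
import numpy as np

# Hypothetical stand-in for the dataset described above: shape (1934, 32).
rng = np.random.default_rng(0)
X = rng.random((1934, 32))

column = X[:, 0]  # one column has 1934 elements

try:
    column.reshape(3, 1)  # 3 * 1 != 1934, so NumPy raises ValueError
    reshape_failed = False
except ValueError as err:
    reshape_failed = True
    print(err)

# reshape(-1, 1) asks NumPy to infer the first dimension from the data,
# so the same line works for any column length.
col_vector = column.reshape(-1, 1)
print(col_vector.shape)
```

With this change the loop no longer depends on the number of rows, although scatter_matrix and mean_vector still have to be created with matching shapes.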


There is 1 best solution below


I found a solution: I create a scatter matrix of dimension (1934,1934) and reshape each column to (1934,1) instead of (3,1). It's working fine for now. The code looks like this:

scatter_matrix = np.zeros((1934,1934))
for i in range(X.shape[1]):
    print('first',i)
    # column i as a (1934,1) vector, centered on the mean
    A = X[:,i].reshape(1934,1) - mean
    #print(A)
    B = (X[:,i].reshape(1934,1) - mean).T
    #print(B)
    # outer product: a (1934,1934) matrix per column
    sb = A.dot(B)
    print(sb)
    #scatter_matrix += (A).dot(B)
    #print(i)
print('Scatter Matrix:\n', scatter_matrix)

But now I am stuck on the dot product computation in the above code. It takes too much time, even in the Kaggle GPU environment. I cannot even get the result for a single pass over the dataset.

Is there any solution available to make it faster?
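One possible way to speed this up, sketched under assumptions: the sum of per-column outer products is the same as a single matrix product of the centered data with its transpose, so the Python loop can be dropped. The random X is hypothetical, and the mean is assumed to be taken across columns (axis=1), as the loop's (1934,1) shapes suggest. NumPy also runs on the CPU, so a GPU environment would not be expected to help with this code. The second half assumes instead that the 1934 rows are the samples and the 32 columns are the features, which is the more common layout and gives a much smaller (32, 32) scatter matrix.

```python
import numpy as np

# Hypothetical stand-in for the dataset: shape (1934, 32).
rng = np.random.default_rng(0)
X = rng.random((1934, 32))

# Convention used by the loop above: each column X[:, i] is one sample.
# Assumption: the mean is the average over columns, shape (1934, 1).
mean_vector = X.mean(axis=1, keepdims=True)
centered = X - mean_vector                   # broadcasting, shape (1934, 32)
scatter_loop_equiv = centered @ centered.T   # shape (1934, 1934), no loop

# Alternative assumption: rows are samples and the 32 columns are features.
feature_mean = X.mean(axis=0, keepdims=True)        # shape (1, 32)
centered_rows = X - feature_mean                    # shape (1934, 32)
scatter_features = centered_rows.T @ centered_rows  # shape (32, 32)

print(scatter_loop_equiv.shape, scatter_features.shape)
```

If the rows are the samples, the (32, 32) version is the one whose eigenvectors give the principal directions of the 32 features. It equals (n - 1) times the matrix returned by np.cov(X, rowvar=False), where n is the number of rows.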