How to remove zero dimension from numpy array?


I have a dataset of shape (100000, 1024, 9), which is an MCMC chain from emcee. I am trying to thin each chain by 10 and retain the last 2000 samples of each, i.e. take chains[:, ::10, :][:, 2000:, :]. This gives me an array of shape (100000, 0, 9). I want to flatten this into an array of shape (100000, 9), but nothing I've tried works; every attempt gives the error:

IndexError: index 0 is out of bounds for axis 1 with size 0

I've tried basic index slicing and various numpy functions (squeeze, take, take_along_axis, delete), but they all result in the same error. I also tried doing it manually in a for loop, with the same error. It seems this zero dimension is stubborn.

I can recreate the error by making an array,

import numpy as np
a = np.zeros((100000, 0, 9))

from which I cannot seem to remove the 0 dimension.

How to solve this? Thanks.

Answer by Oskar Hofmann:

When a numpy array has a dimension of size 0, something has gone wrong in the logic of your code. Such an array can exist, but it is empty: it holds no elements at all.

import numpy as np
a = np.zeros((3, 0, 2))
print(a)

Output:

[]

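You cannot reshape that and just leave out the size-0 dimension. A reshape has to preserve the total number of elements, and an array of shape (100000, 0, 9) has 100000 * 0 * 9 = 0 elements, while the target shape (100000, 9) needs 900000.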
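A minimal sketch of why this fails, assuming only numpy. The array has the same structure as the one in the question, with a smaller first axis; the variable names are illustrative.

```python
import numpy as np

# Same structure as the array in the question, with a smaller first axis.
a = np.zeros((100, 0, 9))

# A size-0 axis means the array holds no elements at all.
n_elements = a.size  # 100 * 0 * 9 == 0

# squeeze only drops axes of length 1, so the shape is unchanged.
squeezed_shape = np.squeeze(a).shape

# reshape must keep the element count, and 0 != 100 * 9.
try:
    a.reshape(100, 9)
    reshape_failed = False
except ValueError:
    reshape_failed = True

print(n_elements, squeezed_shape, reshape_failed)
```

There is no data to carry over into a (100, 9) array, so the fix has to happen in the slicing that produced the empty axis.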

Regarding your example: your second dimension contains only 1024 entries. Keeping every 10th entry works fine, but why then keep only the last 2000? You start with fewer than 2000 entries in that dimension, and after thinning you have 103. Just leave out the selection of the last 2000. If your chains may be longer in other runs, check whether the second dimension still has more than 2000 entries after thinning.

Another tip: the slice [2000:] does not give you the last 2000 entries; it gives all entries from index 2000 onward. To get the last 2000, use negative indexing (counting from the end), [-2000:]:

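# Keep every 10th step along the second axis.
thinned_chain = chains[:, ::10, :]
# Only trim when more than 2000 steps remain after thinning.
if thinned_chain.shape[1] > 2000:
    thinned_chain = thinned_chain[:, -2000:, :]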
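As a rough end-to-end check, here is a sketch that compares the slicing from the question with the negative-index version. The helper name thin_and_keep_last and the array sizes are made up for illustration (smaller than the real data, to keep memory low). Because slices are clipped to the array bounds, [-2000:] on an axis with only 103 entries simply returns all 103, so the explicit length check is optional.

```python
import numpy as np

def thin_and_keep_last(chains, thin=10, keep=2000):
    """Keep every `thin`-th step, then at most the last `keep` steps.

    Hypothetical helper for illustration; assumes keep > 0.
    """
    thinned = chains[:, ::thin, :]
    # Slices are clipped to the array bounds, so this also works when
    # fewer than `keep` steps remain after thinning.
    return thinned[:, -keep:, :]

short = np.zeros((5, 1024, 9))       # 1024 steps, as in the question
long_run = np.zeros((2, 30000, 9))   # hypothetical longer run

# The slicing from the question: index 2000 is past the end, so axis 1 is empty.
print(short[:, ::10, :][:, 2000:, :].shape)   # (5, 0, 9)
print(thin_and_keep_last(short).shape)        # (5, 103, 9)
print(thin_and_keep_last(long_run).shape)     # (2, 2000, 9)
```

Note that the result keeps three dimensions. Collapsing to two dimensions means merging axes, for example with reshape(-1, 9), which is a separate step from the thinning.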