"IndexError: too many indices for array" when trying to count unique items in numpy array column

I have a numpy array called data with 8 columns. Its rows are manipulated recursively, so it has a different number of rows each time it passes through the function I need to apply.

Inside that function I have the following line of code which should count the occurrences of each unique value that appears in the last column of the array, whatever number of rows my array has at that point:

labels, counts = np.unique(data[:,-1], return_counts=True)

This line raises IndexError: too many indices for array, which I assume has to do with how I sliced the column, but I have no idea how to fix it. I have been googling and editing, but nothing I tried fixes it. Help would be much appreciated. Thank you.

There is 1 best solution below


Is there any chance that you have a structured array with mixed dtypes? The only way I could generate that error using your indexing method was with one. For example:

a = np.arange(0, 8*5).reshape(5, 8)
a
array([[ 0,  1,  2,  3,  4,  5,  6,  7],
       [ 8,  9, 10, 11, 12, 13, 14, 15],
       [16, 17, 18, 19, 20, 21, 22, 23],
       [24, 25, 26, 27, 28, 29, 30, 31],
       [32, 33, 34, 35, 36, 37, 38, 39]])

a[:,-1]  # as a check... array([ 7, 15, 23, 31, 39])

np.unique(a[:, -1], return_counts=True)  # works as designed
(array([ 7, 15, 23, 31, 39]), array([1, 1, 1, 1, 1], dtype=int64))

# ---- a quick way to convert from uniform dtype to structured array
from numpy.lib.recfunctions import unstructured_to_structured as uts

b = uts(a)
b
array([( 0,  1,  2,  3,  4,  5,  6,  7), ( 8,  9, 10, 11, 12, 13, 14, 15),
       (16, 17, 18, 19, 20, 21, 22, 23), (24, 25, 26, 27, 28, 29, 30, 31),
       (32, 33, 34, 35, 36, 37, 38, 39)],
      dtype=[('f0', '<i4'), ('f1', '<i4'), ('f2', '<i4'), ('f3', '<i4'),
             ('f4', '<i4'), ('f5', '<i4'), ('f6', '<i4'), ('f7', '<i4')])

# ---- you can't slice a structured array by column; you have to access it through its field name
np.unique(b[:, -1], return_counts=True)

Traceback (most recent call last):
  File "<ipython-input-8-51ab6cec2618>", line 1, in <module>
    np.unique(b[:, -1], return_counts=True)
IndexError: too many indices for array: array is 1-dimensional, but 2 were indexed

# ---- to fix it, access the last field by its name
np.unique(b['f7'], return_counts=True)
(array([ 7, 15, 23, 31, 39]), array([1, 1, 1, 1, 1], dtype=int64))
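
If you don't want to hard-code 'f7', here is a minimal sketch of a helper that picks the last field by name when the array is structured and falls back to ordinary column slicing otherwise. The name count_last_column is hypothetical, not part of numpy. The np.atleast_2d branch is an assumption on my part: it covers the case where the recursion leaves a single row as a plain 1-D array, which raises the same IndexError when indexed with [:, -1].

```python
import numpy as np
from numpy.lib.recfunctions import unstructured_to_structured as uts


def count_last_column(data):
    """Count unique values in the last column (or last field) of `data`.

    Hypothetical helper for illustration only.
    """
    data = np.asarray(data)
    if data.dtype.names is not None:
        # Structured array: it is a 1-D array of records, so the "last
        # column" is the last named field rather than an axis-1 slice.
        column = data[data.dtype.names[-1]]
    else:
        # Plain array: promote a lone 1-D row to shape (1, n) before slicing
        # (assumption: a 1-D input represents a single row).
        column = np.atleast_2d(data)[:, -1]
    return np.unique(column, return_counts=True)


a = np.arange(0, 8 * 5).reshape(5, 8)
b = uts(a)

# A structured array reports one dimension, which is why b[:, -1] fails.
print(b.ndim, b.shape, b.dtype.names is not None)

labels, counts = count_last_column(b)
print(labels, counts)
```

Printing data.ndim, data.shape and data.dtype just before the failing line in your own function should tell you which of the two cases you are hitting.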