recarray with lists: how to reference first element in list

I want to copy the contents of a few fields of a record array into an ndarray (both of type float64). I know how to do this when each field of the recarray holds a single value:

my_ndarray[:,0]=my_recarray['X']  #(for field 'X')

Now I have a recarray with a list of 5 floats in each field, and I only want to copy the first element of each list. When I use the above with the new recarray (and list), I get this error:

ValueError: could not broadcast input array from shape (92,5) into shape (92)

That makes total sense (in hindsight).

I thought I could get just the first element of each with this:

my_ndarray[:,0]=my_recarray['X'][0]  #(for field 'X')

I get this error:

ValueError: could not broadcast input array from shape (5) into shape (92)

I sort of understand: numpy is taking only the first row (5 elements) and trying to broadcast it into a 92-element column.

So now I'm wondering how to get the first element of each list down the 92-element column. Scratching my head... Thanks in advance for any advice.

Best answer (by hpaulj)

My guess is that the recarray has a dtype where one of the fields has shape 5:

In [48]: dt = np.dtype([('X',int,5),('Y',float)])
In [49]: arr = np.zeros(3, dtype=dt)
In [50]: arr
Out[50]: 
array([([0, 0, 0, 0, 0], 0.), ([0, 0, 0, 0, 0], 0.),
       ([0, 0, 0, 0, 0], 0.)], dtype=[('X', '<i8', (5,)), ('Y', '<f8')])

Accessing this field by name produces an array of shape (3,5), analogous to your (92,5):

In [51]: arr['X']
Out[51]: 
array([[0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0]])

This could be described as a list of 5 items for each record, but indexing with the field name produces a 2d array, which can be indexed like any 2d numpy array.
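That also accounts for both error messages in the question. As a small sketch (the dtype here is my guess at the layout, not the asker's actual one), compare the shapes you get from indexing the field array by row and by column:

```python
import numpy as np

# Assumed layout: field 'X' holds 5 values per record, 'Y' holds a scalar.
dt = np.dtype([('X', np.int64, (5,)), ('Y', np.float64)])
arr = np.zeros(3, dtype=dt)

field = arr['X']           # 2d array, one row per record
first_record = field[0]    # row 0: all 5 values of the first record
first_column = field[:, 0] # column 0: first value of every record

print(field.shape, first_record.shape, first_column.shape)
# (3, 5) (5,) (3,)
```

So `['X'][0]` selects one record's 5 values, while `['X'][:, 0]` selects one value from every record, which is the shape a column assignment needs.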

Let's set those values to something interesting:

In [52]: arr['X'] = np.arange(15).reshape(3,5)
In [53]: arr
Out[53]: 
array([([ 0,  1,  2,  3,  4], 0.), ([ 5,  6,  7,  8,  9], 0.),
       ([10, 11, 12, 13, 14], 0.)],
      dtype=[('X', '<i8', (5,)), ('Y', '<f8')])

We can fetch the first column of this field with:

In [54]: arr['X'][:,0]
Out[54]: array([ 0,  5, 10])
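
Applied to the case in the question, it might look like the following. This is only a sketch: I'm assuming a 92-record recarray with a 5-float field 'X' and a scalar field 'Y', and the names `my_recarray` and `my_ndarray` are taken from the question, with made-up data.

```python
import numpy as np

# Hypothetical stand-in for the asker's data: 92 records,
# field 'X' holds 5 floats, field 'Y' holds a single float.
dt = np.dtype([('X', np.float64, (5,)), ('Y', np.float64)])
my_recarray = np.zeros(92, dtype=dt).view(np.recarray)
my_recarray['X'] = np.arange(92 * 5, dtype=np.float64).reshape(92, 5)

my_ndarray = np.zeros((92, 2))

# my_recarray['X'] has shape (92, 5); [:, 0] picks the first element
# of each record's list, giving shape (92,), which fits the column.
my_ndarray[:, 0] = my_recarray['X'][:, 0]

# A scalar field already has shape (92,), so it copies directly.
my_ndarray[:, 1] = my_recarray['Y']
```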

If you have several fields with a structure like this, you'll probably have to access each one by name. There's a limit to what you can do with multi-field indexing.
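
One way to handle several such fields is a plain loop over the field names. This is a sketch with an invented dtype (fields 'X', 'Y', 'Z' and the name `rec` are illustrative, not from the question):

```python
import numpy as np

# Illustrative dtype: two fields with 5 floats each, plus one scalar field.
dt = np.dtype([('X', np.float64, (5,)),
               ('Y', np.float64, (5,)),
               ('Z', np.float64)])
rec = np.zeros(4, dtype=dt)
rec['X'] = np.arange(20).reshape(4, 5)
rec['Y'] = np.arange(20).reshape(4, 5) * 10
rec['Z'] = np.arange(4)

# Copy the first element of each list-like field into its own column.
names = ['X', 'Y']
out = np.empty((rec.shape[0], len(names)))
for col, name in enumerate(names):
    out[:, col] = rec[name][:, 0]

print(out)
```

Each field is still accessed by name, one at a time; the loop just saves writing out one assignment per field.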