Strange Error when adding 2 numpy masked arrays

I'm trying to add two numpy masked arrays together. The difficulty is that they have to be added as strings, because I'm trying to get a binary code in the resulting output array. The code below is a simplified version of what I'm trying to do. The mask for both arrays will be the same (in practice these would be much larger arrays, but the idea is the same):

import numpy as np
import numpy.ma as ma

a = np.zeros((3, 3))
b = np.ones((3, 3))
amask = [[False, True, True], [True, True, False], [False, False, True]]
bmask = [[False, True, True], [True, True, False], [False, False, True]]

# Convert to strings so the elements can be concatenated rather than summed
a = a.astype('str')
b = b.astype('str')

am = ma.masked_array(a, mask=amask)
bm = ma.masked_array(b, mask=bmask)
x = np.add(am, bm)  # raises UFuncTypeError

I would like the output to be something like:

[['01', --, --], [--, --, '01'], ['01', '01', --]]

It's important that the elements are strings, so that they are concatenated rather than summed.

Running this code however gives me the following error:

numpy.core._exceptions.UFuncTypeError: ufunc 'add' did not contain a loop with signature matching types (dtype('<U32'), dtype('<U32')) -> None

I don't understand this, since both arrays clearly have the same dtype. Adding them without the string conversion works fine but doesn't give the required output. I have run into this error before and tried to look it up, but never really understood it. Thanks.

There is 1 best solution below

This isn't a masked array issue; it's a string dtype one.

In [254]: a = np.arange(4)
In [255]: a
Out[255]: array([0, 1, 2, 3])
In [256]: a+a
Out[256]: array([0, 2, 4, 6])
In [257]: a1 = a.astype(str)
In [258]: a1
Out[258]: array(['0', '1', '2', '3'], dtype='<U21')
In [259]: a1 + a1
Traceback (most recent call last):
  Input In [259] in <cell line: 1>
    a1 + a1
UFuncTypeError: ufunc 'add' did not contain a loop with signature matching types (dtype('<U21'), dtype('<U21')) -> None

astype(str) makes an array with a fixed-width numpy string dtype (here '<U21'). This is optimized for array storage, but its elements are not Python strings and, as the traceback shows, the add ufunc has no loop for it. np.char has some functions that apply string methods to these Un dtypes:

In [260]: np.char.add(a1,a1)
Out[260]: array(['00', '11', '22', '33'], dtype='<U42')
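
Applied to the masked arrays from the question, a minimal sketch might look like the following. It assumes it is acceptable to concatenate the underlying data and then re-apply the mask by hand; the names combined, combined_mask and result are only illustrative. It also builds the inputs with dtype=int, because np.zeros and np.ones default to float, so astype(str) would otherwise give '0.0' and '1.0' instead of '0' and '1'.

```python
import numpy as np
import numpy.ma as ma

# Integer inputs so that astype(str) yields '0' and '1' rather than '0.0' and '1.0'.
a = np.zeros((3, 3), dtype=int).astype(str)
b = np.ones((3, 3), dtype=int).astype(str)
mask = [[False, True, True], [True, True, False], [False, False, True]]

am = ma.masked_array(a, mask=mask)
bm = ma.masked_array(b, mask=mask)

# Concatenate the underlying string data element-wise, ignoring the masks for now.
combined = np.char.add(am.data, bm.data)

# Mask every position that is masked in either input.
combined_mask = ma.getmaskarray(am) | ma.getmaskarray(bm)
result = ma.masked_array(combined, mask=combined_mask)

print(result)
```

The masked positions still hold concatenated strings in result.data; they are only hidden by the mask, which is usually fine when the mask is what marks them as invalid.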

Or, as commented, you can make a list-like array of Python string objects:

In [261]: a2 = a1.astype(object)
In [262]: a2
Out[262]: array(['0', '1', '2', '3'], dtype=object)
In [263]: a2 + a2
Out[263]: array(['00', '11', '22', '33'], dtype=object)

For object dtype arrays, operators like + delegate the action to the methods of the elements. Equivalently:

In [264]: [i+j for i,j in zip(a2,a2)]
Out[264]: ['00', '11', '22', '33']

I expect [264] to be the fastest; numpy doesn't add much to string processing.
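
That expectation is easy to check on your own machine. The following is a rough sketch using timeit; the array size (1000) and repeat count (200) are arbitrary assumptions, the function names are made up for illustration, and the relative timings may differ between numpy versions and array sizes.

```python
import timeit
import numpy as np

a1 = np.arange(1000).astype(str)   # fixed-width numpy string dtype
a2 = a1.astype(object)             # array of Python str objects

def with_char_add():
    return np.char.add(a1, a1)

def with_object_add():
    return a2 + a2

def with_list_comprehension():
    return [i + j for i, j in zip(a2, a2)]

# Sanity check: all three approaches produce the same strings.
expected = with_list_comprehension()
assert with_char_add().tolist() == expected
assert with_object_add().tolist() == expected

for fn in (with_char_add, with_object_add, with_list_comprehension):
    seconds = timeit.timeit(fn, number=200)
    print(f"{fn.__name__}: {seconds:.4f} s for 200 runs")
```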