Python array initialisation (preallocation) with NaNs

I want to initialise an array that will hold some data. I have created a random matrix (using np.empty) and then multiplied it by np.nan. Is there anything wrong with that? Or is there a better practice that I should stick to?

To further explain my situation: I have data I need to store in an array. Say I have 8 rows of data. The number of elements in each row is not equal, so my matrix row length needs to be as long as the longest row. In other rows, some elements will not be filled. I don't want to use zeros since some of my data might actually be zeros.

I realise I could use some sentinel value that I know my data will never take, but NaNs are definitely clearer. I'm just wondering whether that can cause any issues later during processing. I realise I need to use nanmax instead of max, and so on.

There are 2 best solutions below

1
On

I have created a random matrix (using np.empty) and then multiplied it by np.nan. Is there anything wrong with that? Or is there a better practice that I should stick to?

You can use np.full, for example:

np.full((100, 100), np.nan)
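
As a minimal sketch of how this fits the situation in the question, the snippet below pads rows of unequal length into one NaN-filled array and then reduces it with nanmax. The `rows` data and the variable names are made up for illustration.

```python
import numpy as np

# Hypothetical ragged data: rows of unequal length, some holding real zeros.
rows = [[1.0, 0.0, 3.0], [4.0], [5.0, 6.0]]

width = max(len(r) for r in rows)
data = np.full((len(rows), width), np.nan)  # float array, every cell NaN

for i, r in enumerate(rows):
    data[i, :len(r)] = r  # fill only the cells that hold real data

row_max = np.nanmax(data, axis=1)  # ignores the NaN padding
plain_max = np.max(data, axis=1)   # NaN propagates wherever a row is padded
```

Note that np.nan is a float, so the resulting array has a float dtype; integer data would be converted when stored.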

However, depending on your needs, you could have a look at numpy.ma for masked arrays or scipy.sparse for sparse matrices. They may or may not be suitable, though. Either way, you may need to use the functions from the corresponding module instead of the normal numpy ufuncs.
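
For the masked-array route, a rough sketch might look like the following. It assumes the same kind of NaN-padded array as in the question (the `rows` data is invented for the example) and wraps it with np.ma.masked_invalid, so that the ordinary max and mean methods skip the padded cells.

```python
import numpy as np

# Hypothetical ragged data padded with NaN, as described in the question.
rows = [[1.0, 0.0, 3.0], [4.0], [5.0, 6.0]]
width = max(len(r) for r in rows)
data = np.full((len(rows), width), np.nan)
for i, r in enumerate(rows):
    data[i, :len(r)] = r

# Mask every NaN (and inf) cell; masked cells are skipped by reductions.
masked = np.ma.masked_invalid(data)

row_max = masked.max(axis=1)     # no need for nanmax here
row_mean = masked.mean(axis=1)   # averages only the unmasked cells
row_count = masked.count(axis=1) # number of real values per row
```

Whether this is nicer than calling the nan* functions directly is a matter of taste; the mask costs extra memory but keeps the "missing" information separate from the values.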

0
On

One approach I like, which probably isn't the best but is easy to remember, is to attach a 'nans' function to the numpy module like this:

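import numpy as np

def nans(n):
    # Return a 1-D float array of length n filled with NaN.
    return np.full(n, np.nan)

# Monkey-patch: attach the helper to the numpy module for this session only.
setattr(np, 'nans', nans)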

Now you can call np.nans in the same way as np.zeros for a 1-D array:

np.nans(10)
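
If patching the numpy module feels too intrusive, a plain helper function is one possible alternative. This is only a sketch: the name `nans` and its `shape` and `dtype` parameters are my own choice, modelled on the np.zeros signature, and are not part of numpy.

```python
import numpy as np

def nans(shape, dtype=float):
    """Return a new array of the given shape filled with NaN (hypothetical helper)."""
    return np.full(shape, np.nan, dtype=dtype)

a = nans(10)      # 1-D array of ten NaNs
b = nans((8, 5))  # 2-D array, e.g. 8 rows padded to length 5
```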