Why does select raise a FutureWarning?

In my code I have a 2D numpy.ndarray filled with numpy.str_ values. I'm trying to replace the value "null" with "nan" using np.select. The problem is that this call raises a FutureWarning.

I have read this. Following a suggestion there, I tried not to compare Python strings with NumPy strings, and instead converted the Python strings to NumPy strings up front. That doesn't help, so I'm looking for advice.

I would like to avoid suppressing the warning (as is done in the link). That seems like a very dirty approach to me.

My code snippet:

import pandas_datareader as pd
import numpy as np
import datetime as dt


start_date = dt.datetime(year=2013, month=1, day=1)
end_date = dt.datetime(year=2013, month=2, day=1)
df = pd.DataReader("AAA", "yahoo", start_date, end_date + dt.timedelta(days=1))
array = df.to_numpy()

null = np.str_("null")
nan = np.str_("nan")
array = np.select([array == null, not array == null], [nan, array])
print(array[0][0].__class__)
print(null.__class__)
C\Python\Project.py:13: FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison
  array = np.select([array == null, not array == null], [nan, array])
<class 'numpy.str_'>
<class 'numpy.str_'>

I'm quite new to Python, so any help is appreciated. Also, if you know a better way to achieve this, please let me know.

Thank you!

Edit: Sorry for that. Now it should work as it is.

There is 1 best solution below

I don't have 50 reputation yet, so I can't comment.

As I understand it, you only want to change all 'null' entries to 'nan'?

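Your code creates a NumPy array of float values, but for some reason you expect 'null' strings in the array? Comparing a float array with a string cannot be done elementwise, which is most likely what triggers the FutureWarning. Perhaps you should have written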

array = df.to_numpy()
array = array.astype(str)

to make the conversion to strings explicit.
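
You can check this yourself by looking at the dtype before and after the conversion. This is only a sketch: the small array below is a made-up stand-in for df.to_numpy(), not your actual data.

```python
import numpy as np

# Made-up stand-in for df.to_numpy(): numeric columns give a float array.
array = np.array([[1.5, np.nan], [2.0, 3.25]])
print(array.dtype)  # a float dtype, not a string dtype

# After the conversion every element is a string, including the missing value.
as_text = array.astype(str)
print(as_text.dtype.kind)  # 'U' means unicode string
print(as_text[0, 1])
```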

From here on the array consists only of strings, and to change 'null' to 'nan' you only have to write

array[array == 'null'] = 'nan'

and the warning is gone. You don't even have to use np.select.
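
Here is a small self-contained sketch of that replacement. The array contents are invented for illustration (I don't have your data), and the np.where line is just an alternative I'm adding in case you prefer getting a new array instead of modifying in place.

```python
import numpy as np

# Invented string array standing in for df.to_numpy().astype(str).
array = np.array([["1.5", "null"], ["null", "2.0"]])

# Alternative that returns a new array and leaves the original untouched.
replaced = np.where(array == "null", "nan", array)

# In-place replacement with a boolean mask, as described above.
array[array == "null"] = "nan"

print(array)
print(replaced)
```

Both comparisons are elementwise here because the array really holds strings, so there is nothing for NumPy to warn about.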

If you want floating-point values in your array, you can use NumPy's own np.nan instead of a string, and do

array = array.astype(float)

The 'nan' strings are automatically converted to np.nan, which is a float.
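
A minimal sketch of that conversion, again with invented values rather than your real data:

```python
import numpy as np

# Invented string array in which 'null' has already been replaced by 'nan'.
strings = np.array([["1.5", "nan"], ["nan", "2.0"]])

# Parsing the strings as floats turns every 'nan' string into np.nan.
floats = strings.astype(float)

print(floats.dtype)
print(np.isnan(floats))  # True where the string was 'nan'
```

Note that a leftover 'null' string would make astype(float) raise a ValueError, so do the replacement first.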