Odd behavior of np.argsort with Pandas


Here is np.argsort applied in four different ways.

import numpy as np
import pandas as pd

print(np.argsort([1, np.nan, 3, np.nan, 4]))                        # plain list
print(np.argsort(pd.DataFrame([[1, np.nan, 3, np.nan, 4]])).values)  # DataFrame
print(np.argsort(pd.Series([1, np.nan, 3, np.nan, 4]).values))       # same as first
print(np.argsort(pd.Series([1, np.nan, 3, np.nan, 4])).values)       # Series

Output:

[0 2 4 1 3]
[[0 2 4 1 3]]
[0 2 4 1 3]
[ 0 -1  1 -1  2]

The last result is very unexpected. The NumPy documentation does not mention it (understandably, since it does not cover Pandas).

The Pandas documentation for Series.argsort says:

Returns: Series[np.intp]
Positions of values within the sort order with -1 indicating nan values.
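
For reference, here is a minimal sketch of what that documented behavior seems to amount to, written in plain NumPy. The helper name argsort_with_nan_sentinel is made up for illustration and is not a Pandas API. It assumes the NaN positions are masked out, the remaining values are argsorted on their own, and -1 is written at the masked positions. It reproduces the fourth output above, but it is only an emulation and may not match every Pandas version.

```python
import numpy as np


def argsort_with_nan_sentinel(values):
    """Hypothetical helper: argsort the non-NaN values, mark NaN slots with -1."""
    arr = np.asarray(values, dtype=float)
    nan_mask = np.isnan(arr)

    # Start with -1 everywhere; NaN positions keep this sentinel.
    result = np.full(arr.shape, -1, dtype=np.intp)

    # Argsort only the non-NaN values. The resulting indices refer to the
    # NaN-free array, not to positions in the original input.
    result[~nan_mask] = np.argsort(arr[~nan_mask])
    return result


print(argsort_with_nan_sentinel([1, np.nan, 3, np.nan, 4]))
```

Note that the non-negative entries index into the NaN-free values rather than the original data, which is why they read 0, 1, 2 instead of 0, 2, 4.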

Why? In what situation would this behavior be useful?
