I am using the standard Boston housing data frame with pandas and I have noticed something that bugs me:
when I check for missing values in two different ways, I get two different results, though they should be the same.
Any ideas why this is happening?
Here's my code:
# loading df
import pandas as pd
from sklearn.datasets import load_boston

boston = load_boston()
boston_data = pd.DataFrame(data=boston.data, columns=boston.feature_names)
boston_data['price'] = boston.target  # the price column
Now if I run this code:
pd.isnull(boston_data).any()
this is the outcome:
CRIM False
ZN False
INDUS False
CHAS False
NOX False
RM False
AGE False
DIS False
RAD False
TAX False
PTRATIO False
B False
LSTAT False
dtype: bool
However, if I run it like this:
any(boston_data.isnull())
it returns: True
Why?
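pd.isnull(boston_data).any() checks each column for missing values and returns one boolean per column. In your case every column is False, so the data frame has no missing values.

any(boston_data.isnull()) is Python's built-in any(), not the pandas method. The built-in iterates over its argument, and iterating over a DataFrame yields its column labels, not its values. The labels are non-empty strings, which are truthy, so the result is True whether or not anything is missing.

To get a single boolean for the whole data frame, use boston_data.isnull().any().any() or boston_data.isnull().values.any().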
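
A minimal sketch of the difference, using a small made-up frame with no missing values in place of the Boston data (the column names and numbers here are only for illustration). It shows that the built-in any() sees the column labels rather than the cell values:

```python
import pandas as pd

# Hypothetical stand-in for boston_data: two columns, no missing values.
df = pd.DataFrame({"CRIM": [0.1, 0.2], "ZN": [18.0, 0.0]})

# pandas method: one boolean per column.
per_column = df.isnull().any()

# Built-in any(): iterates over the frame, which yields the column labels.
labels_seen = list(df.isnull())
builtin_any = any(df.isnull())

# Single boolean over every cell in the frame.
whole_frame = bool(df.isnull().values.any())

print(per_column)
print(labels_seen)
print(builtin_any)
print(whole_frame)
```

Here builtin_any comes out True even though nothing is missing, while whole_frame is False, which matches what the per-column check reports.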