only output columns with missing values and the percetage of it

139 Views Asked by Ngoc ng At 08 September 2021 at 08:26

I used this code to find the percentage of missing values in each column:

     df.isnull().mean().round(4).mul(100).sort_values(ascending=False)

The result shows all columns, which makes the output looks very long if you have a lot of columns and only need to know the percentage of the missing value ones. I would like to know how to remove all the columns that have no missing values for this code. Any other solutions are welcomed

Thanks for any answer.

Original Q&A

There are 2 best solutions below

mozway On 08 September 2021 at 08:33

You can save your nulls in an intermediate dataframe and use it for slicing:

nulls = df.isnull()
nulls.loc[:, nulls.any()].mean().round(4).mul(100).sort_values(ascending=False)

example input:

     a    b  c
0  0.0  1.0  2
1  3.0  NaN  4
2  NaN  5.0  6

output:

b    33.33
a    33.33

Daweo On 08 September 2021 at 08:43

You might simply filter what you get to only include columns which have more than 0 missing value, consider following simple example

import pandas as pd
df = pd.DataFrame({'a':[1.0,2.0,3.0],'b':[4.0,float("nan"),float("nan")],'c':[float("nan"),float("nan"),float("nan")]})
means = df.isnull().mean()
percentages = means[means>0].round(4).mul(100).sort_values(ascending=False)
print(percentages)

output

c    100.00
b     66.67
dtype: float64

only output columns with missing values and the percetage of it

There are 2 best solutions below

Related Questions in PYTHON

Related Questions in PANDAS

Related Questions in DATA-CLEANING

Related Questions in ISNULL

Related Questions in TARGETNULLVALUE

Trending Questions

Popular # Hahtags

Popular Questions