Filter rows and select columns based on NA / null percentage


I've checked the other PyPolars questions and didn't find an answer to this.

Given that df is a Polars DataFrame, is there a more idiomatic way to do the following?

  • filtering rows based on NA percentage:
max_nas_perc = 0.6
# fraction of nulls per row = null cells in the row / number of columns
df.filter(pl.sum_horizontal(pl.all().is_null()) / len(df.columns) <= max_nas_perc)
  • filtering / selecting columns based on NA percentage:
df.select(
    [
        column for i, column in
        enumerate(df.columns) if df.select(pl.all().is_null().mean() <= max_nas_perc).to_numpy()[0][i]
    ]
)

Should this be added to the list of examples for drop_nulls()? Has it already been discussed, or discarded, as a feature request?
