How can I use Ibis to fill missing values with the mean?
For example, if I have this data:
import pandas as pd
import ibis
from ibis import _
ibis.options.interactive = True
df = pd.DataFrame(data={'fruit': ['apple', 'apple', 'apple', 'orange', 'orange', 'orange'],
                        'variety': ['gala', 'honeycrisp', 'fuji', 'navel', 'valencia', 'cara cara'],
                        'weight': [134, 158, pd.NA, 142, 96, pd.NA]})
t = ibis.memtable(df)
Using Ibis code:
- How would I replace the `NA` values in the `weight` column with the overall mean of `weight`?
- How would I replace the `NA` values in the `weight` column with the mean within each group (apples, oranges)?
In the first case (replacing `NULL` with the overall mean), you can simply pass the mean of the replacement column to `fillna`, and Ibis will figure out what you mean.

In the second case, replacing the nulls per group, you can use a window function.