About pandas groupby+rolling+apply


I am trying to group a DataFrame by 'tag' and turn each rolling window of 4 'sales' values into a list, but my attempt fails. My code:

import pandas as pd

df = pd.DataFrame({
    'sales': [10, 20, 30, 40, 50, 60, 70, 80, 90, 100],
    'tag': ['A', 'A', 'A', 'B', 'B', 'B', 'B', 'C', 'C', 'C']})

df.groupby('tag')['sales'].rolling(4).apply(lambda x: x.tolist())

The error message is as follows: TypeError: must be real number, not list

Answer by inquirer:

I can suggest the following option: aggregate each group's 'sales' into a list and take the slice [:4], that is, the first four values; if a group has fewer than 4 values, return np.nan instead. Note that this yields one list per group, not one list per rolling window.

import numpy as np
import pandas as pd

amount = 4

# One list per group: the first `amount` values, or NaN if the group is too short
data = (df.groupby('tag')['sales'].
        agg(lambda x: list(x)[:amount] if len(x) >= amount else np.nan))

print(data)

Output:

tag
A                 NaN
B    [40, 50, 60, 70]
C                 NaN

If you need a DataFrame:

# as_index=False keeps 'tag' as a regular column instead of the index
data = (df.groupby('tag', as_index=False)['sales'].
        agg(lambda x: list(x)[:amount] if len(x) >= amount else np.nan))

Output:

  tag             sales
0   A               NaN
1   B  [40, 50, 60, 70]
2   C               NaN

To exclude the empty rows:

# dropna() removes the groups that had fewer than `amount` values
data = (df.groupby('tag', as_index=False)['sales'].
        agg(lambda x: list(x)[:amount] if len(x) >= amount else np.nan)).dropna()

Output:

  tag             sales
1   B  [40, 50, 60, 70]
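
If what you actually want is one list for every rolling window inside each group (rather than one list per group), the snippets above will not give you that. The TypeError in the question comes from the fact that the function passed to rolling(...).apply has to return a single number, so a list cannot come out of it. A minimal sketch of one possible workaround is to build the windows by hand, group by group. The helper name rolling_lists and the new column name 'window' are my own, hypothetical choices, not anything pandas provides:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'sales': [10, 20, 30, 40, 50, 60, 70, 80, 90, 100],
    'tag': ['A', 'A', 'A', 'B', 'B', 'B', 'B', 'C', 'C', 'C']})

amount = 4


def rolling_lists(s, window):
    """Hypothetical helper: for each position, the last `window` values as a list.

    Positions that do not yet have `window` values get NaN,
    mirroring what rolling(window) does for incomplete windows.
    """
    values = s.tolist()
    out = [values[i - window + 1:i + 1] if i + 1 >= window else np.nan
           for i in range(len(values))]
    # dtype=object so that each cell can hold a whole list
    return pd.Series(out, index=s.index, dtype=object)


# Build the windows separately for every group, then stitch the pieces
# back together; the original row index is preserved, so the assignment aligns.
parts = [rolling_lists(group, amount) for _, group in df.groupby('tag')['sales']]
df['window'] = pd.concat(parts)

print(df)
```

With the sample data only group B has 4 rows, so the only row that gets a list is the last row of B (index 6); every other row is NaN. With longer groups each row from the fourth one onward would hold its own window.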