I have a pandas DataFrame like this:
group cat
1 0
2 0
1 0
1 1
2 0
2 1
1 2
1 2
I'm trying to group the data by group, then apply a custom function to the past 5 rows within each group.
The custom function looks like this:
    def unalikeability(data):
        num_observations = data.shape[0]
        counts = data.value_counts()
        return 1 - ((counts / num_observations)**2).sum()
Desired output:
group unalikeability
1 result calculated by the function
1
1
1
2
2
2
2
I can get the past 5 rows using groupby().rolling(), but the rolling object in pandas doesn't have the shape attribute or the value_counts method that a DataFrame has. I tried creating a DataFrame from the rolling object, but this isn't allowed either.
You can apply your function to the rolling object with apply. Depending on whether you want the output to be computed only on full chunks (5 values) or on chunks of any size, set min_periods accordingly.
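The original code and output for this answer did not survive, so the following is only a minimal sketch of the approach described, not the answerer's exact snippet. It assumes the window should include the current row, rebuilds the sample data from the question, and uses `droplevel(0)` to align the result back to the original rows (that alignment step is an assumption). With the default `raw=False`, `rolling(...).apply` passes each window to the function as a Series, so `shape` and `value_counts` are available inside it.

```python
import pandas as pd

# Sample data reconstructed from the question.
df = pd.DataFrame({
    "group": [1, 2, 1, 1, 2, 2, 1, 1],
    "cat":   [0, 0, 0, 1, 0, 1, 2, 2],
})

def unalikeability(data):
    # data is a Series holding one window of values.
    num_observations = data.shape[0]
    counts = data.value_counts()
    return 1 - ((counts / num_observations) ** 2).sum()

# Windows of any size: min_periods=1 computes a value from the first row on.
any_size = (
    df.groupby("group")["cat"]
      .rolling(5, min_periods=1)
      .apply(unalikeability)
      .droplevel(0)  # drop the group level so the index matches df again
)
df["unalikeability"] = any_size

# Full windows only: min_periods=5 yields NaN until 5 rows are available.
full_only = (
    df.groupby("group")["cat"]
      .rolling(5, min_periods=5)
      .apply(unalikeability)
      .droplevel(0)
)

print(df)
print(full_only)
```

With this sample data, only the last row of group 1 has a full window of 5 values (0, 0, 1, 2, 2), giving 1 - 9/25 = 0.64. Group 2 has just 3 rows, so it is all NaN under `min_periods=5`.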