How can I compute and broadcast a count in pandas?
To compute a count:
df.groupby('field').size()
To broadcast an aggregation to the original dataframe:
df.groupby('field')['field_to_aggregate'].transform(aggregation)
The latter works if I specify the field to aggregate onto and aggregations like sum, mean, etc. But I am not finding a way to make it work when I want a simple count of the grouped-by field.
(Note: I could just compute the first and join the grouped-by table back onto the original table, but I want to avoid joins and I'm looking for an efficient solution that uses pandas' transform.)
You could try:
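A minimal sketch of one way to do this, assuming a small made-up dataframe whose column names simply mirror the placeholders in the question. Passing the string 'count' (or 'size') to transform should give a per-group count that is broadcast back to every row, without a join:

```python
import pandas as pd

# Hypothetical sample data; the column names mirror the placeholders in the question.
df = pd.DataFrame({
    "field": ["a", "b", "a", "c", "a", "b"],
    "field_to_aggregate": [1, 2, 3, 4, 5, 6],
})

# 'count' tallies the non-null values in each group and repeats
# that number on every row belonging to the group.
df["count"] = df.groupby("field")["field_to_aggregate"].transform("count")

# 'size' tallies rows per group regardless of nulls; here the selected
# column is the grouping column itself.
df["size"] = df.groupby("field")["field"].transform("size")

print(df)
```

The two results only differ when the selected column contains missing values: 'count' skips them, while 'size' counts every row in the group.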
Note that 'field_to_aggregate' can be the same as 'field'.