How to count size of groups using xarray?


I'd like to count the size of each group after grouping with groupby(), i.e. the number of occurrences of each value. In pandas this can be done with GroupBy.size():

>>> import pandas as pd
>>> pd.DataFrame({'my_column': [1, 1, 1, 2, 2, 3]}).groupby(by='my_column').size()
my_column
1    3
2    2
3    1
dtype: int64

NumPy supports something similar with np.unique():

>>> import numpy as np
>>> np.unique([1, 1, 1, 2, 2, 3], return_counts=True)[1]
array([3, 2, 1])

With xarray I can only find awkward ways to achieve the same, e.g. converting the DataArray object to a pandas DataFrame:

>>> import xarray as xr
>>> d = xr.DataArray([1, 1, 1, 2, 2, 3], name='my_column')
>>> d.to_dataframe().groupby(by='my_column').size()
my_column
1    3
2    2
3    1
dtype: int64

...or resorting to something hard to read, such as:

>>> xr.ones_like(d).groupby(d).sum(dim='dim_0')
<xarray.DataArray 'my_column' (my_column: 3)>
array([3, 2, 1])
Coordinates:
  * my_column  (my_column) int64 1 2 3

Is there a better way to get a reduced DataArray object with correct coordinates and dimensions? Is there a reason for not introducing a DataArrayGroupBy.size() method similar to the one in pandas?

(I was using xarray version 0.15.0 when writing this question.)

Best answer

Use GroupBy.count(), which counts the non-missing (non-NaN) values in each group:

>>> d = xr.DataArray([1, 1, 1, 2, 2, 3], name='my_column')
>>> d.groupby(d).count()
<xarray.DataArray 'my_column' (my_column: 3)>
array([3, 2, 1])
Coordinates:
  * my_column  (my_column) int64 1 2 3
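
One caveat worth checking: because count() skips NaN, it equals the group size only when the counted array has no missing values. The following is a minimal sketch, not a definitive recipe; the array `data` and the names `n_valid` and `n_total` are made up for illustration, and the second computation reuses the ones_like workaround from the question.

```python
import numpy as np
import xarray as xr

# Group labels, as in the question.
d = xr.DataArray([1, 1, 1, 2, 2, 3], name="my_column")

# Counting the label array itself: no NaN present, so count == group size.
sizes = d.groupby(d).count()
print(sizes.values)  # [3 2 1]

# Hypothetical data array with one missing value, grouped by the same labels.
data = xr.DataArray([10.0, np.nan, 30.0, 40.0, 50.0, 60.0], dims="dim_0")

# count() ignores the NaN, so the first group reports 2 instead of 3.
n_valid = data.groupby(d).count()
print(n_valid.values)  # [2 2 1]

# Summing an array of ones gives the group size regardless of NaN in `data`.
n_total = xr.ones_like(d).groupby(d).sum(dim="dim_0")
print(n_total.values)  # [3 2 1]
```

If the grouped values can contain NaN and the true group size is wanted, the ones_like approach (or counting the label array itself, as in `sizes`) is the safer choice.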