How to define custom function for scipy's binned_statistic_2d?

Asked by gammapoint on 19 September 2022 at 17:56

The documentation for scipy's binned_statistic_2d function gives an example for a 2D histogram:

from scipy import stats
x = [0.1, 0.1, 0.1, 0.6]
y = [2.1, 2.6, 2.1, 2.1]
binx = [0.0, 0.5, 1.0]
biny = [2.0, 2.5, 3.0]
ret = stats.binned_statistic_2d(x, y, None, 'count', bins=[binx, biny])

Makes sense, but I'm now trying to implement a custom function. The custom function description is given as:

function : a user-defined function which takes a 1D array of values, and outputs a single numerical statistic. This function will be called on the values in each bin. Empty bins will be represented by function([]), or NaN if this returns an error.

I wasn't sure exactly how to implement this, so I thought I'd check my understanding by writing a custom function that reproduces the count option. I tried

def custom_func(values):
    return len(values)
x = [0.1, 0.1, 0.1, 0.6]
y = [2.1, 2.6, 2.1, 2.1]
binx = [0.0, 0.5, 1.0]
biny = [2.0, 2.5, 3.0]
ret = stats.binned_statistic_2d(x, y, None, custom_func, bins=[binx, biny])

but this generates an error like so:

556 # Make sure `values` match `sample`
557 if(statistic != 'count' and Vlen != Dlen):
558     raise AttributeError('The number of `values` elements must match the '
559                          'length of each `sample` dimension.')
561 try:
562     M = len(bins)

AttributeError: The number of `values` elements must match the length of each `sample` dimension.

How is this custom function supposed to be defined?

Answered by AlexK on 20 September 2022 at 01:22 (best answer)

The error occurs because any statistic other than 'count', including a custom function, requires an array (or a list of arrays) for the values parameter, with the same number of elements as x. You can't leave it as None as in your example. None works only with 'count', because counting the data points in each bin never looks at values.

So, to match the results, you can just pass the same x object to the values parameter:

from scipy import stats

def custom_func(values):
    # Called once per non-empty bin with the values that fall in that bin
    return len(values)

x = [0.1, 0.1, 0.1, 0.6]
y = [2.1, 2.6, 2.1, 2.1]
binx = [0.0, 0.5, 1.0]
biny = [2.0, 2.5, 3.0]

ret = stats.binned_statistic_2d(x, y, x, custom_func, bins=[binx, biny])

print(ret)
# BinnedStatistic2dResult(statistic=array([[2., 1.],
#        [1., 0.]]), x_edge=array([0. , 0.5, 1. ]), y_edge=array([2. , 2.5, 3. ]), binnumber=array([5, 6, 5, 9]))

The result matches that of the count statistic:

ret = stats.binned_statistic_2d(x, y, None, 'count', bins=[binx, biny])

print(ret)
# BinnedStatistic2dResult(statistic=array([[2., 1.],
#        [1., 0.]]), x_edge=array([0. , 0.5, 1. ]), y_edge=array([2. , 2.5, 3. ]), binnumber=array([5, 6, 5, 9]))
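The count example only uses the length of each bin's values, so here is a minimal sketch of a custom statistic that depends on the values themselves. The `weights` list, `value_range`, and `safe_value_range` are hypothetical names and data made up for illustration. As the documentation quoted in the question says, empty bins are filled with function([]), or NaN if that call raises an error, so the first function should leave NaN in the empty bin while the second returns an explicit 0.0.

```python
import numpy as np
from scipy import stats

x = [0.1, 0.1, 0.1, 0.6]
y = [2.1, 2.6, 2.1, 2.1]
binx = [0.0, 0.5, 1.0]
biny = [2.0, 2.5, 3.0]

# Hypothetical per-point measurements, one for each (x, y) pair.
weights = [1.0, 2.0, 5.0, 4.0]

def value_range(values):
    # Spread (max - min) of the values in one bin.
    # np.max raises ValueError on an empty input, so empty bins become NaN.
    return np.max(values) - np.min(values)

def safe_value_range(values):
    # Same statistic, but with an explicit result for empty bins.
    if len(values) == 0:
        return 0.0
    return np.max(values) - np.min(values)

ret = stats.binned_statistic_2d(x, y, weights, value_range, bins=[binx, biny])
ret_safe = stats.binned_statistic_2d(x, y, weights, safe_value_range, bins=[binx, biny])

print(ret.statistic)
print(ret_safe.statistic)
```

The bin covering x in [0.0, 0.5) and y in [2.0, 2.5) holds the weights 1.0 and 5.0, so its range is 4.0. The bin at x in [0.5, 1.0], y in [2.5, 3.0] holds no points, so the two results differ only there.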
