The documentation for scipy's binned_statistic_2d function gives an example for a 2D histogram:
from scipy import stats
x = [0.1, 0.1, 0.1, 0.6]
y = [2.1, 2.6, 2.1, 2.1]
binx = [0.0, 0.5, 1.0]
biny = [2.0, 2.5, 3.0]
ret = stats.binned_statistic_2d(x, y, None, 'count', bins=[binx, biny])
Makes sense, but I'm now trying to implement a custom function. The custom function description is given as:
function : a user-defined function which takes a 1D array of values, and outputs a single numerical statistic. This function will be called on the values in each bin. Empty bins will be represented by function([]), or NaN if this returns an error.
I wasn't sure exactly how to implement this, so I thought I'd check my understanding by writing a custom function that reproduces the count option. I tried
def custom_func(values):
return len(values)
x = [0.1, 0.1, 0.1, 0.6]
y = [2.1, 2.6, 2.1, 2.1]
binx = [0.0, 0.5, 1.0]
biny = [2.0, 2.5, 3.0]
ret = stats.binned_statistic_2d(x, y, None, custom_func, bins=[binx, biny])
but this generates an error like so:
556 # Make sure `values` match `sample`
557 if(statistic != 'count' and Vlen != Dlen):
558 raise AttributeError('The number of `values` elements must match the '
559 'length of each `sample` dimension.')
561 try:
562 M = len(bins)
AttributeError: The number of `values` elements must match the length of each `sample` dimension.
How is this custom function supposed to be defined?
The reason for this error is that when using a custom statistic function (or any non-
countstatistic), you have to pass some array or list of arrays to thevaluesparameter (with the number of elements matching the number inx). You can't just leave it asNoneas in your example, even though it is irrelevant and does not get used when computing counts of data points in each bin.So, to match the results, you can just pass the same
xobject to thevaluesparameter:The result matches that of the
countstatistic: