How to get Apache Commons Math SummaryStatistics to use a better minimum?


org.apache.commons.math4.stat.descriptive.SummaryStatistics

SummaryStatistics appears to use a naive implementation of min(). It uses the default constructor of the internal container, which defaults to a zero value. If every value in your data set is greater than zero, the statistics will never represent the true minimum.

I'm hoping there is a way to initialize it with a known value to avoid this, but I am not seeing that. Is there a way around this without using my own implementation for statistics?

thanks


SummaryStatistics uses the Min univariate statistic object to compute minimums.

Based on the implementation for the 3.6.1 release, Min is initialized to Double.NaN.

When a value is added to SummaryStatistics, Min checks whether the new value is less than the current minimum or whether the current minimum is Double.NaN. If either condition is true, the new value becomes the minimum.
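
That update rule can be sketched in a few lines. This is a minimal stand-in written for illustration, not the library's source; the class name RunningMin and its method names are hypothetical.

```java
// Hypothetical stand-in for the update rule described in the answer;
// this is not the Commons Math source code.
final class RunningMin {
    // Start at NaN so "no data yet" cannot be mistaken for a real value such as 0.
    private double value = Double.NaN;

    void increment(double d) {
        // Take d if it is smaller than the current minimum,
        // or if no value has been recorded yet (current minimum is NaN).
        if (d < value || Double.isNaN(value)) {
            value = d;
        }
    }

    double getResult() {
        return value;
    }
}
```

Any comparison against NaN evaluates to false, so d < value alone would never replace the initial NaN. The explicit Double.isNaN check is what lets the first added value become the minimum.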

In short, SummaryStatistics correctly computes the minimum even when all added values are positive.

For example:

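// Import for the 3.6.1 release; place the statements below inside a method.
import org.apache.commons.math3.stat.descriptive.SummaryStatistics;

SummaryStatistics summary = new SummaryStatistics();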

System.out.println("Initial Minimum (should be NaN):     " + summary.getMin());
summary.addValue(10.0);
System.out.println("First Value Minimum (should be 10):  " + summary.getMin());
summary.addValue(5.0);
System.out.println("Smaller Value Minimum (should be 5): " + summary.getMin());
summary.addValue(20.0);
System.out.println("Larger Value Minimum (should be 5):  " + summary.getMin());

generates the following output:

Initial Minimum (should be NaN):     NaN
First Value Minimum (should be 10):  10.0
Smaller Value Minimum (should be 5): 5.0
Larger Value Minimum (should be 5):  5.0