Initial request for review of JDK-8006572 DoubleStream.sum()/average() implementations that reduce numerical errors
Brian Goetz
brian.goetz at oracle.com
Thu Nov 14 16:24:09 UTC 2013
You are right that collect() would be a good start, but sadly there is no DoubleCollector which trucks in doubles. So the concept is right but it doesn't quite get you all the way there. Looking one level down, you'll find a nicer answer. I would look at ReduceOps.makeDouble(), which ultimately implements the double reduce. It should be obvious how to clone that to use Kahan summing in both the accept and combine methods. Then make sum() perform
return evaluate(ReduceOps.makeDoubleKahanSummer());
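For illustration only, here is a minimal sketch of what "Kahan summing in both the accept and combine methods" could look like. It is not the ReduceOps change itself (ReduceOps is package-private inside java.util.stream); instead it assumes the public three-argument DoubleStream.collect() with a double[2] as the mutable state. The names KahanSumSketch, accept, combine and kahanSum are hypothetical and appear neither in the webrev nor in the JDK.

```java
import java.util.stream.DoubleStream;

// Hypothetical sketch; not the code under review.
final class KahanSumSketch {

    // state[0] = running sum, state[1] = running compensation,
    // i.e. the rounding error that the last addition put into state[0].
    static void accept(double[] state, double value) {
        double y = value - state[1];      // correct the incoming value first
        double t = state[0] + y;          // low-order bits of y may be lost here
        state[1] = (t - state[0]) - y;    // recover what was lost (as an excess)
        state[0] = t;
    }

    // Merge the partial result 'right' into 'left' (used for parallel streams).
    // The true value of 'right' is right[0] - right[1], so the pending
    // compensation is folded in with the opposite sign.
    static void combine(double[] left, double[] right) {
        accept(left, right[0]);
        accept(left, -right[1]);
    }

    // Assumption: a plain double[] container through the public collect() API.
    static double kahanSum(DoubleStream values) {
        double[] state = values.collect(() -> new double[2],
                                        KahanSumSketch::accept,
                                        KahanSumSketch::combine);
        return state[0] - state[1];       // apply the last pending compensation
    }
}
```

As a rough check of the idea: a naive left-to-right sum of 1.0 followed by ten values of 1e-16 stays at exactly 1.0, because each addend is below half an ulp of 1.0, whereas KahanSumSketch.kahanSum() carries the lost bits forward and returns a value above 1.0.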
On Nov 14, 2013, at 9:08 AM, Joe Darcy wrote:
> Hello,
>
> Please take an initial look over a fix for
>
> JDK-8006572 DoubleStream.sum() & DoubleSummaryStats implementations that reduce numerical errors
> http://cr.openjdk.java.net/~darcy/8006572.0/
>
> The basic approach is to use compensated summation
>
> http://en.wikipedia.org/wiki/Kahan_summation_algorithm
>
> to compute streams-related sum and average statistics in the various locations where this can be done.
>
> All existing streams tests pass and the newly written test passes too.
>
> I believe the DoubleSummaryStatistics.java portion, including the test, is fully review-worthy. In the test, for the sample computation in question, the naive summation implementation had an error of 500,000 ulps compared to 2 ulps with the new implementation.
>
> Two other locations I've found where this summation technique should be used are in
>
> java.util.stream.Collectors.{summingDouble, averagingDouble}
>
> and
>
> java.util.stream.DoublePipeline.{sum, average}
>
> DoublePipeline is the primary implementation class of DoubleStream.
>
> For Collectors, the proposed code is a fairly clear adaptation of how the current code passes state around; there is not currently a dedicated test for the new summation technique in this location.
>
> I'm new to the streams API so for DoublePipeline I don't know the idiomatic way to phrase the collect I want to perform over the code. (Based on my current understanding, I believe I want to perform a collect rather than a reduce since for the compensated summation I need to maintain some additional state.) Guidance here welcome.
>
> Thanks,
>
> -Joe