Initial request for review of JDK-8006572 DoubleStream.sum()/average() implementations that reduce numerical errors
Joe Darcy
joe.darcy at oracle.com
Thu Nov 14 08:08:19 UTC 2013
Hello,
Please take an initial look over a fix for
JDK-8006572 DoubleStream.sum() & DoubleSummaryStats implementations
that reduce numerical errors
http://cr.openjdk.java.net/~darcy/8006572.0/
The basic approach is to use compensated summation
http://en.wikipedia.org/wiki/Kahan_summation_algorithm
to compute streams-related sum and average statistics in the various
locations where this can be done.
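For readers unfamiliar with the technique, here is a minimal sketch of compensated (Kahan) summation next to a naive loop. It is an illustration only, not the code in the webrev: the class and method names are hypothetical, and the input (1.0 followed by many addends of half an ulp of 1.0) is chosen just to make the lost low-order bits visible.

```java
import java.util.Arrays;

final class CompensatedSumSketch {
    // Naive left-to-right summation, for comparison.
    public static double naiveSum(double[] values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum;
    }

    // Kahan summation: c tracks the rounding error of the running sum
    // so that it can be fed back into the next addition.
    public static double compensatedSum(double[] values) {
        double sum = 0.0;
        double c = 0.0;
        for (double v : values) {
            double y = v - c;       // apply the correction to the next addend
            double t = sum + y;     // low-order bits of y may be lost here
            c = (t - sum) - y;      // recover what was actually lost
            sum = t;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Illustrative input (an assumption, not taken from the webrev).
        double[] values = new double[1_000_001];
        values[0] = 1.0;
        Arrays.fill(values, 1, values.length, Math.ulp(1.0) / 2.0);

        System.out.println("naive:       " + naiveSum(values));
        System.out.println("compensated: " + compensatedSum(values));
    }
}
```

With this input, each half-ulp addend is rounded away by the naive loop, so its result stays at 1.0, while the compensated loop carries the lost bits forward in `c`.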
All existing streams tests pass, and the newly written test passes too.
I believe the DoubleSummaryStatistics.java portion, including the test,
is fully review-worthy. In the test, for the sample computation in
question, the naive summation implementation had an error of 500,000 ulps
compared to 2 ulps with the new implementation.
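As a rough illustration of how an error expressed in ulps can be measured, the sketch below divides the absolute error by the ulp of the expected value. The helper name and the sample values are hypothetical and are not taken from the test in the webrev.

```java
final class UlpErrorSketch {
    // Absolute error of computed relative to expected, in units of
    // the last place (ulp) of the expected value.
    public static double errorInUlps(double computed, double expected) {
        return Math.abs(computed - expected) / Math.ulp(expected);
    }

    public static void main(String[] args) {
        // Illustrative values: a result stuck at 1.0 when the expected
        // sum lies 500,000 ulps above it.
        double expected = 1.0 + 500_000 * Math.ulp(1.0);
        System.out.println(errorInUlps(1.0, expected) + " ulps");
    }
}
```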
Two other locations I've found where this summation technique should be
used are in
java.util.stream.Collectors.{summingDouble, averagingDouble}
and
java.util.stream.DoublePipeline.{sum, average}
DoublePipeline is the primary implementation class of DoubleStream.
For Collectors, the proposed code is a fairly clear adaptation of how
the current code passes state around; there is not currently a dedicated
test for the new summation technique in this location.
I'm new to the streams API so for DoublePipeline I don't know the
idiomatic way to phrase the collect I want to perform in the code.
(Based on my current understanding, I believe I want to perform a
collect rather than a reduce since for the compensated summation I need
to maintain some additional state.) Guidance here is welcome.
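One possible shape for such a collect, offered as a sketch rather than as the code in the webrev, is to carry the running sum and the compensation term in a two-element double[] and use the three-argument DoubleStream.collect. The class name, the helper names, and the array layout below are assumptions made for illustration.

```java
import java.util.stream.DoubleStream;

final class CollectSumSketch {
    // state[0] holds the running sum, state[1] the Kahan compensation term.
    public static void add(double[] state, double value) {
        double y = value - state[1];
        double t = state[0] + y;
        state[1] = (t - state[0]) - y;
        state[0] = t;
    }

    public static double compensatedSum(DoubleStream stream) {
        double[] state = stream.collect(
                () -> new double[2],
                CollectSumSketch::add,
                (left, right) -> {
                    // Merge a partial result: fold in the other sum, then
                    // its pending correction (the compensation term is
                    // subtracted, so it is added here with a negated sign).
                    add(left, right[0]);
                    add(left, -right[1]);
                });
        // Apply whatever correction is still pending.
        return state[0] - state[1];
    }

    public static void main(String[] args) {
        double half = Math.ulp(1.0) / 2.0;
        DoubleStream input = DoubleStream.concat(
                DoubleStream.of(1.0),
                DoubleStream.generate(() -> half).limit(1_000_000));
        System.out.println(compensatedSum(input));
    }
}
```

A mutable container passed to collect fits here because a plain reduce over double values has no place to keep the compensation term alongside the sum.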
Thanks,
-Joe