Initial request for review of JDK-8006572 DoubleStream.sum()/average() implementations that reduce numerical errors
Joe Darcy
joe.darcy at oracle.com
Sat Nov 23 07:06:03 UTC 2013
Hello Mike and Paul,
Thanks for reviewing the earlier versions of the change; I've posted an
update revision for your consideration at
http://cr.openjdk.java.net/~darcy/8006572.4/
Two numerical aspects of the change are different in this iteration:
* when two running compensated sums and combined, the compensation
portion of the second is now added in first. Since it should have
smaller magnitude, this is favorable in terms of error analysis
* the final sum returned (or used to compute the average) is (sum +
sumCompensation). I checked one of my references, "Accuracy and
Stability of Numerical Algorithms" from Nicholas Higham, and that tweak
gave a better error bound
The test code has been updated as you've suggested (other than porting
to TestNG ;-) and I also added a test case for a bug Paul spotted on an
earlier review.
All the streams tests pass with the new code.
Thanks,
-Joe
On 11/20/2013 11:37 AM, Joe Darcy wrote:
> Hi Mike,
>
> On 11/20/2013 10:31 AM, Mike Duigou wrote:
>> - Collectors averagingDouble could use the same @implNote as in
>> DoublePipeline.
>>
>> - DoublePipeline implementation could document the usage of the
>> double array indices.
>>
>> - @summary in test.
>
> I'll reflect these in the next iteration.
>
>>
>> - I agree with Paul that refactoring as a testNG test would be nice.
>
> While learning about testNG would be useful, I don't want to take that
> on right at the moment ;-)
>
>>
>> - I wondered at the mechanism for combining compensation values. Do
>> you have a reference which explains the correctness? I thought
>> perhaps directly adding the compensations together before doing
>> compensated addition of the two sums might be better.
>
> One of the inquiries I have out to my numerical reviewer is how to
> best combine two compensations.
>
> In the code as written, from the perspective of one compensation, the
> two doubles in the other other compensation are just two more double
> values. So I think this is correct, if not ideal.
>
> The compensation portion of one running sum could be of vastly
> different magnitude than the compensation portion (or the high-order
> sum bits) of the other sum. Therefore, I believe a direct combination
> of either (sum1 + sum2) or (comp1 + comp2) would lose the numerical
> properties of interest here.
>
> Thanks for the feedback,
>
> -Joe
>
>>
>> Mike
>>
>>
>> On Nov 20 2013, at 00:21 , Joe Darcy <joe.darcy at oracle.com> wrote:
>>
>>> Hi Paul,
>>>
>>> On 11/18/2013 07:38 AM, Paul Sandoz wrote:
>>>> Hi Joe,
>>>>
>>>> You can use the three arg form of collect for
>>>> DoublePipeline.sum/average impls, which is already used for average:
>>>>
>>>> public final OptionalDouble average() {
>>>> double[] avg = collect(() -> new double[2],
>>>> (ll, i) -> {
>>>> ll[0]++;
>>>> ll[1] += i;
>>>> },
>>>> (ll, rr) -> {
>>>> ll[0] += rr[0];
>>>> ll[1] += rr[1];
>>>> });
>>>> return avg[0] > 0
>>>> ? OptionalDouble.of(avg[1] / avg[0])
>>>> : OptionalDouble.empty();
>>>> }
>>>>
>>>> That would be the most expedient way.
>>> Thanks for the tip. For the moment, I'm feeling a bit expedient and
>>> used the array-based approach in an iteration of the change:
>>>
>>> http://cr.openjdk.java.net/~darcy/8006572.3/
>>
>
More information about the core-libs-dev
mailing list