Initial request for review of JDK-8006572 DoubleStream.sum()/average() implementations that reduce numerical errors

Joe Darcy joe.darcy at oracle.com
Mon Dec 2 07:34:09 UTC 2013


Hello,

I've heard back from my numerical reviewer. Given the signs of the 
terms, he recommend when two partial sums are combined, the leading word 
be added in first. I've reflected this change in the revision I'm about 
to push:

     http://cr.openjdk.java.net/~darcy/8006572.5

He had some suggestions for further numerical improvements, which I've 
captured in

         JDK-8029372 Improve combination of partial sums/averages in 
streams API
         https://bugs.openjdk.java.net/browse/JDK-8029372

Thanks,

-Joe

On 11/25/2013 09:57 AM, Mike Duigou wrote:
> Looks good
>
> Mike
>
> On Nov 22 2013, at 23:06 , Joe Darcy <joe.darcy at oracle.com> wrote:
>
>> Hello Mike and Paul,
>>
>> Thanks for reviewing the earlier versions of the change; I've posted an update revision for your consideration at
>>
>>     http://cr.openjdk.java.net/~darcy/8006572.4/
>>
>> Two numerical aspects of the change are different in this iteration:
>>
>>     * when two running compensated sums and combined, the compensation portion of the second is now added in first. Since it should have smaller magnitude, this is favorable in terms of error analysis
>>
>>     * the final sum returned (or used to compute the average) is (sum + sumCompensation). I checked one of my references, "Accuracy and Stability of Numerical Algorithms" from Nicholas Higham, and that tweak gave a better error bound
>>
>> The test code has been updated as you've suggested (other than porting to TestNG ;-) and I also added a test case for a bug Paul spotted on an earlier review.
>>
>> All the streams tests pass with the new code.
>>
>> Thanks,
>>
>> -Joe
>>
>> On 11/20/2013 11:37 AM, Joe Darcy wrote:
>>> Hi Mike,
>>>
>>> On 11/20/2013 10:31 AM, Mike Duigou wrote:
>>>> - Collectors averagingDouble could use the same @implNote as in DoublePipeline.
>>>>
>>>> - DoublePipeline implementation could document the usage of the double array indices.
>>>>
>>>> - @summary in test.
>>> I'll reflect these in the next iteration.
>>>
>>>> - I agree with Paul that refactoring as a testNG test would be nice.
>>> While learning about testNG would be useful, I don't want to take that on right at the moment ;-)
>>>
>>>> - I wondered at the mechanism for combining compensation values. Do you have a reference which explains the correctness? I thought perhaps directly adding the compensations together before doing compensated addition of the two sums might be better.
>>> One of the inquiries I have out to my numerical reviewer is how to best combine two compensations.
>>>
>>> In the code as written, from the perspective of one compensation, the two doubles in the other other compensation are just two more double values. So I think this is correct, if not ideal.
>>>
>>> The compensation portion of one running sum could be of vastly different magnitude than the compensation portion (or the high-order sum bits) of the other sum. Therefore, I believe a direct combination of either (sum1 + sum2) or (comp1 + comp2) would lose the numerical properties of interest here.
>>>
>>> Thanks for the feedback,
>>>
>>> -Joe
>>>
>>>> Mike
>>>>
>>>>
>>>> On Nov 20 2013, at 00:21 , Joe Darcy <joe.darcy at oracle.com> wrote:
>>>>
>>>>> Hi Paul,
>>>>>
>>>>> On 11/18/2013 07:38 AM, Paul Sandoz wrote:
>>>>>> Hi Joe,
>>>>>>
>>>>>> You can use the three arg form of collect for DoublePipeline.sum/average impls, which is already used for average:
>>>>>>
>>>>>>      public final OptionalDouble average() {
>>>>>>          double[] avg = collect(() -> new double[2],
>>>>>>                                 (ll, i) -> {
>>>>>>                                     ll[0]++;
>>>>>>                                     ll[1] += i;
>>>>>>                                 },
>>>>>>                                 (ll, rr) -> {
>>>>>>                                     ll[0] += rr[0];
>>>>>>                                     ll[1] += rr[1];
>>>>>>                                 });
>>>>>>          return avg[0] > 0
>>>>>>                 ? OptionalDouble.of(avg[1] / avg[0])
>>>>>>                 : OptionalDouble.empty();
>>>>>>      }
>>>>>>
>>>>>> That would be the most expedient way.
>>>>> Thanks for the tip. For the moment, I'm feeling a bit expedient and used the array-based approach in an iteration of the change:
>>>>>
>>>>>     http://cr.openjdk.java.net/~darcy/8006572.3/




More information about the core-libs-dev mailing list