JDK 8 code review request for JDK-8024354: Explicitly permit DoubleStream.sum()/average() implementations to use higher precision summation

Wed Oct 9 23:47:25 UTC 2013

Big improvement. This looks like the right direction. DoubleStream.average() needs the same treatment for completeness.

Mike

On Oct 9 2013, at 16:38, Joe Darcy wrote:

> On 10/08/2013 08:08 PM, Mike Duigou wrote:
>> This seems to contradict the main documentation for these methods. Perhaps instead we should move the "The average returned can vary depending upon the order in which values are recorded. This is due to accumulated rounding error in addition of values of differing magnitudes. Values sorted by increasing absolute magnitude tend to yield more accurate results." into an @implNote?
>> 
>> I also suspect that having this documentation only in DoubleSummaryStatistics may be too hidden away. Perhaps similar docs on DoubleStream sum() and average() methods as well?
>> 
>> 
> 
> Hello,
> 
> Please review the second take on this change which is expanded to include DoubleStream.sum:
> 
>    http://cr.openjdk.java.net/~darcy/8024354.1/
> 
> Patch below,
> 
> Thanks,
> 
> -Joe
> 
> --- old/src/share/classes/java/util/DoubleSummaryStatistics.java 2013-10-09 16:37:23.000000000 -0700
> +++ new/src/share/classes/java/util/DoubleSummaryStatistics.java 2013-10-09 16:37:23.000000000 -0700
> @@ -111,12 +111,24 @@
> 
>     /**
>      * Returns the sum of values recorded, or zero if no values have been
> -     * recorded. The sum returned can vary depending upon the order in which
> -     * values are recorded. This is due to accumulated rounding error in
> -     * addition of values of differing magnitudes. Values sorted by increasing
> -     * absolute magnitude tend to yield more accurate results.  If any recorded
> -     * value is a {@code NaN} or the sum is at any point a {@code NaN} then the
> -     * sum will be {@code NaN}.
> +     * recorded.
> +     *
> +     * If any recorded value is a NaN or the sum is at any point a NaN
> +     * then the sum will be NaN.
> +     *
> +     * @apiNote The value of a floating-point sum is a function both
> +     * of the input values as well as the order of addition
> +     * operations. The order of addition operations of this method is
> +     * intentionally not defined to allow for implementation flexibility
> +     * to improve the speed and accuracy of the computed result.
> +     *
> +     * In particular, this method may be implemented using compensated
> +     * summation or other technique to reduce the error bound in the
> +     * numerical sum compared to a simple summation of {@code double}
> +     * values.
> +     *
> +     * Sorting values by increasing absolute magnitude tends to yield
> +     * more accurate results.
>      *
>      * @return the sum of values, or zero if none
>      */
> @@ -153,13 +165,21 @@
>     }
> 
>     /**
> -     * Returns the arithmetic mean of values recorded, or zero if no values have been
> -     * recorded. The average returned can vary depending upon the order in
> -     * which values are recorded. This is due to accumulated rounding error in
> -     * addition of values of differing magnitudes. Values sorted by increasing
> -     * absolute magnitude tend to yield more accurate results. If any recorded
> -     * value is a {@code NaN} or the sum is at any point a {@code NaN} then the
> -     * average will be {@code NaN}.
> +     * Returns the arithmetic mean of values recorded, or zero if no
> +     * values have been recorded.
> +     *
> +     * If any recorded value is a NaN or the sum is at any point a NaN
> +     * then the average will be NaN.
> +     *
> +     * @apiNote The average returned can vary depending upon the order in
> +     * which values are recorded.
> +     *
> +     * This method may be implemented using compensated summation or
> +     * other technique to reduce the error bound in the numerical sum
> +     * used to compute the average.
> +     *
> +     * Values sorted by increasing absolute magnitude tend to yield
> +     * more accurate results.
>      *
>      * @return the arithmetic mean of values, or zero if none
>      */
> --- old/src/share/classes/java/util/stream/DoubleStream.java 2013-10-09 16:37:23.000000000 -0700
> +++ new/src/share/classes/java/util/stream/DoubleStream.java 2013-10-09 16:37:23.000000000 -0700
> @@ -502,22 +502,42 @@
>                   BiConsumer<R, R> combiner);
> 
>     /**
> -     * Returns the sum of elements in this stream.  The sum returned can vary
> -     * depending upon the order in which elements are encountered. This is due
> -     * to accumulated rounding error in addition of values of differing
> -     * magnitudes. Elements sorted by increasing absolute magnitude tend to
> -     * yield more accurate results.  If any stream element is a {@code NaN} or
> -     * the sum is at any point a {@code NaN} then the sum will be {@code NaN}.
> -     * This is a special case of a
> -     * <a href="package-summary.html#Reduction">reduction</a> and is
> +     * Returns the sum of elements in this stream.
> +     *
> +     * Summation is a special case of a <a
> +     * href="package-summary.html#Reduction">reduction</a>. If
> +     * floating-point summation were exact, this method would be
>      * equivalent to:
> +     *
>      * <pre>{@code
>      *     return reduce(0, Double::sum);
>      * }</pre>
>      *
> +     * However, since floating-point summation is not exact, the above
> +     * code is not necessarily equivalent to the summation computation
> +     * done by this method.
> +     *
>      * <p>This is a <a href="package-summary.html#StreamOps">terminal
>      * operation</a>.
>      *
> +     * <p>If any stream element is a NaN or the sum is at any point a NaN
> +     * then the sum will be NaN.
> +     *
> +     * @apiNote The value of a floating-point sum is a function both
> +     * of the input values as well as the order of addition
> +     * operations. The order of addition operations of this method is
> +     * intentionally not defined to allow for implementation
> +     * flexibility to improve the speed and accuracy of the computed
> +     * result.
> +     *
> +     * In particular, this method may be implemented using compensated
> +     * summation or other technique to reduce the error bound in the
> +     * numerical sum compared to a simple summation of {@code double}
> +     * values.
> +     *
> +     * Sorting values by increasing absolute magnitude tends to yield
> +     * more accurate results.
> +     *
>      * @return the sum of elements in this stream
>      */
>     double sum();
>
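
For readers unfamiliar with the "compensated summation" that the new @apiNote text permits, here is a minimal sketch of one such technique, Kahan summation, next to a simple left-to-right loop. It is an illustration only and is not taken from the patch or from any JDK implementation; the class name CompensatedSumSketch, its methods, and the sample data are all hypothetical.

```java
import java.util.Arrays;

// Hypothetical illustration; not the JDK's implementation of sum().
final class CompensatedSumSketch {

    // Simple summation: add the values left to right in array order.
    static double naiveSum(double[] values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum;
    }

    // Kahan compensated summation: carry the low-order bits lost by each
    // addition in a separate term and feed them back into the next step.
    static double kahanSum(double[] values) {
        double sum = 0.0;
        double compensation = 0.0;
        for (double v : values) {
            double y = v - compensation;   // apply the correction from the previous step
            double t = sum + y;            // low-order bits of y may be lost here
            compensation = (t - sum) - y;  // recover (the negation of) what was lost
            sum = t;
        }
        return sum;
    }

    // Assumed sample data: one large value followed by ten tiny ones.
    static double[] sampleValues() {
        double[] values = new double[11];
        Arrays.fill(values, 1e-16);
        values[0] = 1.0;
        return values;
    }

    public static void main(String[] args) {
        double[] values = sampleValues();
        System.out.println("naive, large value first: " + naiveSum(values));
        System.out.println("Kahan, large value first: " + kahanSum(values));

        // Same values sorted by increasing magnitude, summed with the simple loop.
        double[] ascending = values.clone();
        Arrays.sort(ascending);
        System.out.println("naive, ascending order:   " + naiveSum(ascending));
    }
}
```

With the large value first, the simple loop returns exactly 1.0, because each 1e-16 is smaller than half an ulp of 1.0 and is rounded away. The compensated loop and the ascending-order simple loop both keep the small contributions. This is the order dependence and the tighter error bound that the proposed wording describes.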