[PATCH] 4511638: Double.toString(double) sometimes produces incorrect results

Thu Sep 27 14:40:04 UTC 2018

Hi Raffaello,

I am the author of a recent publication on double to string conversion [1]
- the Ryu algorithm. I've been aware of the problems with the Jdk for
several years, and am very much looking forward to improvements in
correctness and performance in this area.

I have done some testing against my Java implementation of the Ryu
algorithm described in the linked paper. Interestingly, I've found a few
cases where they output different results. In particular:
1.0E-323 is printed as 9.9E-324
1.0E-322 is printed as 9.9E-323

It's likely that there are more such cases - I only ran a sample of
double-precision numbers. Arguably, 9.9 is the correctly rounded 2-digit
output and Ryu is incorrect here. That's what you get when you have a
special case for Java without a correctness proof. :-(

In terms of performance, this algorithm performs almost exactly the same as
my Java implementation of Ryu, although I'd like to point out that my C
implementation of Ryu is quite a bit faster (though note that it generates
different output, in particular, it only outputs a single digit of
precision in the above cases, rather than two), and I didn't backport all
the performance improvements from the Java version, yet. It looks like this
is not coincidence - as far as I can see so far, it's algorithmically very
similar, although it manages to avoid the loop I'm using in Ryu to find the
shortest representation.

I have a few comments:

     *     <li> It rounds to {@code v} according to the usual
round-to-closest
     *     rule of IEEE 754 floating-point arithmetic.
- Since you're spelling out the rounding rules just below, this is
duplicated, and by itself, it's unclear since it doesn't specify the
specific sub-type (round half even).

- Naming: I'd strongly suggest to use variable names that relate to what's
stored, e.g., m for mantissa, e for exponent, etc.

- What's not clear to me is how the algorithm determines how many digits to
print.

- Also, it might be nicer to move the long multiplications to a helper
method - at least from a short look, it looks like the computations of vn,
vnl, and vnr are identical.

- I looked through the spec, and it looks like all cases are well-defined.
Yay!

I will need some more time to do a more thorough review of the code and
more testing for differences. Unfortunately, I'm also traveling the next
two weeks, so this might take a bit of time.

I'm not a contributor to the Jdk, and this isn't my full-time job. I was
lurking here because I was going to send a patch for the double to string
conversion code myself (based on Ryu).

Thanks,

-- Ulf

[1] https://dl.acm.org/citation.cfm?id=3192369
[2] https://github.com/google/double-conversion
[3] https://en.wikipedia.org/wiki/Rounding

On Thu, Sep 27, 2018 at 11:03 AM Andrew Haley <aph at redhat.com> wrote:

> On 09/26/2018 06:39 PM, raffaello.giulietti at gmail.com wrote:
>
> > The submitted code contains both the changes to the current
> > implementation and extensive jtreg tests.
> >
> > While I've struggled to keep the code within the 80 chars/line limit,
> > mercurial still generates longer lines. Thus, to avoid possible problems
> > with the email systems, the code is submitted both inline and as an
> > attachment. Hope at least one copy makes its way without errors.
>
> Overall, the commenting is much too light. There are many places
> where I think I know what you're doing but you don't explain it.
>
> Here, for example:
>
> > +
> > +        // pow5 = pow51*2^63 + pow50
> > +        long pow51 = ceilPow5dHigh(-k);
> > +        long pow50 = ceilPow5dLow(-k);
> > +
> > +        // p = p2*2^126 + p1*2^63 + p0 and p = pow5 * cb
> > +        long x0 = pow50 * cb;
> > +        long x1 = multiplyHigh(pow50, cb);
> > +        long y0 = pow51 * cb;
> > +        long y1 = multiplyHigh(pow51, cb);
> > +        long z = (x1 << 1 | x0 >>> 63) + (y0 & MASK_63);
> > +        long p0 = x0 & MASK_63;
> > +        long p1 = z & MASK_63;
> > +        long p2 = (y1 << 1 | y0 >>> 63) + (z >>> 63);
> > +        long vn = p2 << 1 + ord2alpha | p1 >>> 62 - ord2alpha;
> > +        if ((p1 & mask) != 0 || p0 >= threshold) {
> > +            vn |= 1;
> > +        }
>
> ... etc. I think I can figure out what you're doing, but you could
> explain it.
>
> If you write the comments now while the code is still fresh in your
> mind it'll be easy.
>
> > +    private static final long[] ceilPow5d = {
> > +        /* -292 */ 0x7FBB_D8FE_5F5E_6E27L, 0x497A_3A27_04EE_C3DFL,
> > +        /* -291 */ 0x4FD5_679E_FB9B_04D8L, 0x5DEC_6458_6315_3A6CL,
> > +        /* -290 */ 0x63CA_C186_BA81_C60EL, 0x7567_7D6E_7BDA_8906L,
> > +        /* -289 */ 0x7CBD_71E8_6922_3792L, 0x52C1_5CCA_1AD1_2B48L,
>
> What exactly is this table anyway?  How was it generated?  Please say.
>
> There are many more places in the code. What you've done is nice, but
> it could be exemplary.
>
> --
> Andrew Haley
> Java Platform Lead Engineer
> Red Hat UK Ltd. <https://www.redhat.com>
> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
>