Re: [PATCH] 4511638: Double.toString(double) sometimes produces incorrect results

4 Oct 2018

      On 2018-10-04 15:28, Ulf Adams wrote:
...
On Thu, Sep 27, 2018 at 5:37 PM Raffaello Giulietti
<raffaello.giulietti@gmail.com <mailto:raffaello.giulietti@gmail.com>>
wrote:
Hi Ulf,
On 2018-09-27 16:40, Ulf Adams wrote:
    > Hi Raffaello,
    >
    > I am the author of a recent publication on double to string
    conversion
    > [1] - the Ryu algorithm. I've been aware of the problems with the Jdk
    > for several years, and am very much looking forward to
    improvements in
    > correctness and performance in this area.
    >
What a coincidence! I'm happy to hear that the quest for better
    floating->string conversions has not stopped. Tomorrow I'll download
    your paper and have a look at it during the weekend.
Have you had a chance to take a look?
(I'm traveling for the next ~10 days and at a conference, so don't
expect too much from me during that time.)

I had a cursory reading but couldn't dig deeper for now. If nothing
unexpected happens, I should be able to study your paper during the weekend.
...
> I have done some testing against my Java implementation of the Ryu
    > algorithm described in the linked paper. Interestingly, I've found
    a few
    > cases where they output different results. In particular:
    > 1.0E-323 is printed as 9.9E-324
    > 1.0E-322 is printed as 9.9E-323
If Ryu also produces 1 digit long outputs, then your results above are
    correct. But then Ryu should also output 5.0E-324 rather than 4.9E-324,
    for example.
    Even better, it should output 5E-324, 1E-323 and 1E-322 because adding
    the .0 part might confuse a human reader to believe that 2 digits are
    really needed. But then 4.9E-324, 9.9E-324 and 9.9E-323 are closer to
    the double.
The C version produces 1 digit long outputs, and I was trying to follow
the Java spec in the Java version, but the code to do so isn't quite
right. Unfortunately, I haven't yet been able to fix it.

2 digits are for backward compatibility with the existing spec which
    requires at least one digit to the right of the decimal point.
>
    > It's likely that there are more such cases - I only ran a sample of
    > double-precision numbers. Arguably, 9.9 is the correctly rounded
    2-digit
    > output and Ryu is incorrect here. That's what you get when you have a
    > special case for Java without a correctness proof. :-(
    >
    > In terms of performance, this algorithm performs almost exactly
    the same
    > as my Java implementation of Ryu, although I'd like to point out
    that my
    > C implementation of Ryu is quite a bit faster (though note that it
    > generates different output, in particular, it only outputs a single
    > digit of precision in the above cases, rather than two), and I didn't
    > backport all the performance improvements from the Java version,
    yet. It
    > looks like this is not coincidence - as far as I can see so far, it's
    > algorithmically very similar, although it manages to avoid the
    loop I'm
    > using in Ryu to find the shortest representation.
    >
    > I have a few comments:
    >
    >       *     <li> It rounds to {@code v} according to the usual
    > round-to-closest
    >       *     rule of IEEE 754 floating-point arithmetic.
    > - Since you're spelling out the rounding rules just below, this is
    > duplicated, and by itself, it's unclear since it doesn't specify the
    > specific sub-type (round half even).
    >
I tried to save as much of the original spec wording as possible.
    Perhaps it isn't worthwhile.
> - Naming: I'd strongly suggest to use variable names that relate to
    > what's stored, e.g., m for mantissa, e for exponent, etc.
    >
I currently prefer to be consistent with a forthcoming paper of mine on
    the subject. But thanks for the suggestion.
May I suggest that the paper also uses names that relate to what they're
referring to? :-) Not that I've managed to do that very well myself...
I tend to use short "mathematical" names that still evoke their
semantics. Will see if I manage to be consistent.
...
> - What's not clear to me is how the algorithm determines how many
    digits
    > to print.
    >
You'll have to wait for the paper.
Looking forward to it. I tried to reverse engineer the code, but it's
far from obvious.

I don't think it is reversible, even for knowledgeable people like you :-(

I have to write the paper...
I have to write the paper...
I have to write the paper...
I have to write the paper...
I have to write the paper...
I have to write the paper...
I have to write the paper...
I have to write the paper...
I have to write the paper...
I have to write the paper...
I have to write the paper...
...
> - Also, it might be nicer to move the long multiplications to a
    helper
    > method - at least from a short look, it looks like the
    computations of
    > vn, vnl, and vnr are identical.
    >
I tried several variants: the current one seems to be the faster with
    the current optimizations of C2. Some day I'll also try with Graal.
Sure, but moving it to a method shouldn't affect performance (except if
you need to return multiple values), and, right now, it looks like
identical code.

Where possible, I try not to rely on C2 to perform inlining of "long"
code like the 8 lines for the multiplication. In the end, it is repeated
only 3 times there and in a limited space. It shouldn't come to a
surprise to a reader.
I'll certainly add a comment to warn the reader that the code is the
same and I'll retry my experiments with extracting a method for the
multiplication although it currently seems more complex than at first
sight. Unfortunately, Java does not yet support 128 bits quantities.

Greetings
Raffaello
...
> - I looked through the spec, and it looks like all cases are
    > well-defined. Yay!
    >
    > I will need some more time to do a more thorough review of the
    code and
    > more testing for differences. Unfortunately, I'm also traveling
    the next
    > two weeks, so this might take a bit of time.
    >
I thank you in advance for your willingness to review the code but my
    understanding is that only the officially appointed reviewers can
    approve OpenJDK contributions, which is of course a good policy.
    Besides, as two Andrews engineers from RedHat correctly observe,
    understanding the rationale of the code without the planned
    accompanying
    paper is hard.
> I'm not a contributor to the Jdk, and this isn't my full-time job.
    I was
    > lurking here because I was going to send a patch for the double to
    > string conversion code myself (based on Ryu).
    >
All my efforts on this projects are done in my unpaid spare time, too.
> Thanks,
    >
    > -- Ulf
    >
     > [1] https://dl.acm.org/citation.cfm?id=3192369
     > [2] https://github.com/google/double-conversion
     > [3] https://en.wikipedia.org/wiki/Rounding
     >
Thank you
    Raffaello

Re: [PATCH] 4511638: Double.toString(double) sometimes produces incorrect results

raffaello.giulietti＠gmail.com