On 2018-10-04 15:28, Ulf Adams wrote:
On Thu, Sep 27, 2018 at 5:37 PM Raffaello Giulietti <raffaello.giulietti@gmail.com <mailto:raffaello.giulietti@gmail.com>> wrote:
Hi Ulf,
On 2018-09-27 16:40, Ulf Adams wrote: > Hi Raffaello, > > I am the author of a recent publication on double to string conversion > [1] - the Ryu algorithm. I've been aware of the problems with the Jdk > for several years, and am very much looking forward to improvements in > correctness and performance in this area. >
What a coincidence! I'm happy to hear that the quest for better floating->string conversions has not stopped. Tomorrow I'll download your paper and have a look at it during the weekend.
Have you had a chance to take a look?
(I'm traveling for the next ~10 days and at a conference, so don't expect too much from me during that time.)
I had a cursory reading but couldn't dig deeper for now. If nothing unexpected happens, I should be able to study your paper during the weekend.
> I have done some testing against my Java implementation of the Ryu > algorithm described in the linked paper. Interestingly, I've found a few > cases where they output different results. In particular: > 1.0E-323 is printed as 9.9E-324 > 1.0E-322 is printed as 9.9E-323
If Ryu also produces 1 digit long outputs, then your results above are correct. But then Ryu should also output 5.0E-324 rather than 4.9E-324, for example. Even better, it should output 5E-324, 1E-323 and 1E-322 because adding the .0 part might confuse a human reader to believe that 2 digits are really needed. But then 4.9E-324, 9.9E-324 and 9.9E-323 are closer to the double.
The C version produces 1 digit long outputs, and I was trying to follow the Java spec in the Java version, but the code to do so isn't quite right. Unfortunately, I haven't yet been able to fix it.
2 digits are for backward compatibility with the existing spec which requires at least one digit to the right of the decimal point.
> > It's likely that there are more such cases - I only ran a sample of > double-precision numbers. Arguably, 9.9 is the correctly rounded 2-digit > output and Ryu is incorrect here. That's what you get when you have a > special case for Java without a correctness proof. :-( > > In terms of performance, this algorithm performs almost exactly the same > as my Java implementation of Ryu, although I'd like to point out that my > C implementation of Ryu is quite a bit faster (though note that it > generates different output, in particular, it only outputs a single > digit of precision in the above cases, rather than two), and I didn't > backport all the performance improvements from the Java version, yet. It > looks like this is not coincidence - as far as I can see so far, it's > algorithmically very similar, although it manages to avoid the loop I'm > using in Ryu to find the shortest representation. > > I have a few comments: > > * <li> It rounds to {@code v} according to the usual > round-to-closest > * rule of IEEE 754 floating-point arithmetic. > - Since you're spelling out the rounding rules just below, this is > duplicated, and by itself, it's unclear since it doesn't specify the > specific sub-type (round half even). >
I tried to save as much of the original spec wording as possible. Perhaps it isn't worthwhile.
> - Naming: I'd strongly suggest to use variable names that relate to > what's stored, e.g., m for mantissa, e for exponent, etc. >
I currently prefer to be consistent with a forthcoming paper of mine on the subject. But thanks for the suggestion.
May I suggest that the paper also uses names that relate to what they're referring to? :-) Not that I've managed to do that very well myself...
I tend to use short "mathematical" names that still evoke their semantics. Will see if I manage to be consistent.
> - What's not clear to me is how the algorithm determines how many digits > to print. >
You'll have to wait for the paper.
Looking forward to it. I tried to reverse engineer the code, but it's far from obvious.
I don't think it is reversible, even for knowledgeable people like you :-( I have to write the paper... I have to write the paper... I have to write the paper... I have to write the paper... I have to write the paper... I have to write the paper... I have to write the paper... I have to write the paper... I have to write the paper... I have to write the paper... I have to write the paper...
> - Also, it might be nicer to move the long multiplications to a helper > method - at least from a short look, it looks like the computations of > vn, vnl, and vnr are identical. >
I tried several variants: the current one seems to be the faster with the current optimizations of C2. Some day I'll also try with Graal.
Sure, but moving it to a method shouldn't affect performance (except if you need to return multiple values), and, right now, it looks like identical code.
Where possible, I try not to rely on C2 to perform inlining of "long" code like the 8 lines for the multiplication. In the end, it is repeated only 3 times there and in a limited space. It shouldn't come to a surprise to a reader. I'll certainly add a comment to warn the reader that the code is the same and I'll retry my experiments with extracting a method for the multiplication although it currently seems more complex than at first sight. Unfortunately, Java does not yet support 128 bits quantities. Greetings Raffaello
> - I looked through the spec, and it looks like all cases are > well-defined. Yay! > > I will need some more time to do a more thorough review of the code and > more testing for differences. Unfortunately, I'm also traveling the next > two weeks, so this might take a bit of time. >
I thank you in advance for your willingness to review the code but my understanding is that only the officially appointed reviewers can approve OpenJDK contributions, which is of course a good policy. Besides, as two Andrews engineers from RedHat correctly observe, understanding the rationale of the code without the planned accompanying paper is hard.
> I'm not a contributor to the Jdk, and this isn't my full-time job. I was > lurking here because I was going to send a patch for the double to > string conversion code myself (based on Ryu). >
All my efforts on this projects are done in my unpaid spare time, too.
> Thanks, > > -- Ulf > > [1] https://dl.acm.org/citation.cfm?id=3192369 > [2] https://github.com/google/double-conversion > [3] https://en.wikipedia.org/wiki/Rounding >
Thank you Raffaello