Hi Raffaello, I am the author of a recent publication on double to string conversion [1] - the Ryu algorithm. I've been aware of the problems with the Jdk for several years, and am very much looking forward to improvements in correctness and performance in this area. I have done some testing against my Java implementation of the Ryu algorithm described in the linked paper. Interestingly, I've found a few cases where they output different results. In particular: 1.0E-323 is printed as 9.9E-324 1.0E-322 is printed as 9.9E-323 It's likely that there are more such cases - I only ran a sample of double-precision numbers. Arguably, 9.9 is the correctly rounded 2-digit output and Ryu is incorrect here. That's what you get when you have a special case for Java without a correctness proof. :-( In terms of performance, this algorithm performs almost exactly the same as my Java implementation of Ryu, although I'd like to point out that my C implementation of Ryu is quite a bit faster (though note that it generates different output, in particular, it only outputs a single digit of precision in the above cases, rather than two), and I didn't backport all the performance improvements from the Java version, yet. It looks like this is not coincidence - as far as I can see so far, it's algorithmically very similar, although it manages to avoid the loop I'm using in Ryu to find the shortest representation. I have a few comments: * <li> It rounds to {@code v} according to the usual round-to-closest * rule of IEEE 754 floating-point arithmetic. - Since you're spelling out the rounding rules just below, this is duplicated, and by itself, it's unclear since it doesn't specify the specific sub-type (round half even). - Naming: I'd strongly suggest to use variable names that relate to what's stored, e.g., m for mantissa, e for exponent, etc. - What's not clear to me is how the algorithm determines how many digits to print. - Also, it might be nicer to move the long multiplications to a helper method - at least from a short look, it looks like the computations of vn, vnl, and vnr are identical. - I looked through the spec, and it looks like all cases are well-defined. Yay! I will need some more time to do a more thorough review of the code and more testing for differences. Unfortunately, I'm also traveling the next two weeks, so this might take a bit of time. I'm not a contributor to the Jdk, and this isn't my full-time job. I was lurking here because I was going to send a patch for the double to string conversion code myself (based on Ryu). Thanks, -- Ulf [1] https://dl.acm.org/citation.cfm?id=3192369 [2] https://github.com/google/double-conversion [3] https://en.wikipedia.org/wiki/Rounding On Thu, Sep 27, 2018 at 11:03 AM Andrew Haley <aph@redhat.com> wrote:
On 09/26/2018 06:39 PM, raffaello.giulietti@gmail.com wrote:
The submitted code contains both the changes to the current implementation and extensive jtreg tests.
While I've struggled to keep the code within the 80 chars/line limit, mercurial still generates longer lines. Thus, to avoid possible problems with the email systems, the code is submitted both inline and as an attachment. Hope at least one copy makes its way without errors.
Overall, the commenting is much too light. There are many places where I think I know what you're doing but you don't explain it.
Here, for example:
+ + // pow5 = pow51*2^63 + pow50 + long pow51 = ceilPow5dHigh(-k); + long pow50 = ceilPow5dLow(-k); + + // p = p2*2^126 + p1*2^63 + p0 and p = pow5 * cb + long x0 = pow50 * cb; + long x1 = multiplyHigh(pow50, cb); + long y0 = pow51 * cb; + long y1 = multiplyHigh(pow51, cb); + long z = (x1 << 1 | x0 >>> 63) + (y0 & MASK_63); + long p0 = x0 & MASK_63; + long p1 = z & MASK_63; + long p2 = (y1 << 1 | y0 >>> 63) + (z >>> 63); + long vn = p2 << 1 + ord2alpha | p1 >>> 62 - ord2alpha; + if ((p1 & mask) != 0 || p0 >= threshold) { + vn |= 1; + }
... etc. I think I can figure out what you're doing, but you could explain it.
If you write the comments now while the code is still fresh in your mind it'll be easy.
+ private static final long[] ceilPow5d = { + /* -292 */ 0x7FBB_D8FE_5F5E_6E27L, 0x497A_3A27_04EE_C3DFL, + /* -291 */ 0x4FD5_679E_FB9B_04D8L, 0x5DEC_6458_6315_3A6CL, + /* -290 */ 0x63CA_C186_BA81_C60EL, 0x7567_7D6E_7BDA_8906L, + /* -289 */ 0x7CBD_71E8_6922_3792L, 0x52C1_5CCA_1AD1_2B48L,
What exactly is this table anyway? How was it generated? Please say.
There are many more places in the code. What you've done is nice, but it could be exemplary.
-- Andrew Haley Java Platform Lead Engineer Red Hat UK Ltd. <https://www.redhat.com> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671