RFR: 4511638: Double.toString(double) sometimes produces incorrect results [v2]

Wed Oct 6 00:05:09 UTC 2021

On Fri, 16 Apr 2021 11:30:32 GMT, Raffaello Giulietti <github.com+70726043+rgiulietti at openjdk.org> wrote:

>> Hello,
>> 
>> here's a PR for a patch submitted on March 2020 [1](https://cr.openjdk.java.net/~bpb/4511638/webrev.04/) when Mercurial was a thing.
>> 
>> The patch has been edited to adhere to OpenJDK code conventions about multi-line (block) comments. Nothing in the code proper has changed, except for the addition of redundant but clarifying parentheses in some expressions.
>> 
>> 
>> Greetings
>> Raffaello
>
> Raffaello Giulietti has updated the pull request incrementally with one additional commit since the last revision:
> 
>   4511638: Double.toString(double) sometimes produces incorrect results

Hi,

here's Guy's original email which I got intact in my mailbox but which appears completely garbled on this PR page (if you are reading it) or does not appear at all in this mailing list (if you are reading that).

I repeat my comment as well

HTH

----

Hi, sorry for the silence. Back in April 2021 I did read the entire Schubfach paper as well as two papers about Ryu by Adams. Schubfach looks intriguing, but it depends (as so many of these algorithms do) on a subtle proof. I am glad to see that the proof, or parts of it, are machine-checked in some way. I would be a lot more confident if this proof were also subjected to peer review. These algorithms are very much like the buggy Pentium FDIV: nearly all the cases are easy, but the cases that are hard (and therefore may potentially fail if you get that part of the algorithm wrong) are very rare, so doing good testing is tricky and deserves attention. If we can identify those hard cases (typically related to edge cases in the table entries) perhaps we can develop a good testing methodology. I would suspect that these hard cases relate to situations where a power of (1/5) is very close to a power of (1/2) (and therefore the corresponding power of 5 is very close to a power of 2), but t
 he details matter.

Now that I have finished addressing bug https://bugs.openjdk.java.net/browse/JDK-8273792 (JumpableGenerator.rngs() documentation refers to wrong method) and bug https://bugs.openjdk.java.net/browse/JDK-8273056 (java.util.random does not correctly sample exponential or Gaussian distributions), I am now re-reading the Schubfach paper and am also investigating mentions of an implementation of that algorithm called DragonBox. I will have more to say by Friday (I have a Thursday deadline for something else).

—Guy

----

Hello,

thanks for reading my writing.

In some way, Ryu and Schubfach are similar in the sense that both are based on the observation that good, fixed size approximations of powers of 10 lead to extremely efficient algorithms. Both algorithms end up with a lookup table of 617 approximations.

The proof of the central theorem of Schubfach is based on continued fractions. It was prepared on ACL2 by the late Dmitry Nadezhin, who was a member of Oracle's formal verification group. Dmitry also knew my writing inside-out: not a formal peer review but perhaps even more insightful.

Schubfach has been exhaustively tested on all 2^32 floats, with approximations of powers of 10 of 63 bits each, rather than the full 126 bits used for doubles (the minimum size for doubles is 123 bits).
It has also been tested for months on doubles, accumulating several trillions of witnesses and no failure.
The exhaustive test on floats is an optionally executed part of the contributed code.
After more than 25 years the current JDK implementation still contains anomalies, so I guess it isn't tested as well as Schubfach.

A complete Schubfach conversion only depends on some dozens of primitive operations (it doesn't even need any sort of hardware division at all) and on a lookup table (fully tested for correctness in the contributed code).
The current implementation in the JDK relies on the correctness of BigInteger and related classes, which are way more complex to thoroughly test and to be confident upon.

Checking whether the 617 tabulated powers of 10 are very close to powers of 2 would be very easy, so I'm not sure I understand your point.
Identifying the hard cases would require even more theory and would be an endeavor in itself. This theory could, in turn, contain small errors and would probably require identifying its hard cases...

According to its author, DragonBox combines Schubfach and Grisu. AFAIK, there's no new fundamental underlying theory.
During development of Schubfach, I experimented with a similar mix: on the JVM of the time (2018) performance was worse than pure Schubfach and the code more complex, so I gave up and switched back to the simpler design, which C2 likes better.

Greetings
Raffaello

-------------

PR: https://git.openjdk.java.net/jdk/pull/3402