[aarch64-port-dev ] The result of Math.log(3.0) is different on x86_64 and aarch64?

Mon Jul 29 10:21:12 UTC 2019

Hi Pengfei,

On 29/07/2019 10:52, Pengfei Li (Arm Technology China) wrote:
> 
> I've reproduced this on both JDK12 and the latest JDK master (14) with the Java code below.
> 
> public class Test {
>   public static void main(String[] args) {
>     double d = Math.log(3.0);
>     String hex = Long.toHexString(Double.doubleToRawLongBits(d));
>     System.out.println(d + "(0x" + hex + ")");
>   }
> }
> 
> (x86_64)~$ java Test
> 1.0986122886681098(0x3ff193ea7aad030b)
> (aarch64)~$ java Test
> 1.0986122886681096(0x3ff193ea7aad030a)
> 
> From the above results we see that the least significant bit on aarch64 differs from that on x86_64. So which one is more accurate?
> 
> By exploring the HotSpot code, we can see a method "__ieee754_log(double x)"[1] in shared (architecture-independent) code which computes log(double). But this SharedRuntime method is called only if the architecture-specific "StubRoutines::dlog()" is NULL. See [2] for this logic in the C2 compiler; there is no big difference in C1 or the interpreter. I.e. Math.log(3.0) actually calls into hand-crafted assembly code if that code has been generated. Looking into the cpu-specific code, we found that the log() routine is generated on x86_64[3] but disabled on aarch64[4] due to issues found earlier.
> 
> In other words, the Math.log() call is optimized by the HotSpot intrinsic located at [5] on x86_64 but just uses __ieee754_log() on aarch64. To confirm this, I ran the above Java code with the VM options "-XX:+UnlockDiagnosticVMOptions -XX:-InlineMathNatives" on both x86_64 and aarch64.
> 
> (x86_64)~$ java -XX:+UnlockDiagnosticVMOptions -XX:-InlineMathNatives Test
> 1.0986122886681096(0x3ff193ea7aad030a)
> (x86_64)~$ java Test
> 1.0986122886681098(0x3ff193ea7aad030b)
> DIFFERENT!
> 
> (aarch64)~$ java -XX:+UnlockDiagnosticVMOptions -XX:-InlineMathNatives Test
> 1.0986122886681096(0x3ff193ea7aad030a)
> (aarch64)~$ java Test
> 1.0986122886681096(0x3ff193ea7aad030a)
> SAME!
> 
> From the results we see that, if we assume the shared method
> __ieee754_log(double) is correct, there should be nothing wrong in
> aarch64 HotSpot. Instead, the x86_64 log intrinsic may have applied some
> optimization that reduces the accuracy. We need an Intel engineer
> (maybe ~vdeshpande) to look at the code.

Thanks very much for looking into this and coming up with the above results.

I am not sure this is necessarily a defect. Note that the results from
the Intel intrinsic and __ieee754_log differ only in the last bit of the
mantissa (significand), i.e. by what is called 1ulp (unit in the last
place). Where the exact value lies between two adjacent representable
doubles, rounding up rather than down may well not be considered an
error, i.e. the required accuracy may be set at 1ulp rather than 0.5ulp.
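
As a rough way to see where the exact value sits relative to the two
results, here is a minimal Java sketch. It takes the two bit patterns
from the outputs quoted above, computes ln(3) to 60 digits with
BigDecimal, and reports each result's error in ulps. The class and
method names (UlpCheck, ln3, errorInUlps) and the choice of the series
ln(3) = 2*atanh(1/2) are my own assumptions for illustration; none of
this comes from HotSpot.

```java
import java.math.BigDecimal;
import java.math.MathContext;

final class UlpCheck {
    // Bit patterns copied from the outputs quoted above.
    static final double FDLIBM_RESULT = Double.longBitsToDouble(0x3ff193ea7aad030aL);
    static final double INTRINSIC_RESULT = Double.longBitsToDouble(0x3ff193ea7aad030bL);

    // Reference value: ln(3) = 2 * atanh(1/2) = 2 * sum_{k>=0} (1/2)^(2k+1) / (2k+1).
    static BigDecimal ln3(MathContext mc) {
        BigDecimal quarter = new BigDecimal("0.25");
        BigDecimal power = new BigDecimal("0.5"); // holds (1/2)^(2k+1)
        BigDecimal sum = BigDecimal.ZERO;
        for (int k = 0; k < 120; k++) { // terms shrink 4x per step; 120 is ample for 60 digits
            sum = sum.add(power.divide(BigDecimal.valueOf(2L * k + 1), mc), mc);
            power = power.multiply(quarter, mc);
        }
        return sum.multiply(BigDecimal.valueOf(2), mc);
    }

    // Signed error of x relative to ref, measured in ulps of x.
    static BigDecimal errorInUlps(double x, BigDecimal ref, MathContext mc) {
        // new BigDecimal(double) converts the binary value exactly.
        return new BigDecimal(x).subtract(ref, mc).divide(new BigDecimal(Math.ulp(x)), mc);
    }

    public static void main(String[] args) {
        MathContext mc = new MathContext(60);
        BigDecimal ref = ln3(mc);
        System.out.println("adjacent doubles: " + (Math.nextUp(FDLIBM_RESULT) == INTRINSIC_RESULT));
        System.out.println("reference ln(3):  " + ref);
        System.out.println("error of ...030a: " + errorInUlps(FDLIBM_RESULT, ref, mc) + " ulp");
        System.out.println("error of ...030b: " + errorInUlps(INTRINSIC_RESULT, ref, mc) + " ulp");
    }
}
```

By my own arithmetic the exact value lies roughly 0.59 ulp above
...030a and 0.41 ulp below ...030b, so both results are within 1ulp and
the x86_64 one happens to be the nearer of the two; running the sketch
should confirm or refute that.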

Indeed, for some functions a greater deviation is accepted by some
well-known implementations. For example, see the following documentation
for the GNU C library (glibc):

https://www.gnu.org/software/libc/manual/html_node/Errors-in-Math-Functions.html

When I reviewed (and rejected) the exponent intrinsic implementation
proposed for AArch64 (which was itself based on an algorithm for
computing logs) the standard adopted for 'correctness' of the intrinsic
was a difference of no more than 1ulp from the compiled C code it was
replacing. So, I don't think it is legitimate to reject the Intel
version here on these grounds. It would still be very helpful to hear a
view from an Intel committer.
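
For anyone who wants to apply the same 1ulp criterion more widely, the
following Java sketch scans a range of inputs and reports the largest
gap, in ulps, between Math.log and StrictMath.log. Using StrictMath.log
as a stand-in for the shared __ieee754_log code is an assumption on my
part (StrictMath is specified to follow the fdlibm algorithms), and the
class name, method names and input range are made up for illustration.

```java
final class LogUlpScan {
    // Number of representable doubles between a and b.
    // Only valid for finite values of the same sign, which holds here
    // because every input is > 1 and so every log result is positive.
    static long ulpDistance(double a, double b) {
        return Math.abs(Double.doubleToLongBits(a) - Double.doubleToLongBits(b));
    }

    // Largest ulp gap between Math.log and StrictMath.log over evenly spaced inputs.
    static long maxUlpGap(double from, double to, int steps) {
        long worst = 0;
        double step = (to - from) / steps;
        for (int i = 0; i <= steps; i++) {
            double x = from + i * step;
            worst = Math.max(worst, ulpDistance(Math.log(x), StrictMath.log(x)));
        }
        return worst;
    }

    public static void main(String[] args) {
        System.out.println("max gap in ulps: " + maxUlpGap(1.5, 100.0, 1_000_000));
    }
}
```

A result of 0 would suggest no separate intrinsic is in play on that
platform (as on aarch64 today); a result of 1 is what the criterion
above still accepts.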

regards,

Andrew Dinn
-----------
Senior Principal Software Engineer
Red Hat UK Ltd