Problems with double remainder on Windows/x86_64 and Visual Studio 2005

Wed Aug 27 05:04:58 PDT 2008

Hi Volker,

I don't know if this is related or just coincidental timing, but a new
bug report has just been filed:

6741940 Nonvolatile XMM registers not preserved across JNI calls

"Calls to the JNI entry point "CallVoidMethod" [in test program] do not 
preserve the nonvolatile XMM registers, unless running with -Xint.  This 
is in violation of the Windows 64-bit ABI:
http://msdn.microsoft.com/en-us/library/ms794547.aspx"

This won't show up on BugParade for a day or so.

Regards,
David Holmes

Volker Simonis said the following on 08/26/08 05:19:
> Hi,
> 
> we had a strange problem which led to failures in the JCK test
> Math2012. The problem only occurred if some other JCK tests were
> compiled and executed in a special order before the tests in
> Math2012.
> 
> I could finally track down the problem to the following simple test case:
> 
> ====================================================
> public class Log10 {
> 
>   public static double log10(double d) {
>     return Math.log10(d);
>   }
> 
>   public static double drem2(double d) {
>     return d % 2;
>   }
> 
>   public static void main(String args[]) {
>     System.out.println("log10(0) = " + Math.log10(0.0d));
>     System.out.println("log10(0) = " + log10(0.0d));
>     System.out.println("drem2(4.0) = " + drem2(4.0d));
>   }
> }
> ====================================================
> 
> which always fails on Windows/x86_64 (i.e. prints "NaN" for the result
> of 4.0 % 2.0 which should be 0.0) if executed like this:
> 
> java -Xcomp -Xbatch -XX:CompileCommand="compileonly Log10 log10"
> -XX:+PrintCompilation Log10
> 
> VM option 'CompileCommand=compileonly Log10 log10'
> VM option '+PrintCompilation'
> CompilerOracle: compileonly Log10.log10
> log10(0) = -Infinity
>   1   b   Log10::log10 (5 bytes)
> log10(0) = -Infinity
> drem2(4.0) = NaN
> 
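A minimal sketch of a self-checking variant of the test case above, so that a wrong remainder fails loudly instead of having to be spotted in the output. The class name Log10Check and the method remainderSurvivesLog10() are made up for illustration; the expected values (-Infinity and 0.0) are the ones stated in the report.

```java
class Log10Check {

    // Same helper as in the original test case; the report compiles only this method.
    public static double log10(double d) {
        return Math.log10(d);
    }

    // Same helper as in the original test case: double remainder by 2.
    public static double drem2(double d) {
        return d % 2;
    }

    // Hypothetical check: true if the remainder is still correct
    // after log10(0.0) has been evaluated.
    public static boolean remainderSurvivesLog10() {
        double log = log10(0.0d);
        double rem = drem2(4.0d);
        return log == Double.NEGATIVE_INFINITY && rem == 0.0d;
    }

    public static void main(String[] args) {
        if (!remainderSurvivesLog10()) {
            throw new AssertionError(
                "drem2(4.0) = " + drem2(4.0d) + ", expected 0.0");
        }
        System.out.println("OK");
    }
}
```

To exercise the same path as the report, it would presumably be run with the same flags, with the class name adjusted, e.g. -Xcomp -Xbatch -XX:CompileCommand="compileonly Log10Check log10".
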
> Notice however that we are using a version of the Java 6 HotSpot
> compiled with Visual Studio 2005.
> 
> I couldn't reproduce the problem with
> jdk-7-ea-bin-b32-windows-x64-debug-04_aug_2008; however, I could verify
> that the code generated by both our JDK 6 and the latest jdk-7 is
> virtually the same. The interesting part is the compiled version of
> the method Log10.log10():
> 
> 000     pushq   rbp
>         subq    rsp, #16        # Create frame
>         nop     # nop for patch_verified_entry
> 006     fldlg2                  #Log10
>         fyl2x                   # Q=Log10*Log_2(x)
> 024     addq    rsp, 16 # Destroy frame
>         popq    rbp
>         testl   rax, [rip + #offset_to_poll_page]       # Safepoint: poll for GC
> 02f     ret
> 
> The computation of Math.log10(0.0d), which correctly returns -Infinity,
> sets the "Zero Divide" flag in the FP status word as described in
> http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/26569.pdf.
> 
> After this computation "SharedRuntime::drem()" is called for the
> computation of the double remainder in the method "drem2()" of the
> above example. "SharedRuntime::drem()" itself just delegates the
> computation to the "fmod()" function (defined in <math.h>) of the
> underlying platform.
> 
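For reference, a small sketch of what the double remainder is expected to return, independent of any FP status flags. Java's % on doubles is a truncating remainder, like C's fmod(), and differs from Math.IEEEremainder(). NaN is a legitimate result only in cases such as a zero divisor, which is why NaN for 4.0 % 2.0 is clearly wrong. The class and method names are made up for illustration.

```java
class RemainderSemantics {

    // Java's % on doubles: truncating remainder, result takes the sign
    // of the dividend (the behaviour fmod() is expected to provide).
    public static double truncatedRemainder(double x, double y) {
        return x % y;
    }

    // IEEE 754 remainder: quotient is rounded to nearest, so the result
    // can be negative even for positive operands.
    public static double ieeeRemainder(double x, double y) {
        return Math.IEEEremainder(x, y);
    }

    public static void main(String[] args) {
        System.out.println(truncatedRemainder(4.0, 2.0));   // 0.0
        System.out.println(truncatedRemainder(5.0, 3.0));   // 2.0
        System.out.println(truncatedRemainder(-5.0, 3.0));  // -2.0
        System.out.println(ieeeRemainder(5.0, 3.0));        // -1.0
        System.out.println(truncatedRemainder(4.0, 0.0));   // NaN
    }
}
```
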
> The presence of the "Zero Divide" flag in the FP status word seems to
> be no problem for the "fmod()" which is used by the
> jdk-7-ea-bin-b32-windows-x64-debug-04_aug_2008 executable (from
> msvcr71.dll) and it is no problem on Linux/x86_64 either, but it IS
> definitely a problem for the "fmod()" from the "msvcr80d.dll" which is
> used in our MSVC 2005 build.
> 
> I have two questions now:
> 
> 1. Is it ok that the intrinsic for Math.log10() leaves the exception
> bits as they are in the FP status word?
> 2. Can somebody confirm that the described behaviour of "fmod()" from
> "msvcr80d.dll" as used by MSVC 2005 is buggy? (I couldn't find any bug
> report and I also couldn't find any reference on whether "fmod()"
> should depend on the FP status word or not.)
> 
> It would also be nice if somebody who has a recent OpenJDK built with
> MSVC 2005 could confirm the above problem or if somebody could just
> confirm or disprove the "fmod()" problem with different versions of
> MSVC. Here's a small C program which can be used to test if "fmod()"
> is dependent on the FP status word:
> 
> ====================== fmod.c ====================
> #include <math.h>
> #include <stdio.h>
> 
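> /* Defined in fpu_asm.asm; leaves the "Zero Divide" flag set. */
> extern void fpu_asm(void);
> 
> int main(int argc, char* argv[]) {
> 
>     /* Baseline: FP status word is still clean. */
>     printf("fmod(4.0, 2.0) = %f\n",  fmod(4.0, 2.0));
> 
>     fpu_asm();
> 
>     /* Same call again, now with the exception flag set. */
>     printf("fmod(4.0, 2.0) = %f\n",  fmod(4.0, 2.0));
> 
>     return 0;
> }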
> =====================================================
> 
> ===================== fpu_asm.asm ====================
> PUBLIC fpu_asm
> .CODE
> 	ALIGN	8
> fpu_asm PROC
> 	fldlg2
> 	fldz
> 	fyl2x
> 	ret
> 	ALIGN 8
> fpu_asm ENDP
> END
> ======================================================
> 
> Compile and run with:
> 
> ml64 /c fpu_asm.asm
> cl fmod.c fpu_asm.obj
> fmod.exe
> fmod(4.0, 2.0) = 0.000000
> fmod(4.0, 2.0) = -1.#IND00
> 
> Regards,
> Volker
> 
> PS: the obvious solution of calling "_clearfp()" as defined in
> <float.h> just before a call to "fmod()" unfortunately doesn't work,
> because "_clearfp()" (at least in MSVC 2005) only clears the SSE
> status register MXCSR. The only solution I see right now is using the
> FCLEX assembler instruction, and because MSVC 2005 has no inline
> assembler for x86_64 I'll probably have to write a separate assembler
> function just for that one instruction. Or does somebody have a
> smarter solution?
> 
> PPS: this is a nice example of how a compiler switch can get you a lot
> of fun (isn't it, Kelly :) ...