RFR: 8295159: DSO created with -ffast-math breaks Java floating-point arithmetic [v7]
Vladimir Ivanov
vlivanov at openjdk.org
Mon Oct 17 19:03:53 UTC 2022
On Wed, 12 Oct 2022 17:00:15 GMT, Andrew Haley <aph at openjdk.org> wrote:
>> A bug in GCC causes shared libraries linked with -ffast-math to disable denormal arithmetic. This breaks Java's floating-point semantics.
>>
>> The bug is https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55522
>>
>> One solution is to save and restore the floating-point control word around System.loadLibrary(). This isn't perfect, because some shared library might load another shared library at runtime, but it's a lot better than what we do now.
>>
>> However, this fix is not complete. `dlopen()` is called from many places in the JDK. I guess the best thing to do is find and wrap them all. I'd like to hear people's opinions.
>
> Andrew Haley has updated the pull request incrementally with one additional commit since the last revision:
>
> 8295159: DSO created with -ffast-math breaks Java floating-point arithmetic
FTR I did an exercise in source code archeology and here are my findings.
The origin of `AlwaysRestoreFPU`-related code (both in x86-32 and arm-specific code) can be traced back to [JDK-6487931](https://bugs.openjdk.org/browse/JDK-6487931) and [JDK-6550813](https://bugs.openjdk.org/browse/JDK-6550813). Though both issues manifested as JVM crashes, the underlying problem was identified as FPU control word corruption by native code.
The regression test does trigger the corruption from a JNI call (using either `_FPU_SETCW` [1] or `_controlfp` [2]), but it was deliberately limited to x86-32.
Based on that, I conclude that the problem with FP environment corruption by native code was known before, but an opt-in solution was chosen.
(Frankly speaking, I don't know why an opt-in solution was considered sufficient. Maybe because it was erroneously believed that the corruption could only lead to a crash at runtime?)
Considering that we are now aware of the insidious nature of the problem (silent result corruption), I'm inclined to propose either turning on `AlwaysRestoreFPU` by default (and providing an implementation on platforms where it is missing) or, at least, catching FPU control word corruption on native->java transitions and crashing the JVM, advertising `-XX:+AlwaysRestoreFPU` as a solution.
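To illustrate what "catching the corruption" means from the Java side, here is a minimal sketch of a probe that checks whether subnormal (denormal) results are still produced after a call that may have run native code. It is not the in-VM check proposed above, which would inspect the control word directly during the native->java transition. The class and method names (`SubnormalProbe`, `subnormalsIntact`, `runChecked`) are hypothetical, and the probe only detects flush-to-zero behaviour, not other control word changes such as a modified rounding mode.

```java
public final class SubnormalProbe {
    private SubnormalProbe() {}

    // Read through a volatile field so the value is not a compile-time
    // constant and the arithmetic below is performed at run time.
    private static volatile double smallestNormal = Double.MIN_NORMAL;

    /**
     * Returns true if dividing the smallest normal double by two yields a
     * nonzero subnormal that doubles back to the original value. With
     * flush-to-zero or denormals-are-zero enabled, this check fails.
     */
    public static boolean subnormalsIntact() {
        double normal = smallestNormal;
        double half = normal / 2;   // 2^-1023, representable only as a subnormal
        return half > 0.0 && half + half == normal;
    }

    /**
     * Runs a task that may call into native code, then fails loudly
     * instead of letting later results be silently wrong.
     */
    public static void runChecked(Runnable mayCallNative) {
        mayCallNative.run();
        if (!subnormalsIntact()) {
            throw new IllegalStateException(
                "subnormal arithmetic appears to be disabled after native call");
        }
    }

    public static void main(String[] args) {
        // Stand-in for a System.loadLibrary() or JNI call (assumption:
        // nothing here touches the FP control word, so the check passes).
        runChecked(() -> { });
        System.out.println("subnormals intact: " + subnormalsIntact());
    }
}
```

A probe like this can only report the problem after the fact on the calling thread; it cannot restore the control word, which is why a fix inside the VM is still needed.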
[1] https://man7.org/linux/man-pages/man3/__setfpucw.3.html
[2] https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/control87-controlfp-control87-2?view=msvc-170
-------------
PR: https://git.openjdk.org/jdk/pull/10661