RFR: 8295159: DSO created with -ffast-math breaks Java floating-point arithmetic [v5]
Aleksey Shipilev
shade at openjdk.org
Wed Oct 12 16:13:12 UTC 2022
On Wed, 12 Oct 2022 14:37:37 GMT, Andrew Haley <aph at openjdk.org> wrote:
>> A bug in GCC causes shared libraries linked with -ffast-math to disable denormal arithmetic. This breaks Java's floating-point semantics.
>>
>> The bug is https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55522
>>
>> One solution is to save and restore the floating-point control word around System.loadLibrary(). This isn't perfect, because some shared library might load another shared library at runtime, but it's a lot better than what we do now.
>>
>> However, this fix is not complete. `dlopen()` is called from many places in the JDK. I guess the best thing to do is find and wrap them all. I'd like to hear people's opinions.
>
> Andrew Haley has updated the pull request incrementally with one additional commit since the last revision:
>
> 8295159: DSO created with -ffast-math breaks Java floating-point arithmetic
I think the Java arithmetic correctness is the thing we are after. Allowing a stray library to break it seems very wrong. The alternative to this fix would be detecting the "bad" FPU flags like FTZ/DAZ, provided we can do that without messing with a whole lot of arch-specific code, and then asking users to opt-in for potentially breaking behavior.
But I believe current patch is the minimal thing we should do. Some nits:
src/hotspot/os/linux/os_linux.cpp line 113:
> 111: # include <sys/times.h>
> 112: # include <sys/types.h>
> 113: # include <sys/utsname.h>
Do we need to shuffle^W sort includes in this patch? I presume you'd want this patch to be cleanly backportable, which means it should probably be as point-y as it can get.
src/hotspot/os/linux/os_linux.cpp line 1747:
> 1745: void * os::Linux::dlopen_helper(const char *filename, char *ebuf,
> 1746: int ebuflen) {
> 1747: // JDK-8295159: Protect floating-point environment.
We need to be more verbose in these comments. Say something like:
There are known cases where global library initialization sets the FPU flags
that affect computation accuracy, for example, enabling Flush-To-Zero and
Denormals-Are-Zero. Do not let those libraries break the Java arithmetic.
Unfortunately, this might break the libraries that might depend on these FPU
features for performance and/or numerical "accuracy", but we need to protect
the Java semantics first and foremost. See JDK-8295159.
src/hotspot/os/linux/os_linux.cpp line 1750:
> 1748: fenv_t curr_fenv;
> 1749: int rtn = fegetenv(&curr_fenv);
> 1750: assert(rtn == 0, "must be.");
Suggestion:
assert(rtn == 0, "fegetnv must succeed");
src/hotspot/os/linux/os_linux.cpp line 1753:
> 1751: void * result = ::dlopen(filename, RTLD_LAZY);
> 1752: rtn = fesetenv(&curr_fenv);
> 1753: assert(rtn == 0, "must be.");
Suggestion:
assert(rtn == 0, "fesetenv must succeed");
test/hotspot/jtreg/compiler/floatingpoint/TestDenormalFloat.java line 28:
> 26: * @bug 8295159
> 27: * @summary DSO created with -ffast-math breaks Java floating-point arithmetic
> 28: * @run main/othervm compiler.floatingpoint.TestDenormalFloat
Should it have `/native` somewhere here?
test/hotspot/jtreg/compiler/floatingpoint/TestDenormalFloat.java line 39:
> 37: // at compiler.floatingpoint.TestDenormalFloat.testFloats(TestDenormalFloat.java:47)
> 38: // at compiler.floatingpoint.TestDenormalFloat.main(TestDenormalFloat.java:57)
> 39:
This comment is redundant, I think. There seem to be little point to run these tests separately?
test/hotspot/jtreg/compiler/floatingpoint/TestDenormalFloat.java line 60:
> 58: public static void main(String[] args) {
> 59: testFloats();
> 60: System.out.println("Loading libfast-math.so");
I propose we have the same logging statement in `TestDoubles`.
-------------
PR: https://git.openjdk.org/jdk/pull/10661
More information about the build-dev
mailing list