RFR: 8295159: DSO created with -ffast-math breaks Java floating-point arithmetic [v7]

Florian Weimer fweimer at openjdk.org
Tue Oct 18 07:49:01 UTC 2022


On Wed, 12 Oct 2022 17:00:15 GMT, Andrew Haley <aph at openjdk.org> wrote:

>> A bug in GCC causes shared libraries linked with -ffast-math to disable denormal arithmetic. This breaks Java's floating-point semantics.
>> 
>> The bug is https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55522
>> 
>> One solution is to save and restore the floating-point control word around System.loadLibrary(). This isn't perfect, because some shared library might load another shared library at runtime, but it's a lot better than what we do now. 
>> 
>> However, this fix is not complete. `dlopen()` is called from many places in the JDK. I guess the best thing to do is find and wrap them all. I'd like to hear people's opinions.
>
> Andrew Haley has updated the pull request incrementally with one additional commit since the last revision:
> 
>   8295159: DSO created with -ffast-math breaks Java floating-point arithmetic

I wonder if something that focuses on diagnostic tools might be better here, particularly if there hasn't been any reported breakage. The `dlopen` protection is of course very incomplete because any JNI call can change the state in unexpected ways.

On the other hand, it seems unlikely that this change breaks some undefined but intended use of the FPU state because if it is changed in `dlopen`, it's not going to be propagated across threads.

src/hotspot/os/linux/os_linux.cpp line 1759:

> 1757:   assert(rtn == 0, "fegetnv must succeed");
> 1758:   void * result = ::dlopen(filename, RTLD_LAZY);
> 1759:   rtn = fesetenv(&curr_fenv);

`fesetenv` in glibc does not unconditionally restore the old FPU control bits, only the non-reserved bits. (The mask is 0xf3f, which seems to cover everything in use today.)  It seems unlikely that more bits are going to be defined in the future. But using `fesetenv` to essentially guard against undefined behavior after the fact is a bit awkward.

MXSCR is passed through unconditionally.

test/hotspot/jtreg/compiler/floatingpoint/libfast-math.c line 24:

> 22:  */
> 23: 
> 24: // This file is intentionally left blank.

This test will silently break (no longer test what's intended) once we change GCC not to link with `crtfastmath.o` for `-shared`. Maybe you should link with `crtfastmath.o` explicitly if it exists, or change the floating point control word directly in an ELF constructor.

-------------

PR: https://git.openjdk.org/jdk/pull/10661



More information about the build-dev mailing list