RFR: 8210416: [linux] Poor StrictMath performance due to non-optimized compilation

Thu Sep 13 01:44:08 UTC 2018

Hello,

On 9/12/2018 1:16 AM, Severin Gehwolf wrote:
> On Wed, 2018-09-12 at 17:58 +1000, David Holmes wrote:
>> But I don't understand why the optimization setting is being tied to the
>> availability of the -ffp-contract flag?
> In configure we perform a check for gcc or clang whether that flag is
> supported. If it is, it would be non-empty exactly having -ffp-contract
> as value. It could be another set of flags for other arches if somebody
> wanted to do the same, fwiw. In JDK 8, for example, it's "-mno-fused-
> madd -fno-strict-aliasing" for ppc64:
> http://hg.openjdk.java.net/jdk8u/jdk8u-dev/jdk/file/2660b127b407/make/lib/CoreLibraries.gmk#l63
>
> We need support for that flag (or a set of flags) when we optimize
> fdlibm since otherwise we would lose precision. If the flag is empty
> we'd not optimize as we can't guarantee precision. That's why we tie
> optimization to the availability of that flag. The expectation is for
> this flag to be available on gcc/clang arches only at this point. Does
> that make sense?
>
>

To condense a potentially long discussion, while the IEEE 754 standard 
has long specified particular results for arithmetic operations (+, -, 
*, /, etc.) on particular floating-point values, languages and their 
compilers often do not provide a reliable mapping of language constructs 
to IEEE 754 operations.

The Java language and JVM are distinctive in this sense because a
reliable mapping of each language-level operation to a particular IEEE
754 operation is mandated by the JLS. (I will leave aside a complicated
but largely irrelevant discussion of non-strictfp floating-point.)

The C language standards I've looked at do not provide as reliable a
mapping of floating-point operations as the JLS does. In particular, the
C standards generally allow a fused multiply add to be used to replace a
pair of add and multiply instructions in an expression like (a * b + c).
The -ffp-contract=off gcc compiler setting disables this and related 
transformations. (The Sun Studio compilers provide detailed 
configuration options for the sets of floating-point transformations 
that are allowed.)
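
To illustrate why contracting (a * b + c) into a fused multiply add
matters, here is a minimal sketch in Java, where the two forms can be
written explicitly. The class and method names are hypothetical and the
inputs were picked by hand; this is not code from fdlibm. It assumes
Java 9 or later for Math.fma.

```java
public class FmaContraction {
    // Two roundings: a * b is rounded to double, then the sum is rounded.
    static double unfused(double a, double b, double c) {
        return a * b + c;
    }

    // One rounding: Math.fma evaluates a * b + c exactly, then rounds once.
    static double fused(double a, double b, double c) {
        return Math.fma(a, b, c);
    }

    public static void main(String[] args) {
        // (1 + 2^-27) * (1 - 2^-27) is exactly 1 - 2^-54, which lies halfway
        // between two adjacent doubles and rounds (ties-to-even) to 1.0.
        double a = 1.0 + 0x1.0p-27;
        double b = 1.0 - 0x1.0p-27;
        double c = -1.0;

        // The unfused form loses the 2^-54 term; the fused form keeps it.
        System.out.println(Double.toHexString(unfused(a, b, c)));
        System.out.println(Double.toHexString(fused(a, b, c)));
    }
}
```

With these inputs the unfused form yields zero while the fused form
yields -2^-54, so an algorithm written with one form in mind can return
different bits if a compiler silently substitutes the other.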

The specification for StrictMath requires the fdlibm algorithms, and the
fdlibm algorithms rely on the semantics of the floating-point operations
as written in the source. They also rely on some way of doing a bit-wise
conversion between an integral type and double. The latter is
accomplished by interpreting the 64 bits of a double as comprising a
two-element array of 32-bit ints. These idioms often don't work under
the default C compiler options, leading to the long-standing need to 
have a separate set of compiler options for FDLIBM. A safe, if slow, set 
of options is to fully disable optimization for FDLIBM. That is not 
necessary if sufficient control over the floating-point and aliasing 
semantics is possible via the C compiler options.
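
For comparison, a rough sketch of how the same high-word/low-word view of
a double can be obtained in Java without any aliasing tricks. The class
and method names are hypothetical; only Double.doubleToRawLongBits and
Double.longBitsToDouble are actual library methods.

```java
public class DoubleWords {
    // Upper 32 bits of the double: sign, exponent, and top of the significand.
    static int highWord(double x) {
        return (int) (Double.doubleToRawLongBits(x) >>> 32);
    }

    // Lower 32 bits of the double: the low part of the significand.
    static int lowWord(double x) {
        return (int) Double.doubleToRawLongBits(x);
    }

    // Reassemble a double from its two 32-bit words.
    static double fromWords(int hi, int lo) {
        long bits = ((long) hi << 32) | (lo & 0xFFFFFFFFL);
        return Double.longBitsToDouble(bits);
    }

    public static void main(String[] args) {
        double x = 1.0;
        System.out.println(Integer.toHexString(highWord(x)));
        System.out.println(Integer.toHexString(lowWord(x)));
        System.out.println(fromWords(highWord(x), lowWord(x)) == x);
    }
}
```

Because the conversion goes through a well-defined library call rather
than a pointer cast, it does not depend on the compiler's aliasing
assumptions or on the byte order of the platform.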

In the fullness of time, when (if?) I finish porting the FDLIBM code to
Java, these sorts of concerns will no longer apply due to the more
reliable mapping of source expressions in Java to IEEE 754
floating-point operations.

HTH,

-Joe