PPC64: Poor StrictMath performance due to non-optimized compilation
Erik Joelsson
erik.joelsson at oracle.com
Thu Nov 17 09:17:33 UTC 2016
Hello,
Overall this looks reasonable to me. However, if we want to introduce a
new possible tuple for specifying compilation flags to
SetupNativeCompilation, we (the build team) would prefer if we used
OPENJDK_TARGET_CPU instead of OPENJDK_TARGET_CPU_ARCH.
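As a minimal sketch of what keying the flags on OPENJDK_TARGET_CPU could look like, here is a stand-alone makefile (not part of the JDK build). It assumes OPENJDK_TARGET_CPU is ppc64 or ppc64le on the affected platforms, which is not stated in the patch. The LIBFDLIBM_CFLAGS_* variable names are hypothetical and only mimic the tuple lookup proposed for NativeCompilation.gmk:

```make
# Stand-alone illustration only; not part of the JDK build.
# Assumed values: the real ones are set by configure.
OPENJDK_TARGET_OS := linux
OPENJDK_TARGET_CPU := ppc64le

# Hypothetical per-OS/CPU flag variables, keyed on OPENJDK_TARGET_CPU
# rather than OPENJDK_TARGET_CPU_ARCH.
LIBFDLIBM_CFLAGS_linux_ppc64 := -fno-expensive-optimizations
LIBFDLIBM_CFLAGS_linux_ppc64le := -fno-expensive-optimizations

# Computed variable name: expands to LIBFDLIBM_CFLAGS_linux_ppc64le here.
EXTRA_CFLAGS := $(LIBFDLIBM_CFLAGS_$(OPENJDK_TARGET_OS)_$(OPENJDK_TARGET_CPU))

# Print the resolved flags while the makefile is being read.
$(info EXTRA_CFLAGS = $(EXTRA_CFLAGS))

.PHONY: all
all: ;
```

Running this with GNU make prints the flags resolved for the chosen CPU. Under the assumption above, OPENJDK_TARGET_CPU distinguishes ppc64 from ppc64le, so each would need its own entry if both should get the flag.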
/Erik
On 2016-11-17 03:31, David Holmes wrote:
> Adding in build-dev as they need to scrutinize all build changes.
>
> David
>
> On 17/11/2016 11:45 AM, Gustavo Romero wrote:
>> Hi,
>>
>> Currently, optimization for building fdlibm is disabled, except for the
>> "solaris" OS target [1].
>>
>> As a consequence, on PPC64 (Linux) StrictMath methods such as, but not
>> limited to, sin(), cos(), and tan() perform very poorly in comparison to
>> the same methods in the Math class [2]:
>>
>>                Math       StrictMath
>>                =========  ==========
>> sin            0m29.984s  1m41.184s
>> cos            0m30.031s  1m41.200s
>> tan            0m31.772s  1m46.976s
>> asin           0m4.577s   0m4.543s
>> acos           0m4.539s   0m4.525s
>> atan           0m12.929s  0m12.896s
>> exp            0m1.071s   0m4.570s
>> log            0m3.272s   0m14.239s
>> log10          0m4.362s   0m20.236s
>> sqrt           0m0.913s   0m0.981s
>> cbrt           0m10.786s  0m10.808s
>> sinh           0m4.438s   0m4.433s
>> cosh           0m4.496s   0m4.478s
>> tanh           0m3.360s   0m3.353s
>> expm1          0m4.076s   0m4.094s
>> log1p          0m13.518s  0m13.527s
>> IEEEremainder  0m38.803s  0m38.909s
>> atan2          0m20.100s  0m20.057s
>> pow            0m14.096s  0m19.938s
>> hypot          0m5.136s   0m5.122s
>>
>>
>> Switching on -O3 optimization can damage the precision of those methods.
>> Nonetheless, it's possible to avoid that side effect and still get the huge
>> benefits of -O3 on PPC64 if -fno-expensive-optimizations is passed in
>> addition to the -O3 optimization flag.
>>
>> In that sense the following change is proposed to resolve the issue:
>>
>> diff -r 81eb4bd34611 make/lib/CoreLibraries.gmk
>> --- a/make/lib/CoreLibraries.gmk Wed Nov 09 13:37:19 2016 +0100
>> +++ b/make/lib/CoreLibraries.gmk Wed Nov 16 19:11:11 2016 -0500
>> @@ -33,10 +33,16 @@
>> # libfdlibm is statically linked with libjava below and not delivered into the
>> # product on its own.
>>
>> -BUILD_LIBFDLIBM_OPTIMIZATION := HIGH
>> +BUILD_LIBFDLIBM_OPTIMIZATION := NONE
>>
>> -ifneq ($(OPENJDK_TARGET_OS), solaris)
>> - BUILD_LIBFDLIBM_OPTIMIZATION := NONE
>> +ifeq ($(OPENJDK_TARGET_OS), solaris)
>> + BUILD_LIBFDLIBM_OPTIMIZATION := HIGH
>> +endif
>> +
>> +ifeq ($(OPENJDK_TARGET_OS), linux)
>> + ifeq ($(OPENJDK_TARGET_CPU_ARCH), ppc)
>> + BUILD_LIBFDLIBM_OPTIMIZATION := HIGH
>> + endif
>> endif
>>
>> LIBFDLIBM_SRC := $(JDK_TOPDIR)/src/java.base/share/native/libfdlibm
>> @@ -51,6 +57,7 @@
>> CFLAGS := $(CFLAGS_JDKLIB) $(LIBFDLIBM_CFLAGS), \
>> CFLAGS_windows_debug := -DLOGGING, \
>> CFLAGS_aix := -qfloat=nomaf, \
>> + CFLAGS_linux_ppc := -fno-expensive-optimizations, \
>> DISABLED_WARNINGS_gcc := sign-compare, \
>> DISABLED_WARNINGS_microsoft := 4146 4244 4018, \
>> ARFLAGS := $(ARFLAGS), \
>>
>>
>> diff -r 2a1f97c0ad3d make/common/NativeCompilation.gmk
>> --- a/make/common/NativeCompilation.gmk Wed Nov 09 15:32:39 2016 +0100
>> +++ b/make/common/NativeCompilation.gmk Wed Nov 16 19:08:06 2016 -0500
>> @@ -569,16 +569,19 @@
>> $1_ALL_OBJS := $$(sort $$($1_EXPECTED_OBJS) $$($1_EXTRA_OBJECT_FILES))
>>
>> # Pickup extra OPENJDK_TARGET_OS_TYPE and/or OPENJDK_TARGET_OS dependent variables for CFLAGS.
>> - $1_EXTRA_CFLAGS:=$$($1_CFLAGS_$(OPENJDK_TARGET_OS_TYPE)) $$($1_CFLAGS_$(OPENJDK_TARGET_OS))
>> + $1_EXTRA_CFLAGS:=$$($1_CFLAGS_$(OPENJDK_TARGET_OS_TYPE)) $$($1_CFLAGS_$(OPENJDK_TARGET_OS)) \
>> + $$($1_CFLAGS_$(OPENJDK_TARGET_OS)_$(OPENJDK_TARGET_CPU_ARCH))
>> ifneq ($(DEBUG_LEVEL),release)
>> # Pickup extra debug dependent variables for CFLAGS
>> $1_EXTRA_CFLAGS+=$$($1_CFLAGS_debug)
>> $1_EXTRA_CFLAGS+=$$($1_CFLAGS_$(OPENJDK_TARGET_OS_TYPE)_debug)
>> $1_EXTRA_CFLAGS+=$$($1_CFLAGS_$(OPENJDK_TARGET_OS)_debug)
>> + $1_EXTRA_CFLAGS+=$$($1_CFLAGS_$(OPENJDK_TARGET_OS)_$(OPENJDK_TARGET_CPU_ARCH)_debug)
>> else
>> $1_EXTRA_CFLAGS+=$$($1_CFLAGS_release)
>> $1_EXTRA_CFLAGS+=$$($1_CFLAGS_$(OPENJDK_TARGET_OS_TYPE)_release)
>> $1_EXTRA_CFLAGS+=$$($1_CFLAGS_$(OPENJDK_TARGET_OS)_release)
>> + $1_EXTRA_CFLAGS+=$$($1_CFLAGS_$(OPENJDK_TARGET_OS)_$(OPENJDK_TARGET_CPU_ARCH)_release)
>> endif
>>
>> # Pickup extra OPENJDK_TARGET_OS_TYPE and/or OPENJDK_TARGET_OS dependent variables for CXXFLAGS.
>>
>>
>> After enabling the optimization it's possible to gain up to 3x in
>> performance for the aforementioned methods without losing precision:
>>
>>                StrictMath, original              StrictMath, optimized
>>                ================================  ================================
>> sin            1.7136493465700542     1m41.184s  1.7136493465700542     0m33.895s
>> cos            0.1709843554185943     1m41.200s  0.1709843554185943     0m33.884s
>> tan            -5.5500322522995315E7  1m46.976s  -5.5500322522995315E7  0m36.461s
>> asin           NaN                    0m4.543s   NaN                    0m3.175s
>> acos           NaN                    0m4.525s   NaN                    0m3.211s
>> atan           1.5707961389886132E8   0m12.896s  1.5707961389886132E8   0m7.100s
>> exp            Infinity               0m4.570s   Infinity               0m3.187s
>> log            1.7420680845245087E9   0m14.239s  1.7420680845245087E9   0m7.170s
>> log10          7.565705562087342E8    0m20.236s  7.565705562087342E8    0m9.610s
>> sqrt           6.66666671666567E11    0m0.981s   6.66666671666567E11    0m0.948s
>> cbrt           3.481191648389617E10   0m10.808s  3.481191648389617E10   0m10.786s
>> sinh           Infinity               0m4.433s   Infinity               0m3.179s
>> cosh           Infinity               0m4.478s   Infinity               0m3.174s
>> tanh           9.999999971990079E7    0m3.353s   9.999999971990079E7    0m3.208s
>> expm1          Infinity               0m4.094s   Infinity               0m3.185s
>> log1p          1.7420681029451895E9   0m13.527s  1.7420681029451895E9   0m8.756s
>> IEEEremainder  502000.0               0m38.909s  502000.0               0m14.055s
>> atan2          1.570453905253704E8    0m20.057s  1.570453905253704E8    0m10.510s
>> pow            Infinity               0m19.938s  Infinity               0m20.204s
>> hypot          5.000000099033372E15   0m5.122s   5.000000099033372E15   0m5.130s
>>
>>
>> I believe that, since FC has passed but FEC has not, the change can be
>> pushed, after due scrutiny and review, if a special exception approval is
>> granted. Once it is in 9, I'll request the downport to 8.
>>
>> Could I open a bug to address that issue?
>>
>> Thank you very much.
>>
>>
>> Regards,
>> Gustavo
>>
>> [1] http://hg.openjdk.java.net/jdk9/hs/jdk/file/81eb4bd34611/make/lib/CoreLibraries.gmk#l39
>> [2] https://github.com/gromero/strictmath (comparison script used to get the results)
>>