RFR: 8210416: [linux] Poor StrictMath performance due to non-optimized compilation
Gustavo Romero
gromero at linux.vnet.ibm.com
Fri Sep 14 13:28:28 UTC 2018
Hi Goetz and Severin,
On 09/14/2018 07:12 AM, Lindenmaier, Goetz wrote:
> Hi,
>
> We are only talking about reducing O3 to O2 for the compilation
> step of fdlibm, right?
Yes.
> As I know, this is being replaced by Java coding, anyways.
> Therefore I don't have a strong opinion for any of the
> choices.
I totally agree with Goetz considering OpenJDK 12 and 11u. I was actually more
concerned about a possible future backport of that change, for instance: it gets
into 8u, but Java implementations never get. Maybe I forgot to mention my
concern about a 8u backport earlier.
Severin, anyway, as Goetz is also fine with either option, I'm fine with what
you and Andrew decide as the best option.
Thanks bearing with the discussion on ppc64.
Best regards,
Gustavo
> Best regards,
> Goetz.
>
>> -----Original Message-----
>> From: core-libs-dev <core-libs-dev-bounces at openjdk.java.net> On Behalf
>> Of Gustavo Romero
>> Sent: Freitag, 14. September 2018 01:30
>> To: Severin Gehwolf <sgehwolf at redhat.com>; David Holmes
>> <david.holmes at oracle.com>; Erik Joelsson <erik.joelsson at oracle.com>;
>> build-dev <build-dev at openjdk.java.net>; core-libs-dev <core-libs-
>> dev at openjdk.java.net>
>> Subject: Re: RFR: 8210416: [linux] Poor StrictMath performance due to non-
>> optimized compilation
>>
>> Hello Severin,
>>
>> On 09/12/2018 04:48 AM, Severin Gehwolf wrote:
>>> Hi David,
>>>
>>> On Wed, 2018-09-12 at 13:57 +1000, David Holmes wrote:
>>>> Hi Severin,
>>>>
>>>> Sorry I'm a bit confused now.
>>>>
>>>> Originally for ppc/s390x/aarch64 the optimization level for fdlibm was
>>>> HIGH. But now it will be LOW due to:
>>>>
>>>> 45 ifneq ($(FDLIBM_CFLAGS), )
>>>> 46 BUILD_LIBFDLIBM_OPTIMIZATION := LOW
>>>>
>>>> as those platforms will use gcc/clang with support for -ffp-contract and
>>>> so FDLIBM_CFLAGS will not be empty.
>>>>
>>>> ??
>>>
>>> Correct. That was done based on Andrew Haley's comment here:
>>> http://mail.openjdk.java.net/pipermail/build-dev/2018-
>> September/023160.html
>>>
>>> I've done some measurements with -O2 and -O3. -O2 is sufficient for a
>>> 2.75 speedup for sin/cos on ppc64le as compared to the -O0 baseline. On
>>> the other hand the speedup of -O3 as compared to -O2 is only 1.05 for
>>> sin/cos on ppc64le.
>>>
>>> Since the difference between them were not huge, I've used -O2, a.k.a
>>> LOW. See also:
>>> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-
>> 8210416/microbenchmarks_results/
>>>
>>> To me it seems -O2 is sufficiently good. Note that HIGH == -O3 and LOW
>>> == -O2 on gcc arches. Does that clear things up?
>>>
>>> FWIW, I'm happy to change it back to HIGH if there are concerns. See
>>> also Gustavo's reply here for some more data points:
>>> http://mail.openjdk.java.net/pipermail/build-dev/2018-
>> September/023193.html
>>
>> I'm wondering why the possible issue of using -O3 on fdlibm would not affect
>> other parts of the JVM code (like the Opto code in libjvm.so) that afaics
>> are also built using -O3. Also, the gap of 1.05 for sin/cos, for instance,
>> still sounds like a regression to me.
>>
>> Maybe a better approach to this would be to figure out a way to test -O3 and
>> -O2 with the real world case in which -O3 can have a bad impact. I'm not
>> sure if SPECjbb would qualify for that.
>>
>>
>> Best regards,
>> Gustavo
>>
>>> Thanks,
>>> Severin
>>>
>>>> David
>>>>
>>>> On 12/09/2018 2:14 AM, Severin Gehwolf wrote:
>>>>> Hi Erik,
>>>>>
>>>>> Thanks for the review!
>>>>>
>>>>> On Tue, 2018-09-11 at 08:57 -0700, Erik Joelsson wrote:
>>>>>> Hello Severin,
>>>>>>
>>>>>> Even if using the macro, I still think you need to add a condition on
>>>>>> the compiler types where the switch can be reasonably expected to
>> exist.
>>>>>
>>>>> How about this?
>>>>> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-
>> 8210416/webrev.05/
>>>>>
>>>>> Thanks,
>>>>> Severin
>>>>>
>>>>>> On 2018-09-11 05:02, Severin Gehwolf wrote:
>>>>>>> On Mon, 2018-09-10 at 09:29 -0700, Erik Joelsson wrote:
>>>>>>>> I see. I was not aware of that issue, but we clearly need to file a bug
>>>>>>>> for it and fix it. In this case I think it's fine to us the macro however.
>>>>>>>
>>>>>>> OK. Update webrev, which now uses
>> FLAGS_COMPILER_CHECK_ARGUMENTS.
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-
>> 8210416/webrev.04/
>>>>>>>
>>>>>>> Micro-benchmark results from running [1] for x86_64 and ppc64le are
>>>>>>> here (-O2 is sufficient it seems):
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-
>> 8210416/microbenchmarks_results/
>>>>>>>
>>>>>>> More thoughts?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Severin
>>>>>>>
>>>>>>> [1] https://github.com/gromero/strictmath/
>>>>>>>
>>>>>>
>>>>>>
>>>
>
More information about the build-dev
mailing list