[aarch64-port-dev ] RFR: C2: Canonicalize (x & 16 == 16) [Was: AARCH64 optimization: using TBZ instruction for bit check]
Vladimir Kozlov
vladimir.kozlov at oracle.com
Fri Jun 19 18:36:31 UTC 2020
Nice optimization.
I don't think we should turn it off on any machine. In a real application you will not see tight loops that consist only
of such a branch. On the other hand, reducing code size should help in all cases.
Would be nice to know if any Java benchmark is affected.
I will try to run our set of benchmarks with these changes.
Regards,
Vladimir K
On 6/19/20 10:07 AM, Andrew Haley wrote:
> Hi,
>
> On 19/06/2020 17:49, Boris Ulasevich wrote:
>> I added the expression canonicalization in the BoolNode::Ideal method:
>> http://cr.openjdk.java.net/~bulasevich/8247408/webrev.02b
>>
>> The change reduces the number of generated machine instructions on all
>> of the ARM/x86/PPC architectures. The benchmark shows positive results on
>> ARM64 and ARM32 with the given change.
>>
>> On x86, benchmark performance improves by 1% to 13% depending on the
>> CPU generation, except on machines affected by the Intel erratum
>> (JDK-8234160). The maximum decrease observed is -11%. It does not look
>> like a problem with the proposed benchmark, though, but rather like an
>> issue with the erratum mitigation.
>>
>> On PowerPC, the result of the micro-benchmark is also positive. I changed
>> the micro-benchmark to make it a little bulkier so that we don't hit the
>> limitations of architectures with a less elaborate branch prediction
>> mechanism. The original application performance does not change on PowerPC.
>
> Fantastic work, thanks! You've done a remarkably thorough job. It's
> slightly unfortunate that one of the targets regresses. If there had
> been no regressions, I'd approve this straight away.
>
> Forwarding to hotspot-compiler-dev for more comments.
>
> VladimirK, what do you think? I guess we could turn this off on the
> machines affected by JDK-8234160. Should we?
>
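For readers following the thread, the canonicalization under discussion can be illustrated with a minimal Java sketch. It assumes, going by the subject line, that the rewrite replaces a single-bit test of the form (x & 16) == 16 with the equivalent (x & 16) != 0. The class and method names below are hypothetical and are not taken from the webrev; the sketch only checks that the two forms agree for single-bit masks and does not reproduce the C2 change itself.

```java
public class BitCheckEquivalence {
    // Original form: compare the masked value against the mask itself.
    public static boolean original(int x, int mask) {
        return (x & mask) == mask;
    }

    // Assumed canonical form: compare the masked value against zero.
    public static boolean canonical(int x, int mask) {
        return (x & mask) != 0;
    }

    // True when exactly one bit is set in the mask.
    public static boolean isSingleBit(int mask) {
        return mask != 0 && (mask & (mask - 1)) == 0;
    }

    public static void main(String[] args) {
        // Check every single-bit mask against a range of sample values.
        for (int bit = 0; bit < 32; bit++) {
            int mask = 1 << bit;
            for (int x = -100_000; x <= 100_000; x++) {
                if (original(x, mask) != canonical(x, mask)) {
                    throw new AssertionError("mismatch: x=" + x + " mask=" + mask);
                }
            }
        }
        System.out.println("both forms agree for all single-bit masks tested");
    }
}
```

The equivalence holds only when the mask has exactly one bit set. With a mask of 3 and x = 1, for example, the first form is false and the second is true, so any such rewrite has to be guarded by a power-of-two check on the constant.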