RFR: 8293488: Add EOR3 backend rule for aarch64 SHA3 extension [v5]
Bhavana Kilambi
bkilambi at openjdk.org
Tue Nov 29 13:57:36 UTC 2022
On Tue, 29 Nov 2022 09:41:34 GMT, Nick Gasson <ngasson at openjdk.org> wrote:
>> Bhavana Kilambi has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains six commits:
>>
>> - Resolve merge conflicts with master
>> - Merge branch 'master' into JDK-8293488
>> - Removed svesha3 feature check for eor3
>> - Changed the modifier order preference in JTREG test
>> - Modified JTREG test to include feature constraints
>> - 8293488: Add EOR3 backend rule for aarch64 SHA3 extension
>>
>> Arm ISA v8.2A and v9.0A include SHA3 feature extensions, and one of those
>> SHA3 instructions, "eor3", performs an exclusive OR of three vectors.
>> This is helpful in applications that have multiple, consecutive "eor"
>> operations, which can be reduced by combining them into fewer operations
>> using the "eor3" instruction. For example:
>> eor a, a, b
>> eor a, a, c
>> can be optimized to a single instruction: eor3 a, a, b, c
>>
>> This patch adds backend rules for Neon and SVE2 "eor3" instructions and
>> a micro benchmark to assess the performance gains with this patch.
>> Following are the results of the included micro benchmark on a 128-bit
>> aarch64 machine that supports Neon, SVE2 and SHA3 features -
>>
>> Benchmark             gain
>> TestEor3.test1Int     10.87%
>> TestEor3.test1Long     8.84%
>> TestEor3.test2Int     21.68%
>> TestEor3.test2Long    21.04%
>>
>> The numbers shown are the performance gains from using the Neon "eor3"
>> instruction over the master branch, which uses multiple "eor" instructions
>> instead. Similar gains can be observed with the SVE2 "eor3" version as well,
>> since the "eor3" instruction is unpredicated and the machine under test uses
>> a maximum vector width of 128 bits, which makes the SVE2 code generation
>> very similar to the Neon one.
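
For readers skimming the description above, the equivalence behind the rule can be sketched in a few lines of Python. This is only an illustrative model, not code from the patch: each 128-bit vector register is represented as a plain integer, and the helper names eor, eor3 and check_equivalence are hypothetical.

```python
import random

# Model one 128-bit vector register as a Python integer (an assumption
# made for illustration; the real instructions operate on Neon/SVE2 registers).
MASK_128 = (1 << 128) - 1

def eor(a, b):
    """Two-input exclusive OR, standing in for a single "eor"."""
    return (a ^ b) & MASK_128

def eor3(a, b, c):
    """Three-input exclusive OR, standing in for a single "eor3"."""
    return (a ^ b ^ c) & MASK_128

def check_equivalence(trials=1000, seed=0):
    """Check that two chained eor operations match one eor3 on random inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        a, b, c = (rng.getrandbits(128) for _ in range(3))
        if eor(eor(a, b), c) != eor3(a, b, c):
            return False
    return True

print(check_equivalence())
```

Because XOR is associative, the chained form and the fused form always agree, which is what lets the backend replace two instructions with one.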
>
> test/hotspot/gtest/aarch64/aarch64-asmtest.py line 1043:
>
>> 1041: [str(self.reg[i]) for i in range(1, self.numRegs)]))
>> 1042: def astr(self):
>> 1043: if self._name == "eor3":
>
> Suggestion:
>
> firstArg = 0 if self._name == "eor3" else 1
> formatStr = "%s%s" + ''.join([", %s" for i in range(firstArg, self.numRegs)])
>
>
> And similarly below.
Thank you for the suggestion. I made the suggested changes in the latest patch. Please review.
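
For reference, here is a minimal standalone sketch of what the suggested expression produces. It is not the code from the patch: build_format_str is a hypothetical helper, its parameters stand in for self._name and self.numRegs, and num_regs = 3 is an assumed value chosen only for illustration.

```python
def build_format_str(name, num_regs):
    # Hypothetical free function mirroring the two suggested lines; in the
    # script itself these values are instance attributes.
    # Starting the range at 0 for "eor3" yields one extra ", %s" placeholder,
    # i.e. one more register operand than the other instructions get.
    first_arg = 0 if name == "eor3" else 1
    return "%s%s" + "".join([", %s" for _ in range(first_arg, num_regs)])

# num_regs = 3 is an assumption for this example.
print(build_format_str("eor", 3))
print(build_format_str("eor3", 3))
```

With these assumed inputs, the "eor3" format string carries exactly one more register placeholder than the one for any other name, so the special case is confined to a single expression.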
-------------
PR: https://git.openjdk.org/jdk/pull/10407
More information about the hotspot-dev mailing list