[jdk17u-dev] RFR: 8265263: AArch64: Combine vneg with right shift count [v2]
Hao Sun
haosun at openjdk.org
Thu Oct 20 12:46:08 UTC 2022
On Thu, 20 Oct 2022 11:55:24 GMT, Dmitry Chuyko <dchuyko at openjdk.org> wrote:
>> This is a performance improvement for AArch64. There are several differences from the original change.
>>
>> https://bugs.openjdk.org/browse/JDK-8267356 (Vector API SVE codegen support) is not in 17u, so `UseSVE == 0` parts in predicates are missing/excluded.
>>
>> https://bugs.openjdk.org/browse/JDK-8288445 (C2 compilation fails) is a subsequent bugfix already backported in 17u, so some `immI` arguments in rules became `immI_positive`.
>>
>> https://bugs.openjdk.org/browse/JDK-8277239 (SIGSEGV in vrshift_reg_maskedNode::emit) is also related to Vector API and is not in 17u, so `!n->as_ShiftV()->is_var_shift()` is replaced by `VectorNode::is_vshift_cnt(n->in(2))`. This substitution may raise doubts.
>>
>> Testing: jtreg test/hotspot/jtreg/compiler, tier1, tier2 on aarch64.
>>
>> Performance improvements in the added benchmark VectorShiftRight on Graviton 2 for default size=1024 correspond to the original review:
>>
>>
>> rShiftByte 16%
>> rShiftInt 27%
>> rShiftLong 16%
>> rShiftShort 20%
>> urShiftByte 0%
>> urShiftChar 20%
>> urShiftInt 27%
>> urShiftLong 16%
>
> Dmitry Chuyko has updated the pull request incrementally with one additional commit since the last revision:
>
> No SVE checks in vsrcnt8B, vsrcnt16B
I noticed that the AD file generated from the M4 file differs from the AD file provided in this PR.
I think we should eliminate this difference. Regenerating the file and diffing it shows:
~/jdk/src/hotspot/cpu/aarch64$ m4 aarch64_neon_ad.m4 > aarch64_neon.ad
~/jdk/src/hotspot/cpu/aarch64$ git diff
src/hotspot/cpu/aarch64/aarch64_neon.ad | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/src/hotspot/cpu/aarch64/aarch64_neon.ad b/src/hotspot/cpu/aarch64/aarch64_neon.ad
index db50e08fffd..d43c8d31b78 100644
--- a/src/hotspot/cpu/aarch64/aarch64_neon.ad
+++ b/src/hotspot/cpu/aarch64/aarch64_neon.ad
@@ -4250,8 +4250,8 @@ instruct vxor16B(vecX dst, vecX src1, vecX src2)
// on vsra8B rule for more details.
instruct vslcnt8B(vecD dst, iRegIorL2I cnt) %{
- predicate(n->as_Vector()->length_in_bytes() == 4 ||
- n->as_Vector()->length_in_bytes() == 8);
+ predicate((n->as_Vector()->length_in_bytes() == 4 ||
+ n->as_Vector()->length_in_bytes() == 8));
match(Set dst (LShiftCntV cnt));
ins_cost(INSN_COST);
format %{ "dup $dst, $cnt\t# shift count vector (8B)" %}
@@ -4273,8 +4273,8 @@ instruct vslcnt16B(vecX dst, iRegIorL2I cnt) %{
%}
instruct vsrcnt8B(vecD dst, iRegIorL2I cnt) %{
- predicate(n->as_Vector()->length_in_bytes() == 4 ||
- n->as_Vector()->length_in_bytes() == 8);
+ predicate((n->as_Vector()->length_in_bytes() == 4 ||
+ n->as_Vector()->length_in_bytes() == 8));
match(Set dst (RShiftCntV cnt));
ins_cost(INSN_COST * 2);
format %{ "negw rscratch1, $cnt\t"
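
For readers unfamiliar with why vsrcnt8B emits a "negw" before the "dup": NEON has no vector right shift by a register count, so a right shift is performed as a left shift by the negated count. The Java sketch below models that idea on 32-bit lanes. It is only an illustration under simplifying assumptions: the class and method names are hypothetical, the lane model ignores that the real instruction reads the count from the low byte of each lane, and it is not the code HotSpot generates.

```java
public class NegatedShiftCountSketch {
    // Simplified model of one 32-bit lane of a signed register shift:
    // a non-negative count shifts left, a negative count shifts right
    // arithmetically. (Assumption: counts are plain ints, not low bytes.)
    static int sshlLane(int value, int count) {
        if (count >= 32) {
            return 0;                 // everything shifted out to the left
        }
        if (count >= 0) {
            return value << count;
        }
        if (count <= -32) {
            return value >> 31;       // only the sign bit remains
        }
        return value >> -count;
    }

    // Right shift of every lane, expressed as a left shift by the negated
    // count. The negation models "negw" on the scalar count, and reusing
    // the same negated value for every lane models "dup".
    static int[] shiftRightViaNegatedCount(int[] src, int count) {
        int negated = -count;
        int[] dst = new int[src.length];
        for (int i = 0; i < src.length; i++) {
            dst[i] = sshlLane(src[i], negated);
        }
        return dst;
    }

    public static void main(String[] args) {
        int[] src = {-8, 7, 1024, Integer.MIN_VALUE};
        int[] viaNeg = shiftRightViaNegatedCount(src, 3);
        for (int i = 0; i < src.length; i++) {
            System.out.println(src[i] + " >> 3: model=" + viaNeg[i]
                    + ", Java=" + (src[i] >> 3));
        }
    }
}
```

In this model the negation is done once on the scalar count rather than once per shift, which is the kind of saving the PR title refers to.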
-------------
PR: https://git.openjdk.org/jdk17u-dev/pull/811