[jdk16] RFR: 8260585: AArch64: Wrong code generated for shifting right and accumulating four unsigned short integers [v3]
Dong Bo
dongbo at openjdk.java.net
Tue Feb 2 08:21:48 UTC 2021
On Mon, 1 Feb 2021 07:48:35 GMT, Ningsheng Jian <njian at openjdk.org> wrote:
>> Dong Bo has updated the pull request incrementally with one additional commit since the last revision:
>>
>> make empty ins_encode when shift >= 16 (chars)
>
> Looks good to me.
Hi, Andrew.
The reason `ssra` is not generated in the .8B form is that, when the loop size is 16, the vector length is 4 rather than 8.
`vsraa8B_imm` only has `predicate(n->as_Vector()->length() == 8)`, so a vector of length 4 is not matched.
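For reference, here is a rough scalar sketch of the per-element computation that the `AddVB (RShiftVB ...)` and `AddVS (URShiftVS ...)` match rules describe. It is only an illustration under assumptions: the class name, method names and 4-element arrays are hypothetical and not part of the patch, and whether C2 actually vectorizes such a loop depends on the JIT and platform.

```java
import java.util.Arrays;

// Hypothetical illustration; none of these names come from the patch.
class ShiftAccumulateSketch {
    // Signed shift right and accumulate on bytes: the scalar shape of
    // AddVB dst (RShiftVB src shift), which the ssra rules are meant to cover.
    static void sraAccumulate(byte[] dst, byte[] src, int shift) {
        for (int i = 0; i < dst.length; i++) {
            dst[i] = (byte) (dst[i] + (src[i] >> shift));
        }
    }

    // Unsigned shift right and accumulate on chars: the scalar shape of
    // AddVS dst (URShiftVS src shift), which the usra rules are meant to cover.
    static void srlAccumulate(char[] dst, char[] src, int shift) {
        for (int i = 0; i < dst.length; i++) {
            dst[i] = (char) (dst[i] + (src[i] >>> shift));
        }
    }

    public static void main(String[] args) {
        byte[] d = {1, 2, 3, 4};   // four elements, i.e. vector length 4
        byte[] s = {-8, 8, 16, -16};
        sraAccumulate(d, s, 2);
        System.out.println(Arrays.toString(d)); // prints [-1, 4, 7, 0]

        char[] cd = {1, 2, 3, 4};
        char[] cs = {0xFFFF, 16, 32, 64};
        srlAccumulate(cd, cs, 16); // a char shifted right by 16 is 0, so cd is unchanged
        System.out.println((int) cd[0]); // prints 1
    }
}
```

The last call shows why a shift of 16 or more on chars adds nothing to the accumulator, which is the case the earlier commit handles with an empty `ins_encode`.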
We should fix this with the following code:
instruct vsraa8B_imm(vecD dst, vecD src, immI shift) %{
- predicate(n->as_Vector()->length() == 8);
+ predicate(n->as_Vector()->length() == 4 || n->as_Vector()->length() == 8);
match(Set dst (AddVB dst (RShiftVB src (RShiftCntV shift))));
ins_cost(INSN_COST);
format %{ "ssra $dst, $src, $shift\t# vector (8B)" %}
@@ -18782,7 +18782,7 @@ instruct vsraa16B_imm(vecX dst, vecX src, immI shift) %{
%}
instruct vsraa4S_imm(vecD dst, vecD src, immI shift) %{
- predicate(n->as_Vector()->length() == 4);
+ predicate(n->as_Vector()->length() == 2 || n->as_Vector()->length() == 4);
match(Set dst (AddVS dst (RShiftVS src (RShiftCntV shift))));
ins_cost(INSN_COST);
format %{ "ssra $dst, $src, $shift\t# vector (4H)" %}
@@ -18849,7 +18849,7 @@ instruct vsraa2L_imm(vecX dst, vecX src, immI shift) %{
%}
instruct vsrla8B_imm(vecD dst, vecD src, immI shift) %{
- predicate(n->as_Vector()->length() == 8);
+ predicate(n->as_Vector()->length() == 4 || n->as_Vector()->length() == 8);
match(Set dst (AddVB dst (URShiftVB src (RShiftCntV shift))));
ins_cost(INSN_COST);
format %{ "usra $dst, $src, $shift\t# vector (8B)" %}
@@ -18879,7 +18879,7 @@ instruct vsrla16B_imm(vecX dst, vecX src, immI shift) %{
%}
instruct vsrla4S_imm(vecD dst, vecD src, immI shift) %{
- predicate(n->as_Vector()->length() == 4);
+ predicate(n->as_Vector()->length() == 2 || n->as_Vector()->length() == 4);
match(Set dst (AddVS dst (URShiftVS src (RShiftCntV shift))));
ins_cost(INSN_COST);
format %{ "usra $dst, $src, $shift\t# vector (4H)" %}
What do you think about making this change as part of this PR as well?
-------------
PR: https://git.openjdk.java.net/jdk16/pull/136
More information about the hotspot-compiler-dev mailing list