[jdk16] RFR: 8260585: AArch64: Wrong code generated for shifting right and accumulating four unsigned short integers [v3]

Dong Bo dongbo at openjdk.java.net
Tue Feb 2 08:21:48 UTC 2021


On Mon, 1 Feb 2021 07:48:35 GMT, Ningsheng Jian <njian at openjdk.org> wrote:

>> Dong Bo has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   make empty ins_encode when shift >= 16 (chars)
>
> Looks good to me.

Hi, Andrew.

`ssra` is not generated in the .8B form because, when the loop size is 16, the vector length is 4 rather than 8.
`vsraa8B_imm` only has `predicate(n->as_Vector()->length() == 8)`, so these nodes are not matched.
We can fix this with the following change:
 instruct vsraa8B_imm(vecD dst, vecD src, immI shift) %{
-  predicate(n->as_Vector()->length() == 8);
+  predicate(n->as_Vector()->length() == 4 || n->as_Vector()->length() == 8);
   match(Set dst (AddVB dst (RShiftVB src (RShiftCntV shift))));
   ins_cost(INSN_COST);
   format %{ "ssra    $dst, $src, $shift\t# vector (8B)" %}
@@ -18782,7 +18782,7 @@ instruct vsraa16B_imm(vecX dst, vecX src, immI shift) %{
 %}

 instruct vsraa4S_imm(vecD dst, vecD src, immI shift) %{
-  predicate(n->as_Vector()->length() == 4);
+  predicate(n->as_Vector()->length() == 2 || n->as_Vector()->length() == 4);
   match(Set dst (AddVS dst (RShiftVS src (RShiftCntV shift))));
   ins_cost(INSN_COST);
   format %{ "ssra    $dst, $src, $shift\t# vector (4H)" %}
@@ -18849,7 +18849,7 @@ instruct vsraa2L_imm(vecX dst, vecX src, immI shift) %{
 %}

 instruct vsrla8B_imm(vecD dst, vecD src, immI shift) %{
-  predicate(n->as_Vector()->length() == 8);
+  predicate(n->as_Vector()->length() == 4 || n->as_Vector()->length() == 8);
   match(Set dst (AddVB dst (URShiftVB src (RShiftCntV shift))));
   ins_cost(INSN_COST);
   format %{ "usra    $dst, $src, $shift\t# vector (8B)" %}
@@ -18879,7 +18879,7 @@ instruct vsrla16B_imm(vecX dst, vecX src, immI shift) %{
 %}

 instruct vsrla4S_imm(vecD dst, vecD src, immI shift) %{
-  predicate(n->as_Vector()->length() == 4);
+  predicate(n->as_Vector()->length() == 2 || n->as_Vector()->length() == 4);
   match(Set dst (AddVS dst (URShiftVS src (RShiftCntV shift))));
   ins_cost(INSN_COST);
   format %{ "usra    $dst, $src, $shift\t# vector (4H)" %}
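
To illustrate what the predicate change above does for `vsraa8B_imm` and `vsrla8B_imm`, here is a minimal Java sketch. It is only a scalar model and not HotSpot code: the class and method names are hypothetical, the two `matches*` methods merely mirror the old and new predicate expressions, and the lane functions assume the usual per-lane meaning of `ssra`/`usra` on bytes (dst += src >> shift, wrapping to 8 bits) with 0 < shift < 8.

```java
public class SraPredicateSketch {
    // Hypothetical mirror of the predicate before the change: only 8 lanes match.
    static boolean matchesBefore(int lanes) {
        return lanes == 8;
    }

    // Hypothetical mirror of the predicate after the change: 4 or 8 lanes match.
    static boolean matchesAfter(int lanes) {
        return lanes == 4 || lanes == 8;
    }

    // Scalar model of signed shift right and accumulate on byte lanes.
    // Assumes dst and src have the same length and 0 < shift < 8.
    static byte[] ssraBytes(byte[] dst, byte[] src, int shift) {
        byte[] out = new byte[dst.length];
        for (int i = 0; i < dst.length; i++) {
            // The arithmetic shift keeps the sign; the cast wraps the sum to 8 bits.
            out[i] = (byte) (dst[i] + (src[i] >> shift));
        }
        return out;
    }

    // Scalar model of unsigned shift right and accumulate on byte lanes.
    // Assumes dst and src have the same length and 0 < shift < 8.
    static byte[] usraBytes(byte[] dst, byte[] src, int shift) {
        byte[] out = new byte[dst.length];
        for (int i = 0; i < dst.length; i++) {
            // Mask to treat the lane as unsigned before the logical shift.
            out[i] = (byte) (dst[i] + ((src[i] & 0xFF) >>> shift));
        }
        return out;
    }

    public static void main(String[] args) {
        for (int lanes : new int[] {4, 8, 16}) {
            System.out.println(lanes + " lanes: before=" + matchesBefore(lanes)
                    + " after=" + matchesAfter(lanes));
        }
    }
}
```

In this model a 4-lane byte vector is rejected by the old predicate and accepted by the new one, while 16 lanes are still left to the 16B rules.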

What do you think about making this change as part of this PR?

-------------

PR: https://git.openjdk.java.net/jdk16/pull/136


More information about the hotspot-compiler-dev mailing list