[jdk16] RFR: 8260585: AArch64: Wrong code generated for shifting right and accumulating four unsigned short integers [v3]

Dong Bo dongbo at openjdk.java.net
Sun Jan 31 10:34:40 UTC 2021


On Sat, 30 Jan 2021 05:01:09 GMT, Dong Bo <dongbo at openjdk.org> wrote:

>> Yes, we need regression test for this fix. Or modify existing one to catch it.
>
> Did not run local tests for small loops in JDK-8255949.
> Updated a test for all shift and accumulating operations which can catch this.

> _Mailing list message from [Andrew Haley](mailto:aph at redhat.com) on [hotspot-dev](mailto:hotspot-dev at openjdk.java.net):_
> 
> On 1/30/21 5:07 AM, Dong Bo wrote:
> 
> I don't understand. Looking at this:
> 
> instruct vsrla4S_imm(vecD dst, vecD src, immI shift) %{
>   predicate(n->as_Vector()->length() == 4);
>   match(Set dst (AddVS dst (URShiftVS src (RShiftCntV shift))));
>   ins_cost(INSN_COST);
>   format %{ "usra $dst, $src, $shift\t# vector (4H)" %}
>   ins_encode %{
>     int sh = (int)$shift$$constant;
>     if (sh >= 16) {
>       __ eor(as_FloatRegister($src$$reg), __ T8B,
>              as_FloatRegister($src$$reg),
>              as_FloatRegister($src$$reg));
>     } else {
>       __ usra(as_FloatRegister($dst$$reg), __ T4H,
>               as_FloatRegister($src$$reg), sh);
>     }
>   %}
>   ins_pipe(vshift64_imm);
> %}
> 
> What happens when the shift is >= 16? What happens to src and dst?
> 
> --
> Andrew Haley (he/him)
> Java Platform Lead Engineer
> Red Hat UK Ltd. <https://www.redhat.com>
> https://keybase.io/andrewhaley
> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

That code was wrong: when the shift is `>= 16`, both src and dst should keep the values they had before.
In practice, though, when the shift is `>= 16` the compiler optimizes the URShift to zero.
So `vsrla4S_imm` is never matched for `shift >= 16`, and the wrong `eor` is not generated.
See the assembly code generated for the following test:
# test
public void shiftURightAccumulateChar() {
    for (int i = 0; i < count; i++) {
        charsD[i] = (char) (charsA[i] + (charsB[i] >>> 16));
    }
}
# assembly code, the `shift` is gone, only `move` left
  1.17%  │   0x0000ffff88075348:   ldr  q16, [x14,#16]
         │   0x0000ffff8807534c:   add  x12, x19, x12
         │   0x0000ffff88075350:   str  q16, [x12,#16]
  1.66%  │   0x0000ffff88075354:   ldr  q16, [x14,#32]
         │   0x0000ffff88075358:   str  q16, [x12,#32]
  2.03%  │   0x0000ffff8807535c:   ldr  q16, [x14,#48]
         │   0x0000ffff88075360:   str  q16, [x12,#48]
  1.39%  │   0x0000ffff88075364:   ldr  q16, [x14,#64]
         │   0x0000ffff88075368:   str  q16, [x12,#64]
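
For reference, here is a minimal scalar sketch of why the shift can be folded away for `char` operands. It is not taken from the patch or from the updated test; the class name `ShiftFoldDemo` and its helper methods are hypothetical and only for illustration. A `char` is promoted to an `int` in the range 0..65535 before the shift, so `>>> 16` always yields 0 and the accumulate reduces to a plain copy of the first operand.

```java
// Hypothetical demo class, not part of the JDK sources or tests.
class ShiftFoldDemo {
    // Scalar form of the loop body in the test above, with the shift as a parameter.
    static char shiftURightAccumulate(char a, char b, int shift) {
        return (char) (a + (b >>> shift));
    }

    // Checks every possible char value: a 16-bit unsigned value shifted
    // right by 16 (after promotion to int) must be 0.
    static boolean shiftBy16IsAlwaysZero() {
        for (int b = Character.MIN_VALUE; b <= Character.MAX_VALUE; b++) {
            if (((char) b >>> 16) != 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(shiftBy16IsAlwaysZero());                        // true
        System.out.println((int) shiftURightAccumulate('A', '\uffff', 16)); // 65
        System.out.println((int) shiftURightAccumulate('A', '\uffff', 15)); // 66
    }
}
```

With a shift of 16 the result equals the first operand, which agrees with the load/store-only assembly above. With a shift of 15 the top bit of the second operand still contributes, so that case does need a real shift-and-accumulate.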

-------------

PR: https://git.openjdk.java.net/jdk16/pull/136

