RFR 8249893: AARCH64: optimize the construction of the value from the bits of the other two

Andrew Haley aph at redhat.com
Tue Aug 25 14:55:40 UTC 2020


On 25/08/2020 10:47, Boris Ulasevich wrote:
> Ok. Can you please check that my patch [1] has been applied
> and built correctly. With my change I see this picture:
> 
> ....[Hottest Region 2]...........................................
> c2, level 4, org.openjdk.generated.Rotates_bfm_jmhTest::bfm_avgt_jmhStub,
> 
>              0x0000ffff84584dac:   add    x11, x14, #0x94
> 
>              0x0000ffff84584db0:   stp    x21, x19, [sp]
>              0x0000ffff84584db4:   stp    x20, x14, [sp, #16]
>              0x0000ffff84584db8:   stp    x15, x10, [sp, #32]
>              0x0000ffff84584dbc:   str    x11, [sp, #48]
>              0x0000ffff84584dc0:   b    0x0000ffff84584dd8
>              0x0000ffff84584dc4:   nop
>              0x0000ffff84584dc8:   nop
>              0x0000ffff84584dcc:   nop
>    3.64%  ↗  0x0000ffff84584dd0:   str    x19, [sp, #16]
>    0.07%  │  0x0000ffff84584dd4:   mov    x16, x29
>           │  0x0000ffff84584dd8:   ldr    w10, [x16, #12] ;*invokestatic bfm
>    3.92%  │  0x0000ffff84584ddc:   ldr    w12, [x16, #24]
>    4.69%  │  0x0000ffff84584de0:   and    w2, w10, #0xff
>    0.03%  │  0x0000ffff84584de4:   mov    x29, x16
>    0.02%  │  0x0000ffff84584de8:   bfi    x2, x12, #8, #8     ;*ior  {reexecute=0 rethrow=0 return_oop=0}
>           │                                                   ; - 

My apologies, I must have messed the patch up. I rebuilt cleanly. One odd thing,
though, is that the optimization only matches some forms of the expression, and
not necessarily the most common ones.

Good:

    @Benchmark
    public static int bfm(Result r) {
        return (r.a & 0xFF) | ((r.b & 0xFF) << 8);
    }

  8.13%  │  0x0000fffface550f0:   and    w2, w11, #0xff
  0.69%  │  0x0000fffface550f4:   bfi    x2, x10, #8, #8             ;*ior {reexecute=0 rethrow=0 return_oop=0}
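
For readers less familiar with the instruction, here is a rough Java model of
what the and/bfi pair above computes. The class BfiModel and the helper bfi()
are hypothetical names made up for this illustration, and the model works on
32-bit ints even though the generated code uses x registers; it is a sketch of
the bitfield-insert semantics, not the code C2 runs.

```java
class BfiModel {
    // Hypothetical helper: copy the low `width` bits of src into dst
    // starting at bit `lsb`, leaving every other bit of dst unchanged.
    // Assumes 0 < width and lsb + width < 32.
    static int bfi(int dst, int src, int lsb, int width) {
        int mask = ((1 << width) - 1) << lsb;
        return (dst & ~mask) | ((src << lsb) & mask);
    }

    public static void main(String[] args) {
        int a = 0x1234;
        int b = 0xABCD;

        // and w2, w11, #0xff ; bfi x2, x10, #8, #8
        int viaBfi = bfi(a & 0xFF, b, 8, 8);

        // The Java expression from the bfm benchmark.
        int viaJava = (a & 0xFF) | ((b & 0xFF) << 8);

        System.out.println(Integer.toHexString(viaBfi) + " " + Integer.toHexString(viaJava));
    }
}
```

Both values come out the same because the bitfield insert performs the mask of
r.b and the shift by 8 in one step.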

Not so good:

    @Benchmark
    public static int shift_bfm(Result r) {
        return ((r.a << 24 >>> 24)  | (r.b << 24 >>> 16));
    }

  8.56%  │  0x0000ffff88e50e70:   lsl    w12, w11, #24
         │  0x0000ffff88e50e74:   and    w10, w10, #0xff
  8.59%  │  0x0000ffff88e50e78:   orr    w2, w10, w12, lsr #16       ;*ior {reexecute=0 rethrow=0 return_oop=0}

    @Benchmark
    public static int shift_sbfm(Result r) {
        return ((r.a << 24 >>> 24)  | (r.b << 24 >> 16));
    }

  9.40%  │  0x0000ffff84e51070:   lsl    w12, w11, #24
  0.12%  │  0x0000ffff84e51074:   and    w10, w10, #0xff
  8.06%  │  0x0000ffff84e51078:   orr    w2, w10, w12, asr #16       ;*ior {reexecute=0 rethrow=0 return_oop=0}

Does this matter? Bits.java uses the (a & 0xff) | ((b & 0xFF) << 8) idiom, so maybe
we don't care about the shift-left-followed-by-shift-right form. But it feels
a bit unsatisfactory to me to miss it.
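
As a quick sanity check on the forms compared above, the following standalone
sketch evaluates all three expressions on plain ints. The class name
BitfieldForms and the method names are made up for this illustration; the
benchmarks themselves read the fields of a Result object, which is not
reproduced here.

```java
class BitfieldForms {
    // Mask-and-shift form, as in the bfm benchmark.
    static int maskForm(int a, int b) {
        return (a & 0xFF) | ((b & 0xFF) << 8);
    }

    // Shift left, then logical shift right, as in shift_bfm.
    static int shiftForm(int a, int b) {
        return (a << 24 >>> 24) | (b << 24 >>> 16);
    }

    // Shift left, then arithmetic shift right, as in shift_sbfm.
    static int signedShiftForm(int a, int b) {
        return (a << 24 >>> 24) | (b << 24 >> 16);
    }

    public static void main(String[] args) {
        int[][] inputs = { {0x1234, 0x56}, {0x1234, 0x80}, {-1, -1} };
        for (int[] in : inputs) {
            System.out.printf("a=%08x b=%08x mask=%08x shift=%08x signed=%08x%n",
                    in[0], in[1],
                    maskForm(in[0], in[1]),
                    shiftForm(in[0], in[1]),
                    signedShiftForm(in[0], in[1]));
        }
    }
}
```

The mask form and the logical-shift form agree for every input, so in principle
both could map to the same and/bfi pair. The arithmetic-shift form agrees only
while bit 7 of b is clear; once it is set, the sign bits propagate into the
upper half of the result, so that form needs a sign-extending extract rather
than a plain insert.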

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671



More information about the hotspot-compiler-dev mailing list