RFR 8249893: AARCH64: optimize the construction of the value from the bits of the other two
Boris Ulasevich
boris.ulasevich at bell-sw.com
Tue Aug 25 17:30:02 UTC 2020
Andrew,
Thanks for looking into this.
I believe masking with a left shift followed by a right shift is not common.
A search through the JDK repository finds no such patterns, while
there are hundreds of mask+lshift expressions.
I implemented a simple is_bitrange_zero() method for counting the
bitranges of sub-expressions: power-of-two masks and left shift only.
We can take more cases into account (careful testing is the main
concern). But for the "r.a << 24 >>> 24" expression in particular,
I think it is worth thinking about canonicalization: "left shift + right
shift" to "mask + left shift" (or maybe the reverse).
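
To illustrate why canonicalization looks feasible, here is a minimal Java
sketch checking that the unsigned "left shift + right shift" form computes
the same value as the "mask + left shift" form. The class and method names
are hypothetical and are not part of the patch; this only compares the
source-level expressions and says nothing about what C2 emits.

```java
// Hypothetical helper, not part of the patch: compares the two source forms.
final class BitfieldForms {
    // "mask + left shift" form, as used in the bfm benchmark.
    static int maskForm(int a, int b) {
        return (a & 0xFF) | ((b & 0xFF) << 8);
    }

    // "left shift + unsigned right shift" form, as in shift_bfm.
    static int shiftForm(int a, int b) {
        return (a << 24 >>> 24) | (b << 24 >>> 16);
    }

    // Signed variant, as in shift_sbfm: the low byte of b is sign-extended,
    // so this one is NOT equivalent to maskForm in general.
    static int shiftFormSigned(int a, int b) {
        return (a << 24 >>> 24) | (b << 24 >> 16);
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, 0x7F, 0x80, 0xFF, 0x1234, 0xABCD, -1, Integer.MIN_VALUE};
        for (int a : samples) {
            for (int b : samples) {
                if (maskForm(a, b) != shiftForm(a, b)) {
                    throw new AssertionError("forms differ for a=" + a + ", b=" + b);
                }
            }
        }
        System.out.println("mask and unsigned shift forms agree on all samples");
    }
}
```

The signed variant differs whenever bit 7 of b is set, so it would need
separate handling (it corresponds to a signed bitfield move rather than an
unsigned one).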
regards,
Boris
On 25.08.2020 17:55, Andrew Haley wrote:
> On 25/08/2020 10:47, Boris Ulasevich wrote:
>> Ok. Can you please check that my patch [1] has been applied
>> and built correctly. With my change I see this picture:
>>
>> ....[Hottest Region 2]...........................................
>> c2, level 4, org.openjdk.generated.Rotates_bfm_jmhTest::bfm_avgt_jmhStub,
>>
>> 0x0000ffff84584dac: add x11, x14, #0x94
>>
>> 0x0000ffff84584db0: stp x21, x19, [sp]
>> 0x0000ffff84584db4: stp x20, x14, [sp, #16]
>> 0x0000ffff84584db8: stp x15, x10, [sp, #32]
>> 0x0000ffff84584dbc: str x11, [sp, #48]
>> 0x0000ffff84584dc0: b 0x0000ffff84584dd8
>> 0x0000ffff84584dc4: nop
>> 0x0000ffff84584dc8: nop
>> 0x0000ffff84584dcc: nop
>> 3.64% ↗ 0x0000ffff84584dd0: str x19, [sp, #16]
>> 0.07% │ 0x0000ffff84584dd4: mov x16, x29
>> │ 0x0000ffff84584dd8: ldr w10, [x16, #12] ;*invokestatic bfm
>> 3.92% │ 0x0000ffff84584ddc: ldr w12, [x16, #24]
>> 4.69% │ 0x0000ffff84584de0: and w2, w10, #0xff
>> 0.03% │ 0x0000ffff84584de4: mov x29, x16
>> 0.02% │ 0x0000ffff84584de8: bfi x2, x12, #8, #8 ;*ior {reexecute=0 rethrow=0 return_oop=0}
>> │ ; -
> My apologies, I must have messed the patch up. I rebuilt cleanly. One odd thing,
> though, is that it only works with some forms, and not necessarily the most
> common ones.
>
> Good:
>
> @Benchmark
> public static int bfm(Result r) {
> return (r.a & 0xFF) | ((r.b & 0xFF) << 8);
> }
>
> 8.13% │ 0x0000fffface550f0: and w2, w11, #0xff
> 0.69% │ 0x0000fffface550f4: bfi x2, x10, #8, #8 ;*ior {reexecute=0 rethrow=0 return_oop=0}
>
> Not so good:
>
> @Benchmark
> public static int shift_bfm(Result r) {
> return ((r.a << 24 >>> 24) | (r.b << 24 >>> 16));
> }
>
> 8.56% │ 0x0000ffff88e50e70: lsl w12, w11, #24
> │ 0x0000ffff88e50e74: and w10, w10, #0xff
> 8.59% │ 0x0000ffff88e50e78: orr w2, w10, w12, lsr #16 ;*ior {reexecute=0 rethrow=0 return_oop=0}
>
> @Benchmark
> public static int shift_sbfm(Result r) {
> return ((r.a << 24 >>> 24) | (r.b << 24 >> 16));
> }
>
> 9.40% │ 0x0000ffff84e51070: lsl w12, w11, #24
> 0.12% │ 0x0000ffff84e51074: and w10, w10, #0xff
> 8.06% │ 0x0000ffff84e51078: orr w2, w10, w12, asr #16 ;*ior {reexecute=0 rethrow=0 return_oop=0}
>
> Does this matter? Bits.java uses the (a & 0xff) | ((b & 0xFF) << 8) idiom so maybe
> we don't care about the shift left followed by shift right form. But it feels
> to me a bit unsatisfactory to miss it.