RFR 8249893: AARCH64: optimize the construction of the value from the bits of the other two
Andrew Haley
aph at redhat.com
Tue Aug 25 09:17:12 UTC 2020
On 25/08/2020 09:57, Boris Ulasevich wrote:
> Hi,
>
> On 25.08.2020 11:10, Andrew Haley wrote:
>> Hi,
>>
>> On 23/08/2020 19:20, Boris Ulasevich wrote:
>> >
>> > Please review the updated change to C2 and AArch64 which introduces
>> > a new BitfieldInsert node to replace Or+Shift+And sequence when possible.
>> > Single BFI instruction is emitted for the new node.
>> >
>> > With the current change all the transformation logic is moved out of
>> > aarch64.ad file into the common C2 code.
>> >
>> > http://bugs.openjdk.java.net/browse/JDK-8249893
>> > http://cr.openjdk.java.net/~bulasevich/8249893/webrev.01
>> >
>> > The change in compiler.cpp was done to implicitly ask IGVN to run
>> > the idealization once again after the loop optimization phase.
>> > This extra step is necessary to make the BFI transform happen
>> > only after loop optimization.
>>
>> So here's a strange thing. When I run a simple JMH test
>>
>> @State(Scope.Benchmark)
>> public static class Result {
>> public int a, b;
>> public long x;
>> }
>>
>> @Benchmark
>> public static int bfm(Result r) {
>> return (r.a & 0xFF) | ((r.b & 0xFF) << 8);
>> }
>>
>> I get
>>
>> 0x0000ffff84644df0: ubfiz w12, w11, #8, #8
>> 0x0000ffff84644df4: and w10, w10, #0xff
>> 0x0000ffff84644df8: orr w2, w10, w12 ;*ior {reexecute=0 rethrow=0 return_oop=0}
>> ; - org.openjdk.Rotates::bfm at 19 (line 22)
>> ; - org.openjdk.generated.Rotates_bfm_jmhTest::bfm_avgt_jmhStub at 20 (line 199)
>>
>> instead of
>>
>> 0x0000ffff808554b4: and w10, w10, #0xff
>> 0x0000ffff808554b8: and w12, w12, #0xff
>> 0x0000ffff808554bc: orr w2, w12, w10, lsl #8 ;*ior
>> ; - org.openjdk.Rotates::bfm at 19 (line 22)
>> ; - org.openjdk.generated.Rotates_bfm_jmhTest::bfm_avgt_jmhStub at 20 (line 199)
>>
>> Do you have any ideas why this might be? Thanks.
>>
>
> Both variants are correct, aren't they?
Well, yes. But I thought that the idea was to generate fewer instructions.
> I think the matcher preferred UBFIZ to the OR rule because ins_cost was set to 1.9
> for OR:
> https://hg.openjdk.java.net/jdk/jdk/file/92ddc6fe60eb/src/hotspot/cpu/aarch64/aarch64.ad#l12130
> https://hg.openjdk.java.net/jdk/jdk/file/92ddc6fe60eb/src/hotspot/cpu/aarch64/aarch64.ad#l11675
>
> With my change it would work like this:
>
> 0x0000ffff7c587fe0: and w2, w10, #0xff
> 0x0000ffff7c587fe8: bfi x2, x12, #8, #8
But it didn't. I'm asking you why that is. The first code I showed you was generated
from the JMH test in http://cr.openjdk.java.net/~aph/scratch/, after I applied your patch.
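
For reference, here is a minimal sketch (plain Java, not HotSpot code) of why an
AND followed by a single BFI would compute the same value as the benchmark
expression. The class name and the bfi() helper are hypothetical: bfi() is only a
software model of a bit-field insert, assumed to copy the low `width` bits of the
source into the destination at bit `lsb` and leave the other destination bits alone.

```java
class BfiSketch {
    // The expression from the bfm() benchmark above.
    static int orShiftAnd(int a, int b) {
        return (a & 0xFF) | ((b & 0xFF) << 8);
    }

    // Hypothetical software model of a 32-bit bit-field insert:
    // replace bits [lsb, lsb + width) of dst with the low `width` bits of src.
    static int bfi(int dst, int src, int lsb, int width) {
        int field = (width == 32) ? -1 : (1 << width) - 1;
        int mask = field << lsb;
        return (dst & ~mask) | ((src << lsb) & mask);
    }

    // The two-step form: mask the low byte of a, then insert the low byte of b at bit 8.
    static int andThenBfi(int a, int b) {
        return bfi(a & 0xFF, b, 8, 8);
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, 0xFF, 0x1234, -1, 0x7FFFFFFF, 0x80000000};
        for (int a : samples) {
            for (int b : samples) {
                if (orShiftAnd(a, b) != andThenBfi(a, b)) {
                    throw new AssertionError("mismatch for a=" + a + ", b=" + b);
                }
            }
        }
        System.out.println("both forms agree on all samples");
    }
}
```

This only checks that the two forms agree on the sampled inputs; it says nothing
about which rule the matcher actually selects.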
--
Andrew Haley (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671