RFR 8249893: AARCH64: optimize the construction of the value from the bits of the other two

Andrew Haley aph at redhat.com
Tue Aug 25 09:17:12 UTC 2020


On 25/08/2020 09:57, Boris Ulasevich wrote:
> Hi,
> 
> On 25.08.2020 11:10, Andrew Haley wrote:
>> Hi,
>>
>> On 23/08/2020 19:20, Boris Ulasevich wrote:
>>   >
>>   > Please review the updated change to C2 and AArch64 which introduces
>>   > a new BitfieldInsert node to replace an Or+Shift+And sequence when
>>   > possible. A single BFI instruction is emitted for the new node.
>>   >
>>   > With the current change all the transformation logic is moved out of
>>   > aarch64.ad file into the common C2 code.
>>   >
>>   > http://bugs.openjdk.java.net/browse/JDK-8249893
>>   > http://cr.openjdk.java.net/~bulasevich/8249893/webrev.01
>>   >
>>   > The change in compiler.cpp was done to implicitly ask IGVN to run
>>   > the idealization once again after the loop optimization phase.
>>   > This extra step is necessary to make the BFI transform happen
>>   > only after loop optimization.
>>
>> So here's a strange thing. When I run a simple JMH test
>>
>>       @State(Scope.Benchmark)
>>       public static class Result {
>>           public int a, b;
>>           public long x;
>>       }
>>
>>       @Benchmark
>>       public static int bfm(Result r) {
>>           return (r.a & 0xFF) | ((r.b & 0xFF) << 8);
>>       }
>>
>> I get
>>
>>     0x0000ffff84644df0:   ubfiz    w12, w11, #8, #8
>>     0x0000ffff84644df4:   and    w10, w10, #0xff
>>     0x0000ffff84644df8:   orr    w2, w10, w12                ;*ior {reexecute=0 rethrow=0 return_oop=0}
>>                                                               ; - org.openjdk.Rotates::bfm@19 (line 22)
>>                                                               ; - org.openjdk.generated.Rotates_bfm_jmhTest::bfm_avgt_jmhStub@20 (line 199)
>>
>> instead of
>>
>>     0x0000ffff808554b4: and    w10, w10, #0xff
>>     0x0000ffff808554b8: and    w12, w12, #0xff
>>     0x0000ffff808554bc: orr    w2, w12, w10, lsl #8  ;*ior
>>                                                   ; - org.openjdk.Rotates::bfm@19 (line 22)
>>                                                   ; - org.openjdk.generated.Rotates_bfm_jmhTest::bfm_avgt_jmhStub@20 (line 199)
>>
>> Do you have any ideas why this might be? Thanks.
>>
> 
> Both variants are correct, aren't they?

Well, yes. But I thought that the idea was to generate fewer instructions.

> I think the matcher preferred the UBFIZ rule to the OR rule because
> ins_cost was set to 1.9 for OR:
> https://hg.openjdk.java.net/jdk/jdk/file/92ddc6fe60eb/src/hotspot/cpu/aarch64/aarch64.ad#l12130
> https://hg.openjdk.java.net/jdk/jdk/file/92ddc6fe60eb/src/hotspot/cpu/aarch64/aarch64.ad#l11675
> 
> With my change it would work like this:
> 
> 0x0000ffff7c587fe0:   and    w2, w10, #0xff
> 0x0000ffff7c587fe8:   bfi    x2, x12, #8, #8

But it didn't. I'm asking you why that is. The first code I showed you was generated
for the JMH test in http://cr.openjdk.java.net/~aph/scratch/, after I applied your patch.
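
For reference, here is a minimal Java sketch of why all the sequences in
this thread compute the same value, so that the only difference is the
instruction count. The helpers ubfiz() and bfi() are hypothetical models
written for illustration (they are not HotSpot or JDK APIs), and they
assume the usual AArch64 semantics with lsb + width <= 32: UBFIZ places
the low `width` bits of the source at bit `lsb` and zeroes the rest,
while BFI inserts them into the destination and leaves its other bits
unchanged.

```java
class BfiSketch {
    // Low `width` bits set; width == 32 is handled separately because
    // (1 << 32) wraps to 1 for Java ints.
    static int lowMask(int width) {
        return (width == 32) ? -1 : (1 << width) - 1;
    }

    // Hypothetical model of UBFIZ Wd, Wn, #lsb, #width.
    static int ubfiz(int src, int lsb, int width) {
        return (src & lowMask(width)) << lsb;
    }

    // Hypothetical model of BFI Wd, Wn, #lsb, #width (dst is read and written).
    static int bfi(int dst, int src, int lsb, int width) {
        int mask = lowMask(width) << lsb;
        return (dst & ~mask) | ((src << lsb) & mask);
    }

    // The benchmark expression from the JMH test.
    static int bfm(int a, int b) {
        return (a & 0xFF) | ((b & 0xFF) << 8);
    }

    // Three instructions: and + ubfiz + orr.
    static int threeInsn(int a, int b) {
        return (a & 0xFF) | ubfiz(b, 8, 8);
    }

    // Two instructions: and + bfi.
    static int twoInsn(int a, int b) {
        return bfi(a & 0xFF, b, 8, 8);
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, 0xFF, 0x1234, -1, 0x7FFFFFFF, 0x80000000};
        for (int a : samples) {
            for (int b : samples) {
                int expected = bfm(a, b);
                if (threeInsn(a, b) != expected || twoInsn(a, b) != expected) {
                    throw new AssertionError("mismatch for a=" + a + ", b=" + b);
                }
            }
        }
        System.out.println("all sequences agree");
    }
}
```

The BFI form needs no separate orr because the insert itself preserves
the already-masked low byte in the destination register.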

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


