RFR 8249893: AARCH64: optimize the construction of the value from the bits of the other two

Boris Ulasevich boris.ulasevich at bell-sw.com
Tue Aug 25 09:47:11 UTC 2020


On 25.08.2020 12:17, Andrew Haley wrote:
> On 25/08/2020 09:57, Boris Ulasevich wrote:
>> Hi,
>>
>> On 25.08.2020 11:10, Andrew Haley wrote:
>>> Hi,
>>>
>>> On 23/08/2020 19:20, Boris Ulasevich wrote:
>>>   >
>>>   > Please review the updated change to C2 and AArch64 which introduces
>>>   > a new BitfieldInsert node to replace Or+Shift+And sequence when
>>>   > possible.
>>>   > Single BFI instruction is emitted for the new node.
>>>   >
>>>   > With the current change all the transformation logic is moved out
>>>   > of aarch64.ad file into the common C2 code.
>>>   >
>>>   > http://bugs.openjdk.java.net/browse/JDK-8249893
>>>   > http://cr.openjdk.java.net/~bulasevich/8249893/webrev.01
>>>   >
>>>   > The change in compiler.cpp was done to implicitly ask IGVN to run
>>>   > the idealization once again after the loop optimization phase.
>>>   > This extra step is necessary to make the BFI transform happen
>>>   > only after loop optimization.
>>>
>>> So here's a strange thing. When I run a simple JMH test
>>>
>>>       @State(Scope.Benchmark)
>>>       public static class Result {
>>>           public int a, b;
>>>           public long x;
>>>       }
>>>
>>>       @Benchmark
>>>       public static int bfm(Result r) {
>>>           return (r.a & 0xFF) | ((r.b & 0xFF) << 8);
>>>       }
>>>
>>> I get
>>>
>>>     0x0000ffff84644df0:   ubfiz    w12, w11, #8, #8
>>>     0x0000ffff84644df4:   and    w10, w10, #0xff
>>>     0x0000ffff84644df8:   orr    w2, w10, w12 ;*ior {reexecute=0 rethrow=0 return_oop=0}
>>>                                               ; - org.openjdk.Rotates::bfm@19 (line 22)
>>>                                               ; - org.openjdk.generated.Rotates_bfm_jmhTest::bfm_avgt_jmhStub@20 (line 199)
>>>
>>> instead of
>>>
>>>     0x0000ffff808554b4: and    w10, w10, #0xff
>>>     0x0000ffff808554b8: and    w12, w12, #0xff
>>>     0x0000ffff808554bc: orr    w2, w12, w10, lsl #8  ;*ior
>>>                                                   ; - org.openjdk.Rotates::bfm@19 (line 22)
>>>                                                   ; - org.openjdk.generated.Rotates_bfm_jmhTest::bfm_avgt_jmhStub@20 (line 199)
>>>
>>> Do you have any ideas why this might be? Thanks.
>>>
>>
>> Both variants are correct, aren't they?
>
> Well, yes. But I thought that the idea was to generate fewer
> instructions.
>
>> I think the matcher preferred the UBFIZ rule to the OR rule because
>> ins_cost was set to 1.9 for OR:
>> https://hg.openjdk.java.net/jdk/jdk/file/92ddc6fe60eb/src/hotspot/cpu/aarch64/aarch64.ad#l12130
>> https://hg.openjdk.java.net/jdk/jdk/file/92ddc6fe60eb/src/hotspot/cpu/aarch64/aarch64.ad#l11675
>>
>> With my change it would work like this:
>>
>> 0x0000ffff7c587fe0:   and    w2, w10, #0xff
>> 0x0000ffff7c587fe8:   bfi    x2, x12, #8, #8
>
> But it didn't. I'm asking you why that is. The first code I showed you
> was the JMH test in http://cr.openjdk.java.net/~aph/scratch/. This was
> after I applied your patch.

OK. Can you please check that my patch [1] has been applied
and built correctly? With my change I see this:

....[Hottest Region 2]...........................................
c2, level 4, org.openjdk.generated.Rotates_bfm_jmhTest::bfm_avgt_jmhStub,

             0x0000ffff84584dac:   add    x11, x14, #0x94

             0x0000ffff84584db0:   stp    x21, x19, [sp]
             0x0000ffff84584db4:   stp    x20, x14, [sp, #16]
             0x0000ffff84584db8:   stp    x15, x10, [sp, #32]
             0x0000ffff84584dbc:   str    x11, [sp, #48]
             0x0000ffff84584dc0:   b    0x0000ffff84584dd8
             0x0000ffff84584dc4:   nop
             0x0000ffff84584dc8:   nop
             0x0000ffff84584dcc:   nop
   3.64%  ↗  0x0000ffff84584dd0:   str    x19, [sp, #16]
   0.07%  │  0x0000ffff84584dd4:   mov    x16, x29
          │  0x0000ffff84584dd8:   ldr    w10, [x16, #12] ;*invokestatic bfm
   3.92%  │  0x0000ffff84584ddc:   ldr    w12, [x16, #24]
   4.69%  │  0x0000ffff84584de0:   and    w2, w10, #0xff
   0.03%  │  0x0000ffff84584de4:   mov    x29, x16
   0.02%  │  0x0000ffff84584de8:   bfi    x2, x12, #8, #8     ;*ior {reexecute=0 rethrow=0 return_oop=0}
          │                                                   ; - org.openjdk.Rotates::bfm@19 (line 23)
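
For reference, a minimal Java sketch of why the AND + BFI pair computes the
same value as the benchmark expression. The class BfiSketch and its bfi
helper are hypothetical, written only for illustration; they model the
instruction's bit semantics and are not code from HotSpot or from the patch.

```java
public class BfiSketch {
    // Hypothetical model of AArch64 "bfi xd, xn, #lsb, #width": copy the low
    // `width` bits of src into dst starting at bit `lsb`, leaving every other
    // bit of dst unchanged. Assumes 0 < width < 64 and lsb + width <= 64.
    static long bfi(long dst, long src, int lsb, int width) {
        long mask = ((1L << width) - 1) << lsb;
        return (dst & ~mask) | ((src << lsb) & mask);
    }

    // The benchmark expression from the thread (Or + Shift + And).
    static int orShiftAnd(int a, int b) {
        return (a & 0xFF) | ((b & 0xFF) << 8);
    }

    // Models the two-instruction sequence:
    //   and w2, w10, #0xff
    //   bfi x2, x12, #8, #8
    static int andThenBfi(int a, int b) {
        long x2 = a & 0xFF;     // result is in 0..255, so the upper bits are zero
        x2 = bfi(x2, b, 8, 8);  // only the low 8 bits of b are inserted
        return (int) x2;
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, 0x56, 0xFF, 0x1234, -1, Integer.MIN_VALUE, Integer.MAX_VALUE};
        for (int a : samples) {
            for (int b : samples) {
                if (orShiftAnd(a, b) != andThenBfi(a, b)) {
                    throw new AssertionError("mismatch for a=" + a + ", b=" + b);
                }
            }
        }
        System.out.println("all samples match");
    }
}
```

The explicit mask on b disappears because BFI reads only the low 8 bits of
its source, which is how the Or+Shift+And sequence shrinks to two instructions.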


[1] http://cr.openjdk.java.net/~bulasevich/8249893/webrev.01/jdk-jdk.patch

