RFR(S): 8136820 Generate better code for some Unsafe addressing patterns

Vladimir Kozlov vladimir.kozlov at oracle.com
Tue Sep 22 13:38:09 UTC 2015


On 9/22/15 8:59 PM, Roland Westrelin wrote:
>> So the main fix is the Op_AddI --> Op_AddX change in loopopts.cpp. I did not notice it among all the whitespace removals there.
>
> Yes.
>
>> And there is additional code in the matcher for when !off->is_Con(). What is 'off' in your case?
>
> LShiftL = (((long) i) << 2)

Okay.

>
>> What is the change in superword.cpp?
>
> When testing the change I hit a bug where superword would use a ConvI2L node where an integer node was expected (in an arithmetic operation, I think). I traced the execution and found that SW would follow a chain of AddP nodes and stop at a ConvI2L. I assumed it was related to the loop opts change and that this graph pattern had never been seen before. I can dig up more details if you like.

I see. Jan Civlin from Intel told me that there are problems with
invariants in SW. This could be one of those problems.

Changes are good.

Thanks,
Vladimir

>
> Roland.
>
>>
>> Thanks,
>> Vladimir
>>
>> On 9/22/15 7:48 PM, Roland Westrelin wrote:
>>> Thanks for looking at this, Vladimir.
>>>
>>>> It would be nice if you described your changes (the RFE also says nothing), especially the changes in matcher.cpp. From what I can understand the changes are good, but please explain what you did.
>>>
>>> Sorry about that.
>>>
>>> The address of the unsafe access is: (array + ((((long) i) << 2) + base)), which my change reshapes into: (array + base) + (((long) i) << 2), so array + base can be computed once, before the loop, because both array and base are loop invariant. The ad file change is there so that the entire new_base + (((long) i) << 2) address computation is folded into an x86 addressing mode (including the int to long cast). The matcher change ensures that if the same address computation is used by several memory accesses, the address is not computed once in a register and then shared, but rather embedded in each memory access instruction, as we do for other memory addressing modes.
>>>
>>> Roland.
>>>
>>>>
>>>> Thanks,
>>>> Vladimir
>>>>
>>>> On 9/21/15 6:00 PM, Roland Westrelin wrote:
>>>>> http://cr.openjdk.java.net/~roland/8136820/webrev.00/
>>>>>
>>>>> The main loop in:
>>>>>
>>>>>      static int[] array;
>>>>>      static long base;
>>>>>      static int test1() {
>>>>>          int res = 0;
>>>>>          for (int i = 0; i < 100; i++) {
>>>>>              long address = (((long) i) << 2) + base;
>>>>>              res += UNSAFE.getInt(array, address);
>>>>>          }
>>>>>          return res;
>>>>>      }
>>>>>
>>>>> is compiled as:
>>>>>
>>>>> 0b0 B2: # B2 B3 <- B1 B2 Loop: B2-B2 inner Freq: 101.492
>>>>> 0b0 movslq R9, R10 # i2l
>>>>> 0b3 salq R9, #2
>>>>> 0b7 addq R9, [R11 + #112 (8-bit)] # long
>>>>> 0bb addl RAX, [R8 + R9] # int
>>>>> 0bf incl R10 # int
>>>>> 0c2 cmpl R10, #100
>>>>> 0c6 jl,s B2 # loop end P=0.990147 C=6732.000000
>>>>>
>>>>> but could be compiled as:
>>>>>
>>>>> 0b2 B2: # B2 B3 <- B1 B2 Loop: B2-B2 inner Freq: 101.492
>>>>> 0b2 addl RAX, [R8 + pos R10 << #2] # int
>>>>> 0b6 incl R10 # int
>>>>> 0b9 cmpl R10, #100
>>>>> 0bd jl,s B2 # loop end P=0.990147 C=6732.000000
>>>>>
>>>>> base and array are loop invariant so array + base can be computed before the loop.
>>>>>
>>>>> Roland.
>>>>>
>>>
>
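
Editorial illustration (not part of the original thread): the reassociation
Roland describes can be sketched in plain Java. This is only a source-level
analogy of what the C2 change does on the ideal graph, not the patch itself.
The class ReassociateSketch, the MEMORY buffer, and the load/fill helpers are
hypothetical stand-ins for the heap and for UNSAFE.getInt; only the two
address shapes are taken from the thread.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class ReassociateSketch {
    // Hypothetical stand-in for the heap: a flat buffer indexed by a long "address".
    static final ByteBuffer MEMORY =
            ByteBuffer.allocate(1024).order(ByteOrder.nativeOrder());

    // Hypothetical stand-in for UNSAFE.getInt(array, address).
    static int load(long address) {
        return MEMORY.getInt((int) address);
    }

    // Stores the value 3*i + 1 in the i-th int slot starting at array + base.
    static void fill(long array, long base, int n) {
        for (int i = 0; i < n; i++) {
            MEMORY.putInt((int) (array + base) + 4 * i, 3 * i + 1);
        }
    }

    // Original shape: array + ((((long) i) << 2) + base).
    // The addition of base is repeated on every iteration.
    static int sumOriginal(long array, long base, int n) {
        int res = 0;
        for (int i = 0; i < n; i++) {
            long address = (((long) i) << 2) + base;
            res += load(array + address);
        }
        return res;
    }

    // Reshaped: (array + base) + (((long) i) << 2).
    // array + base is loop invariant, so it is computed once before the loop.
    static int sumReassociated(long array, long base, int n) {
        long newBase = array + base;
        int res = 0;
        for (int i = 0; i < n; i++) {
            res += load(newBase + (((long) i) << 2));
        }
        return res;
    }

    public static void main(String[] args) {
        long array = 64, base = 16; // arbitrary illustrative values
        fill(array, base, 100);
        System.out.println(sumOriginal(array, base, 100)
                == sumReassociated(array, base, 100));
    }
}
```

Both methods read the same 100 slots, so they return the same sum. In the
second form only the scaled index varies inside the loop, which is the shape
that the ad file change in the thread folds into a single base plus scaled
index addressing mode.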

