[aarch64-port-dev ] 8133352: aarch64: generates constrained unpredictable instructions

Vladimir Kozlov vladimir.kozlov at oracle.com
Wed Aug 19 16:33:49 UTC 2015


Looks fine to me.

I did not see any comments from our colleagues (from RH) who work on 
arm64. Do they agree with this change?

Thanks,
Vladimir

On 8/18/15 6:07 AM, Edward Nevill wrote:
> Hi,
>
> Given that there have been no objections to my proposed solution, I have prepared a webrev based on it.
>
> http://cr.openjdk.java.net/~enevill/8133352/webrev.01
>
> The original jira issue is here
>
> https://bugs.openjdk.java.net/browse/JDK-8133352
>
> I have tested with jtreg hotspot and langtools. Results before and after were identical.
>
> Hotspot: Test results: passed: 883; failed: 2; error: 10
> Langtools: Test results: passed: 3,260; failed: 2
>
> Please review, and if it is OK I will push.
>
> Thanks,
> Ed.
>
> On Wed, 2015-08-12 at 17:23 +0100, Edward Nevill wrote:
>> On Tue, 2015-08-11 at 09:55 -0700, Vladimir Kozlov wrote:
>>> I think it depends on how expensive push/pop is on arm64.
>>> In C2 generated code you may introduce spills to the stack around the GetAndAdd code, since you use an additional
>>> register (in .ad). So you end up saving to the stack anyway.
>>> On the other hand, your changes (third temp) are not so big, and I think they are acceptable.
>>> On 8/11/15 8:57 AM, Edward Nevill wrote:
>> -#define ATOMIC_OP(LDXR, OP, STXR)                                       \
>> +#define ATOMIC_OP(LDXR, OP, IOP, STXR)                                       \
>>   void MacroAssembler::atomic_##OP(Register prev, RegisterOrConstant incr, Register addr) { \
>>     Register result = rscratch2;                                          \
>>     if (prev->is_valid())                                                 \
>> @@ -2120,14 +2125,15 @@
>>     bind(retry_load);                                                     \
>>     LDXR(result, addr);                                                   \
>>     OP(rscratch1, result, incr);                                          \
>> -  STXR(rscratch1, rscratch1, addr);                                     \
>> -  cbnzw(rscratch1, retry_load);                                         \
>> -  if (prev->is_valid() && prev != result)                               \
>> -    mov(prev, result);                                                  \
>> +  STXR(rscratch2, rscratch1, addr);                                     \
>> +  cbnzw(rscratch2, retry_load);                                         \
>> +  if (prev->is_valid() && prev != result) {                             \
>> +    IOP(prev, rscratch1, incr);                                         \
>> +  }                                                                     \
>>   }
>>
>> -ATOMIC_OP(ldxr, add, stxr)
>> -ATOMIC_OP(ldxrw, addw, stxrw)
>> +ATOMIC_OP(ldxr, add, sub, stxr)
>> +ATOMIC_OP(ldxrw, addw, subw, stxrw)
>>
>> This essentially creates the extra register we need by using the inverse operation to restore the result.
>>
>> It doesn't win any beauty contests, but it is probably optimal.
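>>
>> For illustration only (this expansion is not part of the patch, just a sketch of what ATOMIC_OP(ldxrw, addw, subw, stxrw) would emit for the word-sized case, using the register names from the macro above):
>>
>>   bind(retry_load);
>>   ldxrw(rscratch2, addr);            // result = *addr (load exclusive)
>>   addw(rscratch1, rscratch2, incr);  // rscratch1 = result + incr
>>   stxrw(rscratch2, rscratch1, addr); // store exclusive; status written to rscratch2, clobbering result
>>   cbnzw(rscratch2, retry_load);      // retry if the store exclusive failed
>>   // if the caller asked for the old value:
>>   subw(prev, rscratch1, incr);       // recover it as (result + incr) - incr
>>
>> The old code emitted stxrw(rscratch1, rscratch1, addr), i.e. the same register as both the status and the data operand, which is the constrained unpredictable case this change avoids.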
>
>
>

