RFR: 8217874: Shenandoah: AArch64: Clobbered register in ShenandoahBarrierSetAssembler::cmpxchg_oop()
Roman Kennke
rkennke at redhat.com
Mon Feb 11 13:32:11 UTC 2019
>>>>> In AArch64, when called from C2, in
>>>>> ShenandoahBarrierSetAssembler::cmpxchg_oop() the result register may
>>>>> overlap with other input argument registers and thus fail the leading
>>>>> assert, and lead to clobbered registers. In the body of the code block,
>>>>> a temporary register should be used instead, and result should only get
>>>>> filled in at the end.
>>>
>>> That looks convincing except that the original code avoided using
>>> rscratch1. Was there a reason for that?
>>>
>>> It may be that way because the original cmpxchg code that this is
>>> derived from allowed that callers might be holding something in
>>> rscratch1 across the cmpxchg call. Will that ever happen for your usage
>>> in the barrier set?
>>
>> I believe this is before we used rscratch1 for holding the result in
>> case of not generating for compare-and-exchange, but this changed with:
>> http://hg.openjdk.java.net/jdk/jdk/rev/74a5ef4c81cc
>
> Ok, so using rscratch1 looks fine then.
>
>>> If you are sure that this won't happen and that all the gubbins you call
>>> from the barrier cmpxchg_oop implementation doesn't use rscratch1 then I
>>> think this is ok.
>>
>> Yeah I'm pretty sure.
>>
>>> Of course, if the result/input overlap is only ever happening in calls
>>> from C2 then you could fix this more simply in the C2 rules. Allocate a
>>> temp register in each rule that encodes a call to cmpxchg_oop, pass that
>>> temp register in as the 'res' argument and then plant a move to the
>>> output register after the call to cmpxchg_oop.
>>
>> I thought about that too. But then, why allocate an additional register,
>> when we can have one for free?
> Yes, using rscratch1 is better because it avoids an extra register
> allocate. So, your patch is reviewed.
Ok, thanks for reviewing!
Roman
More information about the shenandoah-dev
mailing list