[aarch64-port-dev ] RFR: 8150394: aarch64: add support for 8.1 LSE CAS instructions
Andrew Dinn
adinn at redhat.com
Wed Feb 24 16:50:58 UTC 2016
On 24/02/16 10:58, Andrew Haley wrote:
> On 22/02/16 20:32, Edward Nevill wrote:
>> http://cr.openjdk.java.net/~enevill/8150394/webrev.0/
>>
>> This adds support for the CAS instructions in armv8.1.
>
> The C2 code for aarch64_enc_cmpxchg* is missing.
>
> It's quite tricky to refactor to allow LSE instructions. I'd add
> a wordsize parameter to the cas instruction, like this:
>
> #define INSN(NAME, a, r) \
>   void NAME(operand_size sz, Register Rs, Register Rt, Register Rn) { \
>     assert(Rs != Rn && Rs != Rt, "unpredictable instruction"); \
>     compare_and_swap(Rs, Rt, Rn, sz, 1, a, r); \
>   }
> INSN(cas, 0, 0)
>
> And this gets rid of a ton of instruction definitions: we only need
> CAS{A,L,AL}.
>
> Pass the operand size down to MacroAssembler::cmpxchgw:
>
> enc_class aarch64_enc_cmpxchgw(memory mem, iRegINoSp oldval, iRegINoSp newval) %{
>   MacroAssembler _masm(&cbuf);
>   guarantee($mem$$index == -1 && $mem$$disp == 0, "impossible encoding");
>   __ cmpxchg(Assembler::word, $mem$$base$$Register, $oldval$$Register,
>              $newval$$Register,
>              &Assembler::ldxrw, &MacroAssembler::cmpw, &Assembler::stlxrw);
> %}
>
> void MacroAssembler::cmpxchgw(operand_size sz, Register oldv,
>                               Register newv, Register addr, Register tmp,
>                               Label &succeed, Label *fail) {
>
>   if (UseLSE) {
>     ...
>
> It'll be necessary to pass a memory barrier flag too.
You mean to deal with the difference between aarch64_enc_cmpxchg and
aarch64_enc_cmpxchg_acq? The former uses ldxr and is employed for CAS
when UseBarriersForVolatile is true. The latter uses ldaxr and is
employed when we optimize CAS because UseBarriersForVolatile is false.
We need to use the relevant flavour of casxx inside cmpxchg for each of
these two encodings.
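
To make the pairing concrete, here is a rough sketch in plain C++ (not
HotSpot code and not taken from the webrev). The CasFlavour struct and
flavour_for() are hypothetical names, and the choice of casl for the
non-acquiring encoding is my assumption; it only tabulates which load
each encoding uses today and which LSE instruction might replace it.

```cpp
#include <cassert>
#include <string>

// Hypothetical description of one cmpxchg encoding; illustration only.
struct CasFlavour {
  std::string load_exclusive;  // load used by the existing exclusive loop
  std::string lse_cas;         // candidate LSE replacement (assumption)
};

// acquire == false models aarch64_enc_cmpxchg
//   (used when UseBarriersForVolatile is true).
// acquire == true models aarch64_enc_cmpxchg_acq
//   (used when UseBarriersForVolatile is false).
inline CasFlavour flavour_for(bool acquire) {
  if (acquire) {
    return {"ldaxr", "casal"};  // acquiring load, so acquire + release CAS
  }
  return {"ldxr", "casl"};      // plain load, so release-only CAS
}
```

For example, flavour_for(true).lse_cas gives "casal" and
flavour_for(false).lse_cas gives "casl".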
I was also going to recommend using LSE in cmpxchg but I was not sure
exactly how it would need to work. The lock code does not loop when the
stlxr fails (it branches to cas_failed). However, the CAS code loops back
to retry the load. If cmpxchg is rewritten to use casal (or casl) does
it not still need to loop?
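
The looping question can be framed in portable C++ rather than in
assembler terms. This is only an analogy, and the function names are
made up for illustration: compare_exchange_weak is allowed to fail
spuriously, much like an ldxr/stlxr pair whose store-exclusive fails,
so it is wrapped in a retry loop; compare_exchange_strong fails only
when the value differs, which is what I would expect from a single CAS
instruction.

```cpp
#include <atomic>
#include <cassert>

// Analogy for the exclusive-pair sequence: retry on spurious failure,
// give up only when the observed value really differs from expected.
inline bool cas_with_retry(std::atomic<int>& cell, int expected, int desired) {
  int observed = expected;
  while (!cell.compare_exchange_weak(observed, desired,
                                     std::memory_order_acq_rel,
                                     std::memory_order_acquire)) {
    if (observed != expected) {
      return false;  // genuine mismatch, no point retrying
    }
    // Spurious failure: observed still equals expected, so try again.
  }
  return true;
}

// Analogy for a single CAS instruction: one attempt, no retry loop.
inline bool cas_single_shot(std::atomic<int>& cell, int expected, int desired) {
  return cell.compare_exchange_strong(expected, desired,
                                      std::memory_order_acq_rel,
                                      std::memory_order_acquire);
}
```

If the analogy holds, the LSE path would compare the value the
instruction returns against the expected value and branch on that,
with no loop back to the load.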
Also, what does casxx allow us to do to implement the weaker variants of
the new unsafe CAS API other than to include or exclude the acquire? Is
there a variant of the CAS operations that could use casa rather than
casal, or even just cas?
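
As a strawman for discussion, here is one possible mapping from
ordering strength to the four LSE mnemonics, written against C++
std::memory_order as a stand-in for whatever ordering the unsafe API
variants end up requiring. The function lse_cas_for() is hypothetical
and the mapping is an assumption on my part, not something the webrev
implements.

```cpp
#include <atomic>
#include <cassert>
#include <string>

// Hypothetical mapping from a requested ordering to an LSE CAS mnemonic.
inline std::string lse_cas_for(std::memory_order order) {
  switch (order) {
    case std::memory_order_relaxed:
      return "cas";    // no acquire, no release
    case std::memory_order_consume:
    case std::memory_order_acquire:
      return "casa";   // acquire only
    case std::memory_order_release:
      return "casl";   // release only
    case std::memory_order_acq_rel:
    case std::memory_order_seq_cst:
      return "casal";  // acquire and release
  }
  return "casal";      // conservative fallback for unexpected values
}
```

So a fully relaxed CAS would map to plain cas, and an acquire-only one
to casa, if such variants turn out to be useful.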
regards,
Andrew Dinn