RFR(M): 8154736: enhancement of cmpxchg and copy_to_survivor for ppc64

David Holmes david.holmes at oracle.com
Sat Nov 5 17:48:29 UTC 2016


Hi Andrew,

Sorry for the delayed response. I think I should start a new thread to 
discuss Atomic::r-m-w memory semantics.

David

On 1/11/2016 7:44 PM, Andrew Haley wrote:
> On 31/10/16 21:30, David Holmes wrote:
>>
>>
>> On 31/10/2016 7:32 PM, Andrew Haley wrote:
>>> On 30/10/16 21:26, David Holmes wrote:
>>>> On 31/10/2016 4:36 AM, Andrew Haley wrote:
>>>>>
>>>>> And, while we're on the subject, is memory_order_conservative actually
>>>>> defined anywhere?
>>>>
>>>> No. It was chosen to represent the current status quo that the Atomic::
>>>> ops should all be (by default) full bi-directional fences.
>>>
>>> Does that mean that a CAS is actually stronger than a load acquire
>>> followed by a store release?  And that a CAS is a release fence even
>>> when it fails and no store happens?
>>
>> Yes. Yes.
>>
>>    // All of the atomic operations that imply a read-modify-write
>>    // action guarantee a two-way memory barrier across that
>>    // operation. Historically these semantics reflect the strength
>>    // of atomic operations that are provided on SPARC/X86. We assume
>>    // that strength is necessary unless we can prove that a weaker
>>    // form is sufficiently safe.
>
> Mmmm, but that doesn't say anything about a CAS that fails.  But fair
> enough, I accept your interpretation.
>
>> But there is some contention as to whether the actual implementations
>> obey this completely.
>
> Linux/AArch64 uses GCC's __sync_val_compare_and_swap, which is specified
> as a
>
>   "full barrier".  That is, no memory operand is moved across the
>   operation, either forward or backward.  Further, instructions are
>   issued as necessary to prevent the processor from speculating loads
>   across the operation and from queuing stores after the operation.
>
> ... which reads the same as the language you quoted above, but looking
> at the assembly code I'm sure that it's really no stronger than a seq
> cst load followed by a seq cst store.
>
> I guess maybe I could give up fighting this and implement all AArch64
> CAS sequences as
>
>    CAS(seq_cst); full fence
>
> or, even more extremely,
>
>    full fence; CAS(relaxed); full fence
>
> but it all seems unreasonably heavyweight.
>
>>> And that a conservative load is a *store* barrier?
>>
>> Not sure what you mean. Atomic::load is not a r-m-w action so not
>> expected to be a two-way memory barrier.
>
> OK.
>
> Thanks,
>
> Andrew.
>
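
For illustration only, the two heavier CAS sequences Andrew describes in the quoted text above can be sketched in portable C++11. This is a rough sketch under stated assumptions, not HotSpot's Atomic::cmpxchg and not the AArch64 port's code. The function names are hypothetical, and std::atomic_thread_fence(std::memory_order_seq_cst) merely stands in for "full fence". Whether either form actually meets the "conservative" two-way barrier requirement on a given platform is exactly the open question in this thread.

```cpp
#include <atomic>

// Hypothetical helpers for illustration; these are NOT HotSpot APIs.
// Both return the value observed in dest, following the cmpxchg convention.

// Sketch of "CAS(seq_cst); full fence".
template <typename T>
T cmpxchg_seq_cst_then_fence(std::atomic<T>& dest, T compare_value, T exchange_value) {
  T observed = compare_value;
  // On failure, compare_exchange_strong stores the current value in 'observed'.
  dest.compare_exchange_strong(observed, exchange_value,
                               std::memory_order_seq_cst,
                               std::memory_order_seq_cst);
  // Trailing fence is issued whether or not the CAS succeeded.
  std::atomic_thread_fence(std::memory_order_seq_cst);
  return observed;
}

// Sketch of "full fence; CAS(relaxed); full fence".
template <typename T>
T cmpxchg_fenced_relaxed(std::atomic<T>& dest, T compare_value, T exchange_value) {
  std::atomic_thread_fence(std::memory_order_seq_cst);
  T observed = compare_value;
  // The CAS itself carries no ordering; the surrounding fences supply it.
  dest.compare_exchange_strong(observed, exchange_value,
                               std::memory_order_relaxed,
                               std::memory_order_relaxed);
  std::atomic_thread_fence(std::memory_order_seq_cst);
  return observed;
}
```

In both sketches the fences execute even when the comparison fails and no store happens, which is the failed-CAS case raised earlier in the quoted exchange.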

