Single byte Atomic::cmpxchg implementation
Erik Österlund
erik.osterlund at lnu.se
Thu Sep 11 11:30:25 UTC 2014
On 11 Sep 2014, at 03:25, David Holmes <david.holmes at oracle.com> wrote:
> The Atomic operations must provide full bi-directional fence semantics, so a full sync on entry is required in my opinion. I agree that the combination of bne+isync would suffice on the exit path.
I see no reason for the atomic operations to provide more than full acquire and release semantics (and hence sequential consistency), together with atomic updates.
For this, I do not see why a full sync, rather than lwsync, would be required for the write barrier on entry. The XNU kernel implementation likewise uses lwsync for release semantics and isync for acquire.
Why would this be different for us? From the XNU kernel (note the choice of fences I argue for):
compare_and_swap32_on64b: // bool OSAtomicCompareAndSwapBarrier32( int32_t old, int32_t new, int32_t *value);
        lwsync                  // write barrier, NOP'd on a UP
1:
        lwarx   r7,0,r5
        cmplw   r7,r3
        bne--   2f
        stwcx.  r4,0,r5
        bne--   1b
        isync                   // read barrier, NOP'd on a UP
        li      r3,1
        blr
2:
        li      r8,-8           // on 970, must release reservation
        li      r3,0            // return failure
        stwcx.  r4,r8,r1        // store into red zone to release
        blr
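
For readers more at home in C++ than in PowerPC assembly, the fence placement above can be sketched roughly as follows. This is only an illustration under stated assumptions: the names cas32_barrier and cas32_barrier_example are hypothetical, it is neither HotSpot's Atomic::cmpxchg nor the XNU routine, and which instructions a compiler actually emits for these fences on PPC64 is not something the sketch guarantees.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical helper for illustration only; this is neither HotSpot's
// Atomic::cmpxchg nor the XNU routine quoted above.
inline bool cas32_barrier(std::atomic<int32_t>& value,
                          int32_t old_value, int32_t new_value) {
    // Release fence on entry: plays the role of the leading lwsync.
    std::atomic_thread_fence(std::memory_order_release);

    // The compare-and-swap itself is relaxed, like the bare
    // lwarx/stwcx. loop between the two barriers.
    bool swapped = value.compare_exchange_strong(
        old_value, new_value,
        std::memory_order_relaxed, std::memory_order_relaxed);

    // Acquire fence on the success path only, mirroring the position
    // of the trailing isync.
    if (swapped) {
        std::atomic_thread_fence(std::memory_order_acquire);
    }
    return swapped;
}

// Small usage check (hypothetical name).
inline void cas32_barrier_example() {
    std::atomic<int32_t> v{5};
    assert(cas32_barrier(v, 5, 7));   // expected value matches: swap happens
    assert(v.load() == 7);
    assert(!cas32_barrier(v, 5, 9));  // stale expected value: no swap
    assert(v.load() == 7);
}
```

As in the assembly, the failure path returns without an acquire barrier; whether that is acceptable for our Atomic operations is exactly the question under discussion.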
> But this is a complex area, involving hardware that doesn't always follow the rules, so conservatism is understandable.
As for incorrect hardware, I don't know what to do about that. But can we confirm that there is, in fact, hardware that does not implement the fences according to the specification?
In my opinion, it becomes very difficult to accommodate incorrect hardware implementations.
> But this needs to be taken up with the PPC64 folk who did this port.
I agree, it would be very helpful to hear the perspective of those who wrote our implementation.
/Erik