[jmm-dev] weakCompareAndSet memory semantics

Sat Apr 23 07:31:42 UTC 2016

On 23.04.2016 01:28, David Holmes wrote:
> Agreed - perfectly valid usecase for weakCAS, and weakCAS has a benefit
> over CAS. But that is not the inner-loop re-fetch versus outer-loop
> re-fetch that I was questioning from Aleksey's original email.

So, it seems like you are questioning the remarks in the C++11 standard,
29.6.5/25:

"Remark: A weak compare-and-exchange operation may fail spuriously. That
is, even when the contents of memory referred to by expected and object
are equal, it may return false and store back to expected the same
memory contents that were originally there. [ Note: This spurious
failure enables implementation of compare-and-exchange on a broader
class of machines, e.g., load-locked store-conditional machines. A
consequence of spurious failure is that nearly all uses of weak compare-
and-exchange will be in a loop. When a compare-and-exchange is in a
loop, the weak version will yield better performance on some platforms.
When a weak compare-and-exchange would require a loop and a strong one
would not, the strong one is preferable. — end note ]"

Or, in a more humane form:

https://herbsutter.com/2012/08/31/reader-qa-how-to-write-a-cas-loop-using-stdatomics/

Under heavy contention (which *is* the use case, whether you want it or
not), CAS loop performance depends heavily on the width of the
"collision window" of the RMW operation. The wider the gap between the
read and the write of the R-M-W sequence, the higher the chance that the
impending CAS fails. This is directly measurable even on modern,
heavily pipelined x86:
  https://bugs.openjdk.java.net/browse/JDK-8141640

...and a looped emulation of strong CAS on top of LL/SC puts you in a
similar position: the extra work inside the LL/SC -> CAS emulation
widens the collision window. It is cheaper to retry in the main loop
even on spurious failure, because that keeps the collision window short.
(This is the third way I can explain this issue.)
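
To make the two loop shapes concrete, here is a rough Java sketch. The
class and method names (WeakCasLoop, emulatedStrongCas, updateOuterOnly,
updateWithInnerLoop) are made up for illustration and are not from any
JDK code. It uses AtomicInteger.weakCompareAndSet, which is deprecated
since Java 9 in favor of weakCompareAndSetPlain but still compiles. On
x86 the weak and strong forms typically compile to the same instruction,
so any difference would only be expected on LL/SC platforms; the sketch
shows the structure, not a benchmark.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntUnaryOperator;

@SuppressWarnings("deprecation") // weakCompareAndSet is deprecated since Java 9
public class WeakCasLoop {

    // Hypothetical emulation of a strong CAS on top of a weak one:
    // a spurious failure is retried here, in an inner loop, as long as
    // the current value still matches the expected one.
    public static boolean emulatedStrongCas(AtomicInteger a, int expected, int update) {
        while (a.get() == expected) {
            if (a.weakCompareAndSet(expected, update)) {
                return true;
            }
        }
        return false; // genuine failure: the value changed
    }

    // Outer-loop re-fetch: any failure, spurious or not, goes straight
    // back to the read, so the read -> CAS gap stays as short as possible.
    public static int updateOuterOnly(AtomicInteger a, IntUnaryOperator f) {
        int cur, next;
        do {
            cur = a.get();
            next = f.applyAsInt(cur);
        } while (!a.weakCompareAndSet(cur, next));
        return next;
    }

    // Inner-loop re-fetch: the emulated strong CAS spins on spurious
    // failures before the outer loop gets a chance to re-read.
    public static int updateWithInnerLoop(AtomicInteger a, IntUnaryOperator f) {
        int cur, next;
        do {
            cur = a.get();
            next = f.applyAsInt(cur);
        } while (!emulatedStrongCas(a, cur, next));
        return next;
    }

    public static void main(String[] args) throws InterruptedException {
        final AtomicInteger counter = new AtomicInteger();
        final int threads = 4, perThread = 10_000;
        Thread[] ts = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            ts[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    updateOuterOnly(counter, x -> x + 1);
                }
            });
            ts[i].start();
        }
        for (Thread t : ts) {
            t.join();
        }
        // Both loop shapes are correct; they differ only in where the retry happens.
        System.out.println(counter.get() == threads * perThread);
    }
}
```

Both methods produce the same result; the argument above is only about
which one leaves less time between the read and the successful write
when many threads hit the same location.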

Thanks,
-Aleksey