[jmm-dev] bitwise RMW operators, specifically testAndSetBit/BTS

Tue Jul 26 19:26:36 UTC 2016

I'm not quite sure which document you're referring to for C++.  The latest
draft (N4604 or N4606) reorganized section 1.10.

1.10.2 discusses forward progress in a lot more detail than before. But I
think the only directly relevant statement here is p18, which was there
before:

"An implementation should ensure that the last value (in modification
order) assigned by an atomic or synchronization operation will become
visible to all other threads in a finite period of time."

Recall that "should" (as opposed to e.g. "shall") is ISO standardese for a
non-binding recommendation.  The reason I haven't pushed for something
stronger is that I don't think hardware specifications consistently contain
the corresponding guarantees, which would put language implementers in a
weird position. But that could probably be argued either way.

This is now separate from the core memory model in 1.10.1.

I think the "no merge" rule is not really formally specifiable, since it's
a compiler-only constraint that can't be tested by a conforming program.
We could specify a "no infinite merge" rule that handles the unbounded spin
case on reasonable hardware.
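
For concreteness, the unbounded spin case might look like the sketch
below. The names `ready`, `set_ready` and `spin_until_ready` are made up
for illustration and do not come from either standard or from this
thread. The commented-out fragment at the end is one reading of what an
"infinite merge" of the loads would amount to.

```cpp
#include <atomic>

// Hypothetical flag, expected to be set by some other thread.
std::atomic<bool> ready{false};

void set_ready() {
    ready.store(true, std::memory_order_relaxed);
}

// Spins until the flag is observed set and returns how many times the
// flag was seen unset. Every iteration performs its own relaxed load.
long spin_until_ready() {
    long misses = 0;
    while (!ready.load(std::memory_order_relaxed)) {
        ++misses;
    }
    return misses;
}

// An "infinite merge" would collapse all of those loads into a single
// one, which never terminates if that one load happens to see false:
//
//     bool r = ready.load(std::memory_order_relaxed);
//     while (!r) { }
```

In practice `spin_until_ready` and `set_ready` would run on different
threads; merging a bounded number of consecutive loads would still let
the loop terminate, which is why only the unbounded merge is singled out.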

As I'm occasionally reminded by my WG21 colleagues, it's not clear that the
extreme cases here are worth spending too much time on, since nobody is
going to use an implementation that gets them wrong, no matter what we say.
The tricky and more interesting cases are probably something like:

l.my_spin_lock();  // Implemented with acquire CAS
if (...) {
   ...
   l.my_spin_unlock();  // release store
} else {
   ...
   l.my_spin_unlock();
   ...  // No synchronization; Known to terminate in bounded time
}

Can I move the unlock release store out of the conditional to merge the two
stores?
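
A compilable sketch of the two shapes being compared, under assumptions:
`SpinLock`, `bounded_work`, `as_written` and `merged` are hypothetical
names, and the bodies standing in for the "..." above are invented
placeholders. It does not answer the question; it only shows that the
merged form holds the lock across the trailing bounded work in the else
branch.

```cpp
#include <atomic>

// Hypothetical minimal spin lock matching the comments in the snippet.
class SpinLock {
    std::atomic<bool> locked{false};
public:
    void my_spin_lock() {
        bool expected = false;
        // Acquire CAS: retry until we flip false -> true.
        while (!locked.compare_exchange_weak(expected, true,
                                             std::memory_order_acquire,
                                             std::memory_order_relaxed)) {
            expected = false;  // a failed CAS overwrote expected
        }
    }
    void my_spin_unlock() {
        locked.store(false, std::memory_order_release);  // release store
    }
    bool is_locked() const {
        return locked.load(std::memory_order_relaxed);
    }
};

// Placeholder for work with no synchronization that terminates in
// bounded time.
int bounded_work(int x) {
    int s = x;
    for (int i = 0; i < 100; ++i) s += i;
    return s;
}

// As written: each branch performs its own release store.
int as_written(SpinLock& l, bool cond, int& shared) {
    int local = 0;
    l.my_spin_lock();
    if (cond) {
        shared += 1;
        l.my_spin_unlock();
    } else {
        shared += 2;
        int snapshot = shared;  // copy while still holding the lock
        l.my_spin_unlock();
        local = bounded_work(snapshot);  // runs after the unlock
    }
    return local;
}

// Merged: a single release store after the conditional, so the else
// branch's bounded work now runs while the lock is still held.
int merged(SpinLock& l, bool cond, int& shared) {
    int local = 0;
    l.my_spin_lock();
    if (cond) {
        shared += 1;
    } else {
        shared += 2;
        local = bounded_work(shared);
    }
    l.my_spin_unlock();
    return local;
}
```

Single-threaded, the two functions compute the same results; the
difference is only in how long other threads spinning on the lock may
have to wait.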

On Mon, Jul 25, 2016 at 12:24 PM, Doug Lea <dl at cs.oswego.edu> wrote:

> On 07/25/2016 02:19 PM, Hans Boehm wrote:
>
>> 1. Opaque is cache coherent (i.e. single-variable sequentially
>> consistent), just like memory_order_relaxed in C++.
>>
>> 2. Opaque prevents compiler merging of accesses.
>>
>> In my mind, (2) is separable from coherence.
>>
>
> This might not be the right venue to discuss whether the new C++17
> sec 1.10.4 progress requirements apply to the memory system. I think
> they must, and that this would be consistent with common formal
> cache-memory-system specs.
>
> In which case you are inevitably led to the no-merge rule, as seen in the
> examples I posted.
>
> And even if this were not done in C++, I don't know any argument for
> not doing so in Java. No programmer would be happy if their bounded
> spin loops were allowed to be transformed into no-ops. Why allow
> something that literally no one wants rather than just hoping that
> compilers don't happen to do it?
>
> (Gratuitously editorializing, one would think that in C++,
> it might also be popular to adopt this interpretation, and
> eliminate the need to ever integrate C "volatile", or to
> re-spec consume mode.)
>
> -Doug