[jmm-dev] bitwise RMW operators, specifically testAndSetBit/BTS

Mon Jul 25 18:19:05 UTC 2016

Just to make sure we're clear here.  The differences between Opaque and
Plain seem to be:

1. Opaque is cache coherent (i.e. single-variable sequentially consistent),
just like memory_order_relaxed in C++.  This means that Opaque will
generate different instructions on architectures that don't promise cache
coherence by default. (Currently probably just Itanium, but hardware
architects eventually seem to want to apply optimizations similar to those
that compilers perform.)

2. Opaque prevents compiler merging of accesses, which probably makes it
more like volatile atomic<T> in C++.  (WG21/SG1 has been discussing some
related restrictions on non-volatile atomics, but they haven't gone
anywhere. Certainly C++17 is unlikely to say anything here. From my
perspective, C++ "volatile" really seems to be more defined by processor
ABIs than the language standard, for the reasons Andrew mentioned.
Standard-conforming programs usually can't tell conclusively whether the
rules are being followed, but low-level systems programs can.)

In my mind, (2) is separable from coherence.
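
As a rough illustration of (2), here is a minimal sketch assuming a
hypothetical Point class with an int field x and a VarHandle PX for it (the
class and method names are made up for the example). Single-threaded, both
methods return the same value; the difference is only in what a compiler
would be permitted to do with the two reads.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

public class MergeSketch {
    // Hypothetical class, used only for illustration.
    static class Point {
        int x;
    }

    static final VarHandle PX;
    static {
        try {
            PX = MethodHandles.lookup().findVarHandle(Point.class, "x", int.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Plain mode: a compiler may treat the two reads as a single read.
    static int sumPlain(Point a) {
        int first = (int) PX.get(a);
        int second = (int) PX.get(a);
        return first + second;
    }

    // Opaque mode: under reading (2), the two reads may not be merged,
    // so each getOpaque corresponds to an actual access.
    static int sumOpaque(Point a) {
        int first = (int) PX.getOpaque(a);
        int second = (int) PX.getOpaque(a);
        return first + second;
    }

    public static void main(String[] args) {
        Point p = new Point();
        PX.set(p, 3);
        System.out.println(sumPlain(p) + " " + sumOpaque(p));
    }
}
```

Nothing in this sketch can observe the difference by itself; another thread
writing to x between the two reads would be needed for that.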

Would the intent be to strengthen (Java) volatile, etc., so that they are
strictly stronger than Opaque? Currently I don't think there is a guarantee
that a bounded spin loop using volatiles can't be collapsed to a no-op.
Presumably no reasonable implementations actually do that, however.  I have
no idea whether there are implementations that merge a pair of volatile
loads.
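
For concreteness, a self-contained version of the bounded spin from the
quoted message might look like the sketch below. It assumes a Point class
with an int field x, and the names OpaqueSpin and boundedSpin are
hypothetical. The (int) casts are there because VarHandle access methods are
signature-polymorphic. Whether the first line printed is true or false
depends on thread timing.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

public class OpaqueSpin {
    // Hypothetical class standing in for the Point in the quoted message.
    static class Point {
        int x;
    }

    static final VarHandle PX;
    static {
        try {
            PX = MethodHandles.lookup().findVarHandle(Point.class, "x", int.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Spins until x is observed nonzero or the bound runs out.
    // Returns true if a nonzero value was observed.
    static boolean boundedSpin(Point a, long maxSpins) {
        long i = maxSpins;
        while ((int) PX.getOpaque(a) == 0 && --i > 0) ;
        return (int) PX.getOpaque(a) != 0;
    }

    public static void main(String[] args) throws InterruptedException {
        Point a = new Point();
        Thread writer = new Thread(() -> PX.setOpaque(a, 1));
        writer.start();
        // May or may not see the write within the bound.
        System.out.println("observed within bound: " + boundedSpin(a, 1000));
        writer.join();
        // After join() the write is guaranteed to be visible.
        System.out.println("observed after join: " + ((int) PX.getOpaque(a) != 0));
    }
}
```

If the loop in boundedSpin could be collapsed to a no-op, the first result
would always be whatever the final read sees, which is the case the question
above is about.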

On Mon, Jul 25, 2016 at 7:28 AM, Doug Lea <dl at cs.oswego.edu> wrote:

> On 07/25/2016 09:50 AM, Andrew Haley wrote:
>
>> Well, OK, but I'm trying to think of one case where a Java program
>> could tell the difference between the two, and I'm coming up empty.
>>
>
> Oh, sorry for not including some. Using Point and PX VarHandle for Point.x:
>
> 1. Unbounded spin:
>   while (PX.getOpaque(a) == 0) ;
>
> Note that programmers would normally use getAcquire or getVolatile
> here, but the question remains even if they don't.
>
> Can this be transformed into conditional infinite spin? As in:
>   if (PX.getOpaque(a) == 0) for (;;) ;
> Not if coherence is defined to entail progress.
>
> 2. Bounded spin:
>   long i = 1000;
>   while (PX.getOpaque(a) == 0 && --i > 0) ;
>
> Can this be optimized into a no-op? What if i = Long.MAX_VALUE?
> Under coherence, an implementation would have to establish some
> maximum bound K for merges to decide if/when to do this.
> In which case the best option is for the spec to say that K must
> be exactly one (i.e., no merges) for the sake of definitiveness.
>
> -Doug