[jmm-dev] jdk9 APIs [Fences specifically]

Sat Aug 15 01:14:32 UTC 2015

On Fri, Aug 14, 2015 at 5:14 PM, Doug Lea <dl at cs.oswego.edu> wrote:
>
> I suppose we should stop arguing about terminology/domains but ...
> Ensuring that the writes are not reordered (as seen
> by other threads) seems to be a prerequisite in all the
> cases under consideration here. So at least some of us
> conclude that solutions should build off this (fence) constraint.

You're now talking about preventing reordering of accesses to
the same variable?  I do agree that such a fence could conceivably
make sense in Java, though I'm not sure whether there are
enough use cases to justify it.  And I suspect you would want
a generic sameVariableFence rather than distinguishing between
loads and stores.

This doesn't make sense in C++ since, thanks largely to Paul,
memory_order_relaxed accesses are implicitly cache coherent.

I also still think this issue is quite distinct from the problem we
started with, as evidenced by the fact that merging of
variable accesses is definitely still an issue in C++, and
sameVariableFence wouldn't prevent it.

>
> On 08/14/2015 02:36 PM, Hans Boehm wrote:
>>
>> As you point out, that's kind of the domain of C-style volatiles,
>> though they really only address the "combining" part.
>
>
> The use of (C) volatile casts amounts to telling the compiler:
> "I cannot tell you why to write/read this variable; just
> do it".  If that's the best available solution, in Java, we could
> add analogous VarHandle methods of the form:
>   void setOpaquely(Object owner, T x);
>   T getOpaquely(Object owner);
>
> I don't like this at all. But to defend anyway: they do form a
> sort of matched set with the reachabilityFence
> implementation as "useOpaquely(ref)". And hotspot has an
> analogous internal (misnamed) construct MemBarCPUOrder that means:
> "I cannot tell you why not to reorder these IR nodes; just don't".
> This would come into play in the implementation of these methods
> as the "solution"  to the lack of explicit storeStore fence.
> So this is entirely workable even if entirely unappealing.

In C and C++ we could somewhat define such constructs in terms
of the behavior of asynchronous signal handlers running in the same
thread.  We do that for atomic_signal_fence, which is essentially
intended to be a compiler fence.  I don't think
there's really a Java equivalent.

I think the actual definition of "volatile" in C and C++ is entirely
unsatisfactory.  The real definition has become "it does whatever
it used to do on this architecture".  C has enough history in this
area that that makes sense.  That's not a Java solution, though
I would prefer some kind of "compiler don't muck with this"
access to loadLoad or storeStore fences.  I'd personally prefer
to spell it differently, e.g. as an @Races variable annotation.
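
As a rough sketch of what a "compiler don't muck with this" access could
look like in Java: the example below assumes the opaque access mode of
java.lang.invoke.VarHandle as it eventually shipped in JDK 9 (getOpaque
and setOpaque, rather than the setOpaquely/getOpaquely spelling floated
above).  The class and field names are hypothetical, and this is only an
illustration of the access style, not a claim about what it guarantees
across threads.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

public class OpaqueFlag {
    // Hypothetical racy field; deliberately plain (non-volatile).
    private int flag;

    // VarHandle giving access-mode control over the field above.
    private static final VarHandle FLAG;
    static {
        try {
            FLAG = MethodHandles.lookup()
                    .findVarHandle(OpaqueFlag.class, "flag", int.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Opaque write: the access itself must be performed as written,
    // but no ordering with respect to other variables is requested.
    public void set(int v) {
        FLAG.setOpaque(this, v);
    }

    // Opaque read: likewise performed as written, with no ordering
    // requested relative to accesses of other variables.
    public int get() {
        return (int) FLAG.getOpaque(this);
    }
}
```

The design choice here is to attach the "don't muck with this" property
to individual accesses rather than to the variable declaration, which is
the opposite of the @Races annotation spelling suggested above.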

>
>>  >
>>  > But even if so, it seems better to have a uniform API:
>>  >   writes:  full > release > storeStore
>>  >   reads:   full > acquire > loadLoad
>>  > even if loadLoad is internally mapped to acquire.
>>  > And omitting it feels even more wrong if we support
>>  > RCU-like usages with scoped loadLoadFence(Object ref).
>>
>> C++ has a great solution to that: include neither storeStore
>> nor loadLoad.
>
>
> Some history: In jdk8, after a similar discussion, we added only
> the three C++ fences as internal Unsafe intrinsics (which in turn
> map to existing MemBar IR nodes inside hotspot). But some JVM
> engineers voted with their feet about these: They introduced
> internal storeStore so they could use it (and/or directly
> generate) instead of release when applicable. The fact that
> it is cheaper or no more expensive than alternatives on most if not
> all ARM processors is surely the main reason for use. But
> other use cases are not just motivated by cost.

Android has StoreStore internally as well, though I'm starting to
wonder whether we should get rid of it, as we did with LoadLoad.

Given the uncertainty about correct use cases, I'd certainly
prefer to keep it at most internal.

I agree that engineers voting with their feet is often a strong argument
for adding a feature.  But I think those uses need to come with
convincing correctness arguments and reasonable specifications.
I think in this case we have neither.  In my opinion, there have been
plenty of cases of engineers heading into dead ends as well.

>
> It would be nice to avoid similar problems with jdk9 public APIs.
> (This accounts for my choice of words in initial post
> that it "seems wise" to include them.)
>
>> AFAICT, storeStore is a feature we don't know how to specify,
>
>
> Again, the spec is conceptually easy: Of all the ordering
> effects allowed by the base memory model rules, disable
> the indicated case of store-store reordering.

Unfortunately, I think it's not that simple because of transitivity
issues.  If the ordering doesn't correctly combine with the reads-from
ordering, then the expected guarantees from e.g. publication safety
go away when more than two threads are involved.  And everything
is complicated since this only matters for architectures that are
not multi-copy atomic and can't be described by simple operation
reordering.

The question is whether the fact that my load saw
a store S before a StoreStore fence means anything about the visibility of S
to a third thread.  I think that without some guarantee along those
lines, this is essentially useless, as in my example in the last message.
(Dependencies also figure into that example, and may be
critical, but we already know that we don't know how to handle those.)
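
To make the shape of the question concrete, here is a hypothetical
three-thread sketch.  It is not the example from the earlier message.
It assumes the static VarHandle.storeStoreFence() method as it later
shipped in JDK 9; the class, field, and method names are invented for
illustration.  Each method is meant to run on a different thread, and
the comments mark the part whose outcome is exactly what is in doubt.

```java
import java.lang.invoke.VarHandle;

public class StoreStorePublication {
    public static final class Data {
        int x;
    }

    // Plain (racy) fields, on purpose.
    Data shared;   // written by thread 1, read by thread 2
    Data relayed;  // written by thread 2, read by thread 3

    // Thread 1: initialize, fence, then publish the reference.
    public void publish(int v) {
        Data d = new Data();
        d.x = v;                      // store S
        VarHandle.storeStoreFence();  // orders S before the publishing store
        shared = d;
    }

    // Thread 2: observes the published reference and passes it on.
    public void relay() {
        Data d = shared;
        if (d != null) {
            relayed = d;
        }
    }

    // Thread 3: sees the reference only via thread 2.  Whether it is
    // then guaranteed to see store S is the open question; this sketch
    // does not assume that it is.
    public int read() {
        Data d = relayed;
        return d == null ? -1 : d.x;
    }
}
```

Calling publish, relay, and read in sequence on a single thread
trivially returns the published value; the uncertainty only arises when
the three methods run concurrently on a machine that is not multi-copy
atomic.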

>
> The hard part is specifying the conditions under which usage
> does what any given programmer expects. Moreso than other fences.
> But this does not seem to be a good enough reason to omit from API.

I do not know of a definition that's good enough that I would be comfortable
using them.  Which I think is a good reason for omitting them.

Hans

>
> -Doug
>