[jmm-dev] Does a volatile load have to see the exact volatile store to synchronize?

Mon Apr 30 20:27:33 UTC 2018

I can't easily quantify the cost. I explicitly added some experts.

I believe that we would essentially have to stop using weak fences in the
volatile implementation, and just use heavy-weight syncs. Thus a load would
need two heavy-weight syncs and a store one, or the other way around.

I was hoping that anyone with an actual use case for the stronger version
would speak up. We do have some anecdotal evidence that it doesn't matter
much:

1) AFAICT, all Power implementations have always been broken w.r.t. the
current spec, and nobody has complained.

2) The de facto standard programming model for Java consists mainly of DRF
+ hacks for lazy initialization of immutable data (+ maybe incorrect
hacks). At least the first two of those are unaffected.

3) This is a divergence from C++ that AFAIK has never really been
discussed. I suspect the only people who realized that are part of this
discussion.

I certainly share your concern about weakening this after the fact. OTOH,
we're not breaking portable code any more than it was already broken by
existing implementations.

I think the issue here is not just Power; it's also about constraints on
future processor designs (and possibly on software DSM implementations). As
much as I, as a programmer, dislike reasoning about non-multi-copy-atomic
architectures, a lot of architects seem to believe that they are likely
to stay in some form for highly parallel systems. Requiring
synchronization between threads that don't directly communicate seems to be
inherently questionable, at least without a strong use case.

Hans

On Fri, Apr 27, 2018 at 6:27 PM, Brian Demsky <bdemsky at uci.edu> wrote:

> Hi Hans,
>
> Do you have an estimate of how much it would actually slow down Java on
> Power to implement the spec?  Or good reason to believe that code doesn’t
> rely on the specified behavior?
>
> Brian
>
> > On Apr 27, 2018, at 6:07 PM, Hans Boehm <boehm at acm.org> wrote:
> >
> > [ This was previously posted to a smaller audience. Reposting here as the
> > next step. ]
> >
> > This seems to be a new Java memory model problem uncovered in response to
> > the revision of "release sequences" in C++. wg21.link/P0982 has details.
> > But if you don't care about the C++ memory model, you can ignore all that
> > and just read the following.
> >
> > Clearly this isn't the only or most serious open Java memory model problem.
> > But I think it's actually one that has a fairly simple point solution. And
> > it may be worth fixing without a comprehensive solution.
> >
> > Problematic litmus test:
> >
> > Writing =plain for ordinary Java memory accesses and =vol for volatile ones,
> > consider
> >
> > Thread 1:
> > x =plain 1;
> > v =vol 1;
> >
> > Thread 2:
> > v =vol 2;
> >
> > Thread 3:
> > r1 =vol v;
> > r2 =plain x;
> >
> > Java disallows the final state, after joining all threads, of r1 = v = 2
> > and r2 = 0. Since in the end v = 2, Thread 2's assignment to v must have
> > followed Thread 1's in the synchronization order. And in Java a volatile
> > store synchronizes with all later (in synchronization order) volatile loads
> > (Property A). Thus Thread 1 must synchronize with Thread 3, and r2 must be
> > 1.
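
The litmus test above can be written as a small Java harness. This is only a sketch: the class name VolatileLitmus and the helpers runOnce, await, and isDisputed are made up for illustration, and a naive loop like this is a poor stress test. It is unlikely to expose the disputed outcome even where the hardware allows it, and never will on multi-copy-atomic machines such as x86. A dedicated tool such as OpenJDK's jcstress would be the usual choice for real testing.

```java
import java.util.concurrent.CountDownLatch;

public class VolatileLitmus {
    // Shared state for one trial; field names mirror the litmus test.
    static final class State {
        int x;            // plain field
        volatile int v;   // volatile field
        int r1, r2;       // Thread 3's results, read by the caller after join()
    }

    /** Runs one trial and returns {r1, r2, final value of v}. */
    static int[] runOnce() {
        State s = new State();
        CountDownLatch start = new CountDownLatch(1);
        Thread t1 = new Thread(() -> { await(start); s.x = 1; s.v = 1; });
        Thread t2 = new Thread(() -> { await(start); s.v = 2; });
        Thread t3 = new Thread(() -> { await(start); s.r1 = s.v; s.r2 = s.x; });
        t1.start(); t2.start(); t3.start();
        start.countDown();              // release all three threads together
        try {
            t1.join(); t2.join(); t3.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return new int[] { s.r1, s.r2, s.v };
    }

    static void await(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    /** True for the outcome the current spec forbids: r1 = v = 2 and r2 = 0. */
    static boolean isDisputed(int[] outcome) {
        return outcome[0] == 2 && outcome[2] == 2 && outcome[1] == 0;
    }

    public static void main(String[] args) {
        int disputed = 0;
        for (int i = 0; i < 10_000; i++) {
            if (isDisputed(runOnce())) disputed++;
        }
        System.out.println("disputed outcomes: " + disputed);
    }
}
```

Note that r1 = 1 with r2 = 0 is forbidden under either reading of the spec, since in that case Thread 3 observes Thread 1's own volatile store.
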
> >
> > This diverges from the analogous C++ semantics. (The release sequence
> > problem there is a bit different.)
> >
> > The consensus of the experts in the other discussion is that this outcome
> > is in fact allowed on Power, with both of the standard compilation models.
> > Thus the spec and the implementations can't both be right in this regard.
> >
> > IIRC, the JMM discussion that led to this, like the one that led to the
> > vaguely analogous C++ problem, was more of a "why not" argument than
> > anything solid. Which in retrospect was probably unwise in both cases.
> > That, combined with the fact that this is a C++ vs Java divergence, and the
> > expense of actually conforming to the current spec on Power, suggests we
> > may want to call this a spec problem.
> >
> > The concrete proposal would be to change the bullet (in 17.4.4)
> >
> > * A write to a volatile variable v (§8.3.1.4) synchronizes-with all
> > subsequent reads of v by any thread (where "subsequent" is defined
> > according to the synchronization order).
> >
> > to (for now)
> >
> > * A write w to a volatile variable v (§8.3.1.4) synchronizes-with any read
> > of v that observes the value written by w.
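
To make the difference between the two bullets concrete, here is a toy model of the litmus test's synchronization order (Thread 1's store, then Thread 2's store, then Thread 3's load). It is a sketch, not a JMM checker: the class SyncWithRules and all of its methods are hypothetical, and it assumes the only possible happens-before path from Thread 1 to Thread 3 runs through v, which holds for this test.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class SyncWithRules {
    /** A volatile write to v, listed in synchronization order. */
    static final class VolatileWrite {
        final String thread;
        final int value;
        VolatileWrite(String thread, int value) {
            this.thread = thread;
            this.value = value;
        }
    }

    /** Current bullet: every write earlier in synchronization order synchronizes-with the read. */
    static List<String> currentRule(List<VolatileWrite> earlierWrites) {
        List<String> threads = new ArrayList<>();
        for (VolatileWrite w : earlierWrites) {
            threads.add(w.thread);
        }
        return threads;
    }

    /** Proposed bullet: only the write whose value the read observes synchronizes-with it. */
    static List<String> proposedRule(List<VolatileWrite> earlierWrites) {
        if (earlierWrites.isEmpty()) {
            return Collections.emptyList();
        }
        // A volatile read observes the last write before it in synchronization order.
        VolatileWrite observed = earlierWrites.get(earlierWrites.size() - 1);
        return Collections.singletonList(observed.thread);
    }

    /** r2 = 0 is allowed only if Thread 1 does not synchronize-with Thread 3's read. */
    static boolean staleReadAllowed(List<String> synchronizingThreads) {
        return !synchronizingThreads.contains("Thread 1");
    }

    public static void main(String[] args) {
        List<VolatileWrite> order = new ArrayList<>();
        order.add(new VolatileWrite("Thread 1", 1));
        order.add(new VolatileWrite("Thread 2", 2));
        System.out.println("current rule allows r2 = 0:  " + staleReadAllowed(currentRule(order)));
        System.out.println("proposed rule allows r2 = 0: " + staleReadAllowed(proposedRule(order)));
    }
}
```

Under the current bullet Thread 1's store synchronizes-with Thread 3's load, so the model reports that r2 = 0 is not allowed. Under the proposed bullet only Thread 2's store does, so the accesses to x race and r2 = 0 becomes allowed.
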
> >
> > The reason I said "for now" is that I think we will eventually need
> > C++-style "release sequences" in order to prevent intervening RMW
> > operations from breaking the synchronizes-with relationship here. Without
> > that, some fairly basic idioms, like reference counting, would look
> > different in Java and C++, with Java being needlessly slower. But RMW
> > operations aren't yet a thing in the JLS, so we can leave that in the
> > bucket of other things that will eventually need fixing.
> >
> > The argument for doing this now rather than later is that the spec clearly
> > promises something that fails to hold for major implementations. And
> > somewhat uniquely, in this case, we do know how to fix it. There is no
> > reason to provide misleading information here.
> >
> > Opinions?
> >
> > Hans
> >
>
>