From aleksey.shipilev at oracle.com Thu Jun 16 20:28:05 2016
From: aleksey.shipilev at oracle.com (Aleksey Shipilev)
Date: Thu, 16 Jun 2016 23:28:05 +0300
Subject: [jmm-dev] Specifying VarHandle acquire/release without ill effects
Message-ID: <57630BD5.60004@oracle.com>
Hi,
Our current VarHandle.{getAcquire|setRelease} are oddly specified:
    /**
     * Returns the value of a variable, and ensures that subsequent
     * loads and stores are not reordered before this access.
     * ...
     */
    Object getAcquire(Object... args);
    /**
     * Sets the value of a variable to the {@code newValue}, and ensures
     * that prior loads and stores are not reordered after this access.
     * ...
     */
    void setRelease(Object... args);
The clause about reorderings trips users into believing that getAcquire
and setRelease have fence-like semantics. But in reality, they are
weaker than fences, because optimizers are not obliged to introduce
barriers around the operation to get the needed semantics. This opens up
optimization opportunities that we would not like to lose by
over-specifying VarHandles.
For example, we do not specify volatile semantics in terms of
reorderings, because volatile accesses and associated barriers should be
completely removable here:
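    // Imports assume the jcstress API of the time (2016).
    import org.openjdk.jcstress.annotations.*;
    import org.openjdk.jcstress.infra.results.IntResult2;
    @JCStressTest
    @Outcome(id = "1, 0", expect = Expect.ACCEPTABLE,
             desc = "No happens-before here.")
    @Outcome(expect = Expect.ACCEPTABLE,
             desc = "All other cases are acceptable.")
    @State
    public class VolatilesAreNotBarriers {
        static class Holder {
            volatile int GREAT_BARRIER_REEF;
        }
        int x;
        int y;
        @Actor
        public void actor1() {
            Holder h = new Holder();  // never escapes this method
            x = 1;
            // Volatile load and store on a thread-local object.
            h.GREAT_BARRIER_REEF = h.GREAT_BARRIER_REEF;
            y = 1;
        }
        @Actor
        public void actor2(IntResult2 r) {
            Holder h = new Holder();  // a different, also thread-local, Holder
            r.r1 = y;
            h.GREAT_BARRIER_REEF = h.GREAT_BARRIER_REEF;
            r.r2 = x;
        }
    }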
(A similar example can be built with setRelease/getAcquire)
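A minimal sketch of what that setRelease/getAcquire variant might look like, assuming the JDK 9 VarHandle API. The class name, the `slot` field and the plain (non-jcstress) method signatures are illustrative and not part of the original test. The acquire and release accesses target a Holder that never escapes the method, so no other thread can pair with them:

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

class AcquireReleaseAreNotBarriers {
    static class Holder {
        int slot; // hypothetical field, accessed only through SLOT
    }

    static final VarHandle SLOT;
    static {
        try {
            SLOT = MethodHandles.lookup()
                    .findVarHandle(Holder.class, "slot", int.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    int x;
    int y;

    // Mirrors actor1: the acquire/release pair sits between two plain stores.
    void actor1() {
        Holder h = new Holder(); // thread-local, never published
        x = 1;
        SLOT.setRelease(h, (int) SLOT.getAcquire(h));
        y = 1;
    }

    // Mirrors actor2: returns {r1, r2} instead of filling a jcstress result.
    int[] actor2() {
        Holder h = new Holder(); // a different thread-local Holder
        int r1 = y;
        SLOT.setRelease(h, (int) SLOT.getAcquire(h));
        int r2 = x;
        return new int[] { r1, r2 };
    }
}
```

Because each Holder is confined to one thread, there is never a matching release for the acquire (or vice versa), which is the situation where one would want the accesses and any barriers to be removable.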
Current C2 does leave barriers behind, but that seems to be an
implementation inefficiency: while it purges both Holder instances and
volatile ops, it loses the association between the actual store and the
relevant barrier shortly after parsing.
My duct-taped Graal runs show that Graal seems to eliminate both
instances and associated barriers on x86 (disassembly shows no barriers,
and performance is 10x faster in actor methods). We certainly would not
like to throw compilers under the bus and say this is forbidden.
Therefore, I wonder if this is a better getAcquire specification
(setRelease is symmetric to that):
    /**
     * Returns the value of a variable, with memory semantics similar
     * to a {@code volatile} variable load, but without total ordering.
     *
     * <p>Reads with this access mode are access atomic.
     *
     * <p>Previous writes to the same variable synchronize-with (and
     * therefore happen-before) reads with this access mode, if writes
     * are performed with
     * {@code setVolatile}, {@code setRelease},
     * {@code compareAndExchangeVolatile},
     * {@code compareAndExchangeRelease},
     * {@code compareAndSet},
     * {@code weakCompareAndSetVolatile} or
     * {@code weakCompareAndSetRelease} access modes.
     *
     * <p>Reads with this access mode are not part of total
     * synchronization order. This makes {@code getAcquire} access
     * mode weaker than {@code getVolatile} access mode.
     *
     * ...
     */
    Object getAcquire(Object... args);
Or is there some hazard I am not seeing with a spec like that? Apart from
a creepy disconnect with acq/rel participating in SW, but not in SO,
while SW is spec-ed as the suborder of SO.
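To make the synchronizes-with wording concrete, here is a small sketch of the usual hand-off idiom, assuming the JDK 9 VarHandle API. The class, the field names and the `run` helper are hypothetical. Under the proposed wording, a getAcquire read that observes the value written by setRelease happens-after that write, so the plain write to `data` must be visible:

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

class ReleaseAcquireHandoff {
    int data;  // plain field, published through the flag below
    int ready; // accessed only through READY

    static final VarHandle READY;
    static {
        try {
            READY = MethodHandles.lookup()
                    .findVarHandle(ReleaseAcquireHandoff.class, "ready", int.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Runs one writer (the calling thread) and one reader; returns what the reader saw.
    static int run() {
        ReleaseAcquireHandoff s = new ReleaseAcquireHandoff();
        int[] seen = new int[1];
        Thread reader = new Thread(() -> {
            // Spin until the release write is observed by an acquire read.
            while ((int) READY.getAcquire(s) == 0) {
                Thread.onSpinWait();
            }
            seen[0] = s.data; // ordered after the write of data by happens-before
        });
        reader.start();
        s.data = 42;            // plain write
        READY.setRelease(s, 1); // release write publishes it
        try {
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }
}
```

Nothing in this idiom relies on a total order over the accesses to `ready`; the pairwise synchronizes-with edge is enough.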
Thanks,
-Aleksey
From aph at redhat.com Fri Jun 17 08:08:50 2016
From: aph at redhat.com (Andrew Haley)
Date: Fri, 17 Jun 2016 09:08:50 +0100
Subject: [jmm-dev] Specifying VarHandle acquire/release without ill
effects
In-Reply-To: <57630BD5.60004@oracle.com>
References: <57630BD5.60004@oracle.com>
Message-ID: <5763B012.9090500@redhat.com>
On 16/06/16 21:28, Aleksey Shipilev wrote:
> The clause about reorderings trips users into believing that
> getAcquire and setRelease have fence-like semantics. But in reality,
> they are weaker than fences, because optimizers are not obliged to
> introduce barriers around the operation to get the needed semantics.
I've read this several times now and I still do not understand it.
Release and acquire are well-defined terms in the industry, and I
can't think of any great reason to add more words like yours, which
seem to make things even more confusing than they were. The wording
which is there seems to me to be accurate, at least, as a description
of release and acquire.
Andrew.
From aleksey.shipilev at oracle.com Fri Jun 17 08:34:27 2016
From: aleksey.shipilev at oracle.com (Aleksey Shipilev)
Date: Fri, 17 Jun 2016 11:34:27 +0300
Subject: [jmm-dev] Specifying VarHandle acquire/release without ill
effects
In-Reply-To: <5763B012.9090500@redhat.com>
References: <57630BD5.60004@oracle.com> <5763B012.9090500@redhat.com>
Message-ID: <5763B613.4060801@oracle.com>
On 06/17/2016 11:08 AM, Andrew Haley wrote:
> On 16/06/16 21:28, Aleksey Shipilev wrote:
>
>> The clause about reorderings trips users into believing that
>> getAcquire and setRelease have fence-like semantics. But in reality,
>> they are weaker than fences, because optimizers are not obliged to
>> introduce barriers around the operation to get the needed semantics.
>
> I've read this several times now and I still do not understand it.
> Release and acquire are well-defined terms in the industry,
I dare anyone to produce the consistent industry definition of "acquire"
and "release" :)
> and I can't think of any great reason to add more words like yours,
> which seem to make things even more confusing than they were. The
> wording which is there seems to me to be accurate, at least, as a
> description of release and acquire.
That might be true if you take the *hardware* definition of
acquire/releases, where prohibiting reorderings is surely a plausible
semantics. (There is no telling if CPU vendors will some day describe
their hardware memory models with abstract rules, like high-level
languages do)
But the trouble is with allowing software optimizations, as volatile
example shows. A runtime should be able to elide barriers around
volatiles, if this elision does not violate JMM. In fact, it is
trivially doable when optimizers realize they are dealing with
thread-local volatile ops (e.g. on a non-escaped object, or in
constructor). This is what my original example shows.
Now, VarHandle's acquire/release can be seen as further _relaxations_ of
volatile semantics, when you give up sequential consistency for better
performance. Therefore, the barriers around acquire/releases should also
be elide-able. Therefore, specifying VarHandle's acquire/release with
reorderings is odd.
You can certainly specify that acq/rel inhibit reordering, but this way
we face an even weirder inconsistency. It means that volatiles are *more
optimizable* than acq/rel, which defeats the purpose of having acq/rel
to begin with!
HTHS,
-Aleksey
From nitsanw at yahoo.com Fri Jun 17 13:02:18 2016
From: nitsanw at yahoo.com (Nitsan Wakart)
Date: Fri, 17 Jun 2016 13:02:18 +0000 (UTC)
Subject: [jmm-dev] Specifying VarHandle acquire/release without ill
effects
In-Reply-To: <5763B613.4060801@oracle.com>
References: <57630BD5.60004@oracle.com> <5763B012.9090500@redhat.com>
<5763B613.4060801@oracle.com>
Message-ID: <1537350078.233169.1466168538391.JavaMail.yahoo@mail.yahoo.com>
I think the effect you seek can be achieved by expanding on the acq/rel relationship. E.g.:
    /**
     * Returns the value of a variable, and ensures that subsequent
     * loads and stores are not **VISIBLY** reordered before this
     * access **when matched with a release of same variable**.
     * ...
     */
    Object getAcquire(Object... args);
    /**
     * Sets the value of a variable to the {@code newValue}, and ensures
     * that prior loads and stores are not **VISIBLY** reordered after this
     * access **when matched with an acquire of same variable**.
     * ...
     */
    void setRelease(Object... args);
The wording is lame, but the intent is to make the guarantee depend on the relationship between acq and rel. If the relationship is void so is the guarantee.
This makes the optimization of TL variables, or any unmatched acq/rel if provable, valid.
From martinrb at google.com Sat Jun 18 00:40:18 2016
From: martinrb at google.com (Martin Buchholz)
Date: Fri, 17 Jun 2016 17:40:18 -0700
Subject: [jmm-dev] Specifying VarHandle acquire/release without ill
effects
In-Reply-To: <57630BD5.60004@oracle.com>
References: <57630BD5.60004@oracle.com>
Message-ID:
I looked again at the C++ standard, and it tries hard to specify everything
abstractly via the happens-before order and the total synchronization order
on volatiles. Which they probably more or less copied from Java! It
should not be too hard to amend the Java memory model to incorporate the
concepts of "acquire operation" and "release operation" back from C++. We
already mostly have them! We simply need to introduce acquire operations
and release operations that are not reads/writes of volatiles.
The library API should then simply say things like "this is an acquire
operation" and link to the relevant section of the JLS.
(except maybe for fences - those are special and isolated)
We don't know how to fix some aspects of the memory model, but bringing
acquire/release into the existing model (without talking about reorderings)
seems very achievable.
From dl at cs.oswego.edu Sat Jun 18 11:50:01 2016
From: dl at cs.oswego.edu (Doug Lea)
Date: Sat, 18 Jun 2016 07:50:01 -0400
Subject: [jmm-dev] Specifying VarHandle acquire/release without ill
effects
In-Reply-To: <5763B012.9090500@redhat.com>
References: <57630BD5.60004@oracle.com> <5763B012.9090500@redhat.com>
Message-ID: <57653569.40500@cs.oswego.edu>
On 06/17/2016 04:08 AM, Andrew Haley wrote:
> On 16/06/16 21:28, Aleksey Shipilev wrote:
>
>> The clause about reorderings trips users into believing that
>> getAcquire and setRelease have fence-like semantics.
You are right that we should clarify.
> I've read this several times now and I still do not understand it.
> Release and acquire are well-defined terms in the industry, and I
> can't think of any great reason to add more words like yours,
But maybe some other words. Backing up:
The rationale for the JMM JEP (http://openjdk.java.net/jeps/188)
was to provide rigorous specs for the C/C++11 analogs introduced in
jdk9 VarHandles (http://openjdk.java.net/jeps/193; see current API at
http://download.java.net/java/jdk9/docs/api/java/lang/invoke/VarHandle.html).
But we then discovered that doing this encountered some unsolved
research problems. Mainly that the current C/C++ model relies on rules
saying that programs with data races have no semantics, and does not
prohibit out-of-thin-air reads. And the existing JMM causality rules
that might help address this are known to be broken. Plus there are a few
minor-mismatch snags that do not seem hard to address, including that
plain Java variables are a little stronger than plain C/C++ variables
(for example references cannot be word-torn), but weaker than atomics
in relaxed mode (they are not required to be coherent).
The hope (last year) was that progress would be made on these fronts, but
for the jdk-9 time frame, in the absence of a revised base memory model,
we should use careful English specs that convey intent that would someday
be backed up by a revised memory model.
There has been some progress on the underlying issues. See recent work by
Peter Sewell's group (http://www.cl.cam.ac.uk/~pes20/) on OOTA and
operational semantics, Alan Jeffrey (http://asaj.org/papers/lics16.pdf),
Jade Alglave (http://www0.cs.ucl.ac.uk/staff/J.Alglave/),
Viktor Vafeiadis (http://www.mpi-sws.org/~viktor/), and others.
So it is worth considering whether we can again move forward on
a major JMM revision. Comments would be welcome.
But in the meantime, we should do something to improve wordings that
Aleksey points out could mislead users and implementors in cases
where acquire or release access is optimized away or weakened
because it is thread-local, and so on. Minimally, we could add a
disclaimer that enables at least those transformations and optimizations
applying to volatiles. Maybe just:
The ordering constraints imposed by getAcquire and setRelease are weaker
than, and subsumed by those of getVolatile and setVolatile.
-Doug
From martinrb at google.com Mon Jun 20 17:48:31 2016
From: martinrb at google.com (Martin Buchholz)
Date: Mon, 20 Jun 2016 10:48:31 -0700
Subject: [jmm-dev] Specifying VarHandle acquire/release without ill
effects
In-Reply-To: <57653569.40500@cs.oswego.edu>
References: <57630BD5.60004@oracle.com> <5763B012.9090500@redhat.com>
<57653569.40500@cs.oswego.edu>
Message-ID:
On Sat, Jun 18, 2016 at 4:50 AM, Doug Lea wrote:
>
> But we then discovered that doing this encountered some unsolved
> research problems. Mainly that the current C/C++ model relies on rules
> saying that programs with data races have no semantics ... Plus there are
> a few
> minor-mismatch snags that do not seem hard to address, including that
> plain Java variables are a little stronger than plain C/C++ variables
> (for example references cannot be word-torn), but weaker than atomics
> in relaxed mode (they are not required to be coherent).
>
My mental model is to map plain old Java variables to C++ relaxed atomics,
except for long and double, which could perhaps be mapped to an array of
atomic bytes, technically eliminating all data races. Both C++ and Java
try to prevent OOTA results, and both succeed in practice, if not in theory.
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3710.html
From martinrb at google.com Mon Jun 20 18:14:06 2016
From: martinrb at google.com (Martin Buchholz)
Date: Mon, 20 Jun 2016 11:14:06 -0700
Subject: [jmm-dev] Specifying VarHandle acquire/release without ill
effects
In-Reply-To: <57653569.40500@cs.oswego.edu>
References: <57630BD5.60004@oracle.com> <5763B012.9090500@redhat.com>
<57653569.40500@cs.oswego.edu>
Message-ID:
With the proliferation of weakly ordered operations, our story on
happens-before and sequential consistency seems to have changed a little.
Our discussion on memory operations
http://download.java.net/java/jdk9/docs/api/java/util/concurrent/package-summary.html#MemoryVisibility
only discusses the happens-before relationship (a partial order), but the
memory operations being discussed (e.g. enqueueing and dequeueing from a
concurrent queue) actually satisfy a stronger condition - they are
sequentially consistent and form part of the global total order of all such
operations. It would not be OK to have general purpose data structure
update and read operations to have only release/acquire memory order.
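The difference between a total order and release/acquire ordering is commonly illustrated with the store-buffering litmus test. The sketch below is an assumption-laden illustration (class and method names are hypothetical, JDK 9 VarHandle API assumed): with getVolatile/setVolatile the four accesses are totally ordered, so both threads cannot read 0. If the accesses were weakened to setRelease/getAcquire, that outcome would be permitted under C++-style semantics.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

class StoreBufferingLitmus {
    int x;
    int y;

    static final VarHandle X;
    static final VarHandle Y;
    static {
        try {
            MethodHandles.Lookup l = MethodHandles.lookup();
            X = l.findVarHandle(StoreBufferingLitmus.class, "x", int.class);
            Y = l.findVarHandle(StoreBufferingLitmus.class, "y", int.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Runs the litmus test `rounds` times and counts (0, 0) outcomes.
    static int countBothZero(int rounds) {
        int bothZero = 0;
        for (int i = 0; i < rounds; i++) {
            StoreBufferingLitmus s = new StoreBufferingLitmus();
            int[] r = new int[2];
            Thread a = new Thread(() -> {
                X.setVolatile(s, 1);
                r[0] = (int) Y.getVolatile(s);
            });
            Thread b = new Thread(() -> {
                Y.setVolatile(s, 1);
                r[1] = (int) X.getVolatile(s);
            });
            a.start();
            b.start();
            try {
                a.join();
                b.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
            if (r[0] == 0 && r[1] == 0) {
                bothZero++; // forbidden when all four accesses are volatile
            }
        }
        return bothZero;
    }
}
```

A plain harness like this is far less likely to expose weak outcomes than jcstress would be; it is only meant to show which accesses the guarantee depends on.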
I'm hoping that our definition of "happens-before" will be as close as
possible to the C++ definition, for the sanity of the software industry.
On Sat, Jun 18, 2016 at 4:50 AM, Doug Lea wrote:
>
> But in the meantime, we should do something to improve wordings that
> Aleksey points out could mislead users and implementors in cases
> where acquire or release access is optimized away or weakened
> because it is thread-local, and so on. Minimally, we could add a
> disclaimer that enables at least those transformations and optimizations
> applying to volatiles. Maybe just:
>
> The ordering constraints imposed by getAcquire and setRelease are weaker
> than, and subsumed by those of getVolatile and setVolatile.
From boehm at acm.org Fri Jun 24 04:45:47 2016
From: boehm at acm.org (Hans Boehm)
Date: Thu, 23 Jun 2016 21:45:47 -0700
Subject: [jmm-dev] Specifying VarHandle acquire/release without ill
effects
In-Reply-To:
References: <57630BD5.60004@oracle.com> <5763B012.9090500@redhat.com>
<57653569.40500@cs.oswego.edu>
Message-ID:
There is another important difference between C++ relaxed atomics and Java
ordinary variables. C++ relaxed atomics guarantee that accesses to a
single variable, independent of all other accesses, look sequentially
consistent. I.e. we promise "cache coherence". Java does not for ordinary
variables. And that's pretty critical for compiler optimization on the Java
side. (It also affects fencing on one or two hardware platforms.)
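As an illustration of per-variable coherence on the Java side, the sketch below uses the VarHandle opaque access mode, which is, to my understanding, the closest analogue of a C++ relaxed atomic and is documented as coherently ordered for a single variable. The class and method names are hypothetical. A single writer only ever increases the counter, and the reader checks that it never observes a decrease:

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

class CoherentCounter {
    int value; // accessed only through VALUE

    static final VarHandle VALUE;
    static {
        try {
            VALUE = MethodHandles.lookup()
                    .findVarHandle(CoherentCounter.class, "value", int.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Returns true if the reader never saw the counter go backwards.
    static boolean readerSawMonotonicValues(int increments) {
        CoherentCounter c = new CoherentCounter();
        boolean[] monotonic = { true };
        Thread reader = new Thread(() -> {
            int last = 0;
            while (last < increments) {
                int now = (int) VALUE.getOpaque(c);
                if (now < last) {
                    monotonic[0] = false; // would violate coherence
                    break;
                }
                last = now;
            }
        });
        reader.start();
        for (int i = 1; i <= increments; i++) {
            VALUE.setOpaque(c, i); // single writer, strictly increasing values
        }
        try {
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return monotonic[0];
    }
}
```

With plain reads and writes of `value` instead, the JMM would not promise this property, which is the distinction drawn above.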
From martinrb at google.com Sat Jun 25 00:21:50 2016
From: martinrb at google.com (Martin Buchholz)
Date: Fri, 24 Jun 2016 17:21:50 -0700
Subject: [jmm-dev] Specifying VarHandle acquire/release without ill
effects
In-Reply-To:
References: <57630BD5.60004@oracle.com> <5763B012.9090500@redhat.com>
<57653569.40500@cs.oswego.edu>
Message-ID:
On Thu, Jun 23, 2016 at 9:45 PM, Hans Boehm wrote:
> There is another important difference between C++ relaxed atomics and Java
> ordinary variables. C++ relaxed atomics guarantee that accesses to a
> single variable, independent of all other accesses, look sequentially
> consistent. I.e. we promise "cache coherence". Java does not for ordinary
> variables. And that's pretty critical for compiler optimization on the Java
> side. (It also affects fencing on one or two hardware platforms.)
>
Single variable cache coherence in C++ is unexpected and seems not in the
spirit of C++. I'm happy to keep that feature out of Java.
But I'm still a fan of how acquire/release and fences are defined in C++
(inspired by Java!) and hoping for some reciprocal inspiration.
From sanjoy at playingwithpointers.com Sat Jun 25 04:34:14 2016
From: sanjoy at playingwithpointers.com (Sanjoy Das)
Date: Fri, 24 Jun 2016 21:34:14 -0700
Subject: [jmm-dev] Optimizing external actions in the JMM
Message-ID:
Hi,
I'm stuck trying to model a certain kind of execution in the JMM, and
was wondering if people here can help.
Let's say we have a program like this:
x == y == 0 initially
ThreadA:
r1 = x
r2 = f(r1)
y = r2
ThreadB:
r3 = y
x = r3
where f is some external action that the compiler understands. If the
compiler knows `f` always returns 42 and has no other effect, can it
optimize ThreadA to
ThreadA:
r1 = x
r2 = 42 // f(r1)
y = r2
and then to
ThreadA:
r2 = 42 // f(r1)
y = r2
r1 = x
thereby introducing an OOTA-like value of 42 into the system?
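For readers who want to run the shape of this program, here is a rough Java sketch. It is only an approximation: `f` is an ordinary Java method standing in for the external action, and the class and helper names are hypothetical. The interesting outcome is r1 = r2 = r3 = 42, which requires ThreadA's read of x to be performed after its write to y.

```java
class ExternalActionExample {
    int x; // both fields start at 0
    int y;

    // Stand-in for the external action: always returns 42, no other effect.
    static int f(int ignored) {
        return 42;
    }

    // Runs ThreadA and ThreadB once; returns {r1, r2, r3}.
    static int[] runOnce() {
        ExternalActionExample s = new ExternalActionExample();
        int[] r = new int[3];
        Thread a = new Thread(() -> {
            int r1 = s.x;
            int r2 = f(r1);
            s.y = r2;
            r[0] = r1;
            r[1] = r2;
        });
        Thread b = new Thread(() -> {
            int r3 = s.y;
            s.x = r3;
            r[2] = r3;
        });
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return r;
    }
}
```

Whatever the interleaving or reordering, r2 is 42 and r1 and r3 are each either 0 or 42; the question in this message is how to justify the all-42 result when `f` is opaque to the memory model.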
Given how the JMM is phrased, in cases like say, f(x) == (x - x) + 42,
(i.e. `f` is _not_ an external action) I'd proceed to justify r1 = r2 = r3 = 42
by:
1. Committing the initial writes of 0 to x and y in E0
2. Adding the load of r1 as 0 to E1 (but not committing it) and
committing the write of 42 to y (justified since f(0) = 42) in E1
3. Nothing in E2
4. Now the load from y == r3 can see the write committed to y in E1,
so commit the read r3 and the write of 42 to x in E3
5. Nothing in E4
6. In E5, finally commit the read from x == r1, which can see 42 now
However, if `f` is a side-effecting operation, then I don't see how I
can reason about `f(0)` in step 2 inside the memory model. As far as
the JMM goes, `f` is a black box, and the only thing known about it
given the trace (with the OOTA-like behavior) is that "f(42) == 42".
This is fine for the vast majority of external actions that are black
boxes to the compiler also. But `f` could be a routine in (say)
DirectByteBuffer that touches native memory under the hood; and the
compiler could very well be in a position to fold loads from native
memory (that aren't modeled in the JMM AFAICT, and thus appear as
"external actions").
Another potentially problematic definition for `f(x)` is:
f(x) =
// JNI_GetEnv is the "external action" here but the JIT
// knows that no loaded class calls `setenv`, so it is safe to assume
// that JNI_GetEnv is invariant for an invariant argument.
String EnvVar = "VAR_" + Integer.toString(x);
if (JNI_GetEnv(EnvVar) == JNI_GetEnv(EnvVar))
return 42;
return 90;
It is perhaps not too unreasonable to teach the compiler to fold `f`
above to "return 42", and then optimize ThreadA to enable "r1 = r2 =
r3 = 42". But justifying this in the JMM is difficult -- I'll end up
having to justify that f(0) is 42, and that is difficult to do just by
considering reads, writes and results of external actions already
present in the trace.
It seems to me that this isn't just a problem with the current JMM,
but is a general issue with a JMM that is phrased as a predicate on a
program trace -- a predicative memory model prevents the compiler from
optimizing external actions as first class entities. Is that
accurate, or am I off base?
Thank you for your time!
-- Sanjoy
From boehm at acm.org Sat Jun 25 07:42:55 2016
From: boehm at acm.org (Hans Boehm)
Date: Sat, 25 Jun 2016 00:42:55 -0700
Subject: [jmm-dev] Specifying VarHandle acquire/release without ill
effects
In-Reply-To:
References: <57630BD5.60004@oracle.com> <5763B012.9090500@redhat.com>
<57653569.40500@cs.oswego.edu>
Message-ID:
I disagree somewhat. If you expect a variable will be used in a racy way,
then there's a fairly strong argument that programmers expect single
variable cache coherence, and you should try to provide it. Especially if
it doesn't require special hardware instructions, which it doesn't on most
architectures. Programmers just don't expect a counter that's only
modified by getting incremented by a single thread to decrease.
The situation is different if the variable is expected to be used in a
race-free way, where the cache coherence guarantee buys you nothing at some
performance cost. I think this is one of several reasons not to conflate
race-free accesses with unordered atomic accesses.
From boehm at acm.org Sat Jun 25 07:48:06 2016
From: boehm at acm.org (Hans Boehm)
Date: Sat, 25 Jun 2016 00:48:06 -0700
Subject: [jmm-dev] Specifying VarHandle acquire/release without ill
effects
In-Reply-To: <5763B613.4060801@oracle.com>
References: <57630BD5.60004@oracle.com> <5763B012.9090500@redhat.com>
<5763B613.4060801@oracle.com>
Message-ID:
I think C++'s definitions of acquire/release, as refined by the work by
Mark Batty et al, are in pretty good shape. But I strongly suspect they
are incompatible with Java's notion of a total synchronization order. That
order is not very meaningful for acquire/release operations. Specifying
acquire/release correctly in a Java context does require some work.
From martinrb at google.com Mon Jun 27 23:39:42 2016
From: martinrb at google.com (Martin Buchholz)
Date: Mon, 27 Jun 2016 16:39:42 -0700
Subject: [jmm-dev] Specifying VarHandle acquire/release without ill
effects
In-Reply-To:
References: <57630BD5.60004@oracle.com> <5763B012.9090500@redhat.com>
<57653569.40500@cs.oswego.edu>
Message-ID:
In Java even plain variables default to atomic, so the cost of guaranteeing
special properties for atomics would be much higher than in C++ where
atomic<> will be rare. Would it mean that plain fields can no longer be
cached in registers?
It is surprising to discover that relaxed atomics provide a kind of
sequential consistency, but restricted to a single memory location. It may
be the right decision for C++.
From martinrb at google.com Mon Jun 27 23:50:39 2016
From: martinrb at google.com (Martin Buchholz)
Date: Mon, 27 Jun 2016 16:50:39 -0700
Subject: [jmm-dev] Specifying VarHandle acquire/release without ill
effects
In-Reply-To:
References: <57630BD5.60004@oracle.com> <5763B012.9090500@redhat.com>
<5763B613.4060801@oracle.com>
Message-ID:
(I'm not qualified to revise the memory model, but I can handwave ...)
On Sat, Jun 25, 2016 at 12:48 AM, Hans Boehm wrote:
> I think C++'s definitions of acquire/release, as refined by the work by
> Mark Batty et al, are in pretty good shape. But I strongly suspect they
> are incompatible with Java's notion of a total synchronization order.
https://docs.oracle.com/javase/specs/jls/se8/html/jls-17.html#jls-17.4.6
Current memory model has:
- synchronization order, which is a total order over all synchronization
actions
- synchronizes-with, a partial order over synchronization actions
- happens-before, a partial order over actions
and (hand-waving) "all we need to do" is allow synchronizes-with to also
apply to actions that are not synchronization actions (acquires and
releases)
From boehm at acm.org Tue Jun 28 00:20:20 2016
From: boehm at acm.org (Hans Boehm)
Date: Mon, 27 Jun 2016 17:20:20 -0700
Subject: [jmm-dev] Specifying VarHandle acquire/release without ill
effects
In-Reply-To:
References: <57630BD5.60004@oracle.com> <5763B012.9090500@redhat.com>
<5763B613.4060801@oracle.com>
Message-ID:
I would probably define synchronization actions to include acquire and
release operations, and rename synchronization order to something else (SC
order?), and ensure that it includes only SC operations. That's what C++
currently does, effectively.
From martinrb at google.com Tue Jun 28 01:05:40 2016
From: martinrb at google.com (Martin Buchholz)
Date: Mon, 27 Jun 2016 18:05:40 -0700
Subject: [jmm-dev] Specifying VarHandle acquire/release without ill
effects
In-Reply-To:
References: <57630BD5.60004@oracle.com> <5763B012.9090500@redhat.com>
<5763B613.4060801@oracle.com>
Message-ID:
On Mon, Jun 27, 2016 at 5:20 PM, Hans Boehm wrote:
> I would probably define synchronization actions to include acquire and
> release operations, and rename synchronization order to something else (SC
> order?), and ensure that it includes only SC operations. That's what C++
> currently does, effectively.
>
Reading the C++ standard, I see:
- they use "synchronization operations" instead of "synchronization
actions". We should adopt the former, incorporating the release/acquire
operations as they do.
- they don't have a real name for "synchronization order"; they just give
it a temporary name """There shall be a single total order S on all
memory_order_seq_cst operations"""