[jmm-dev] jdk9 APIs [Fences specifically]

Paul E. McKenney paulmck at linux.vnet.ibm.com
Fri Aug 14 12:39:04 UTC 2015

On Thu, Aug 13, 2015 at 04:56:16PM -0700, Hans Boehm wrote:
> On Thu, Aug 13, 2015 at 4:19 PM, Doug Lea <dl at cs.oswego.edu> wrote:
> >
> > On 08/13/2015 05:04 PM, Hans Boehm wrote:
> >
> >> I don't think a fence-based approach works.  Deferring all the stores to
> >> the end of the loop fundamentally remains correct, even with the
> >> StoreStore fence, since it's consistent with the producer just running
> >> very fast for a while. The constraint you're trying to enforce has
> >> nothing to do with ordering.
> >
> >
> > I must be missing something fundamental about C++ specs. Are C++
> > compilers allowed to ignore release fences in between writes
> > to the same variables? In unrolled form, that's what this would
> > amount to here.
>
> I think that's unavoidable.  If I write
>
> for (...) {
>     x = something_expensive();
>     fence;
> }
>
> it's very hard to prevent the implementation from implementing that as
>
> <pause for a while>
> <run the above loop instantaneously and atomically>

As in placing an unrolled version of the loop into a hardware sort-of
transaction, like the ones available on recent x86, Power, and mainframe
systems?  If that is what you are getting at, this scenario sounds like
an excellent argument for having some way to tell the compiler not to
make that transformation.  Perhaps the fence above should be interpreted
as a request to bound any such transactions.
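
For concreteness, here is a minimal Java sketch of the loop under
discussion.  It assumes the JDK 9 static method VarHandle.storeStoreFence()
as the "fence"; the class name, the field x, and somethingExpensive() are
hypothetical stand-ins, not anything proposed in this thread.  The fence
constrains the order of the stores, but nothing in the sketch says how soon
each store must become visible, which is the gap being argued about.

```java
import java.lang.invoke.VarHandle;

public class FencedProducerSketch {
    // Plain (non-volatile) shared field, standing in for "x" in the quoted loop.
    static int x;

    // Hypothetical stand-in for something_expensive().
    static int somethingExpensive(int i) {
        return i * 31 + 7;
    }

    // Stores to x once per iteration, with a StoreStore fence after each store.
    // Returns the last value stored so callers can check the result.
    static int produce(int iterations) {
        for (int i = 0; i < iterations; i++) {
            x = somethingExpensive(i);
            // Orders this store before later stores.  It does not, by itself,
            // promise when the store becomes visible to another thread.
            VarHandle.storeStoreFence();
        }
        return x;
    }

    public static void main(String[] args) {
        System.out.println(produce(1000));
    }
}
```

A single-threaded run only shows that the loop compiles and computes the
last value; it cannot demonstrate, one way or the other, whether an
implementation delays or batches the intermediate stores.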

> And that looks exactly like merging all the stores into one.

It would, but don't expect any real-time programmers to thank you
for doing that to their code.  ;-)

> I think you can't really disallow the latter without disallowing a
> time-slicing uniprocessor scheduler.  They're behaviorally identical.

If you are excluding performance and response-time characteristics
from "behaviorally", I can't argue.  Again, don't expect any thanks
from real-time programmers.

> That doesn't prevent us from providing the compiler with advice to
> discourage that.  But I think this doesn't have anything to do with fences.

Volatile fences, then?  ;-)

> >> Aside from not working correctly, you end up slowing down ARM code in
> >> ways that are entirely unnecessary, by inserting "dmb ishld" or
> >> "dmb ishst" fences everywhere.  (How expensive they are varies.  On a
> >> number of implementations they basically seem to be full fences.)
> >
> >
> > Right. It does put the programmer in control though; for example
> >   if ((i % 100) == 99) storeStoreFence()
>
> Agreed.  But it seems to me that this is control over an unnecessary
> trade-off.  Clearly the ideal code involves no fences.  And in most cases,
> just having the programmer specify where the stores to shared variables
> should go, and having the compiler leave that alone, seems like a better
> and simpler way to control this.

In all cases?
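
To make the two options concrete, here is a rough Java sketch placing them
side by side.  periodicFence() follows Doug's "(i % 100) == 99" example
using the JDK 9 VarHandle.storeStoreFence().  opaqueStores() is only one
possible reading of "specify where the stores should go": it assumes
VarHandle.setOpaque() as the marking mechanism, which this thread does not
name.  The class, field, and work() helper are hypothetical.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

public class PublishChoicesSketch {
    // Plain shared field written by the producer loops below.
    static int x;

    // VarHandle for x, needed for the opaque-access variant.
    static final VarHandle X;
    static {
        try {
            X = MethodHandles.lookup()
                    .findStaticVarHandle(PublishChoicesSketch.class, "x", int.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Hypothetical stand-in for the per-iteration work.
    static int work(int i) {
        return i + 1;
    }

    // Option 1: plain stores, with a StoreStore fence every 100th iteration.
    // Returns how many fences were executed.
    static int periodicFence(int iterations) {
        int fences = 0;
        for (int i = 0; i < iterations; i++) {
            x = work(i);
            if ((i % 100) == 99) {
                VarHandle.storeStoreFence();
                fences++;
            }
        }
        return fences;
    }

    // Option 2 (an assumption): mark each store itself, here with opaque
    // access mode, and use no explicit fences.  Returns the last value read back.
    static int opaqueStores(int iterations) {
        for (int i = 0; i < iterations; i++) {
            X.setOpaque(work(i));
        }
        return (int) X.getOpaque();
    }

    public static void main(String[] args) {
        System.out.println(periodicFence(1000));
        System.out.println(opaqueStores(1000));
    }
}
```

Whether either form actually bounds communication latency is exactly the
open question here; the sketch only shows where the programmer would
express the intent in each style.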

							Thanx, Paul

> Hans
> >
> > Considering that the goal is communication latency reduction at
> > the expense of throughput, only the programmer would be able
> > to make these tradeoffs.
> >
> > -Doug
> >
> >
