[concurrency-interest] RFR: 8065804: JEP 171: Clarifications/corrections for fence intrinsics
Oleksandr Otenko
oleksandr.otenko at oracle.com
Tue Dec 9 20:04:16 UTC 2014
On 26/11/2014 02:04, Hans Boehm wrote:
> To be concrete here, on Power, loads can normally be ordered by an
> address dependency or a light-weight fence (lwsync). However, neither
> is enough to prevent the questionable outcome for IRIW, since neither
> ensures that the stores in T1 and T2 will be made visible to other
> threads in a consistent order. That outcome can be prevented by using
> heavyweight fence (sync) instructions between the loads instead.
Why would they need fences between loads instead of syncing the order of
stores?
Alex
> Peter Sewell's group concluded that to enforce correct volatile
> behavior on Power, you essentially need a heavyweight fence between
> every pair of volatile operations. That cannot be understood
> based on simple ordering constraints.
>
> As Stephan pointed out, there are similar issues on ARM, but they're
> less commonly encountered in a Java implementation. If you're lucky,
> you can get to the right implementation recipe by looking only at
> reordering, I think.
>
>
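[For concreteness, here is the IRIW shape in question rendered in Java
with volatile fields; a minimal sketch, with class and field names that
are illustrative rather than from any JDK source. The Java memory model
requires volatile accesses to appear sequentially consistent, so the
outcome r1 == 1, r2 == 0, r3 == 1, r4 == 0 must be impossible, and that
is exactly what dependency ordering or lwsync alone cannot guarantee on
Power:

    class IRIW {
        volatile int foo, bar;   // both initially 0
        int r1, r2, r3, r4;      // results gathered per reader thread

        void t1() { foo = 1; }              // writer 1
        void t2() { bar = 1; }              // writer 2
        void t3() { r1 = bar; r2 = foo; }   // reader 1
        void t4() { r3 = foo; r4 = bar; }   // reader 2

        // Forbidden by the JMM: r1 == 1 && r2 == 0 && r3 == 1 && r4 == 0,
        // i.e. the two readers seeing the two stores in opposite orders.
    }
]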
> On Tue, Nov 25, 2014 at 4:36 PM, David Holmes
> <davidcholmes at aapt.net.au> wrote:
>
> Stephan Diestelhorst writes:
> >
> > David Holmes wrote:
> > > Stephan Diestelhorst writes:
> > > > On Tuesday, 25 November 2014 at 11:15:36, Hans Boehm wrote:
> > > > > I'm no hardware architect, but fundamentally it seems to me that
> > > > >
> > > > > load x
> > > > > acquire_fence
> > > > >
> > > > > imposes a much more stringent constraint than
> > > > >
> > > > > load_acquire x
> > > > >
> > > > > Consider the case in which the load from x is an L1 hit, but a
> > > > > preceding load (from, say, y) is a long-latency miss. If we
> > > > > enforce ordering by just waiting for completion of prior
> > > > > operations, the former has to wait for the load from y to
> > > > > complete, while the latter doesn't. I find it hard to believe
> > > > > that this doesn't leave an appreciable amount of performance on
> > > > > the table, at least for some interesting microarchitectures.
> > > >
> > > > I agree, Hans, that this is a reasonable assumption. Load_acquire x
> > > > does allow roach motel, whereas the acquire fence does not.
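[A sketch of the two shapes in Java terms, using the Unsafe.loadFence()
intrinsic that JEP 171 adds; the reflective lookup of Unsafe is standard
boilerplate, and the class and field names are illustrative only:

    import java.lang.reflect.Field;
    import sun.misc.Unsafe;

    class AcquireShapes {
        private static final Unsafe U;
        static {
            try {
                Field f = Unsafe.class.getDeclaredField("theUnsafe");
                f.setAccessible(true);
                U = (Unsafe) f.get(null);
            } catch (ReflectiveOperationException e) {
                throw new ExceptionInInitializerError(e);
            }
        }

        int x, y, data;

        int loadThenAcquireFence() {
            int a = y;      // a preceding load; possibly a long-latency miss
            int v = x;      // an L1 hit
            U.loadFence();  // "load x; acquire_fence": constrains *all*
                            // earlier loads, so an implementation that
                            // enforces this by waiting for completion must
                            // also wait for the slow load of y
            return a + v + data;
        }

        // A per-access "load_acquire x" would constrain only accesses
        // ordered after the load of x itself; the slow load of y could
        // complete on its own schedule. 2014-era Java has no per-access
        // acquire load, which is the roach-motel flexibility given up
        // by the fence.
    }
]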
> > > >
> > > > > In addition, for better or worse, fencing requirements on at
> > > > > least Power are actually driven as much by store atomicity
> > > > > issues as by the ordering issues discussed in the cookbook. This
> > > > > was not understood in 2005, and unfortunately doesn't seem to be
> > > > > amenable to the kind of straightforward explanation given in
> > > > > Doug's cookbook.
> > > >
> > > > Coming from a strongly ordered architecture to a weakly ordered
> > > > one myself, I also needed some mental adjustment about store
> > > > (multi-copy) atomicity. I can imagine others will be unaware of
> > > > this difference, too, even in 2014.
> > >
> > > Sorry, I'm missing the connection between fences and multi-copy
> > > atomicity.
> >
> > One example is the classic IRIW: stores that are not multi-copy
> > atomic, combined with loads that are ordered (say, through an address
> > dependency):
> >
> >   Memory: foo = bar = 0
> >
> >   _T1_          _T2_          _T3_                          _T4_
> >   st (foo),1    st (bar),1    ld r1, (bar)                  ld r3, (foo)
> >                               <addr dep / local "fence">    <addr dep>
> >                               ld r2, (foo)                  ld r4, (bar)
> >
> > You may observe r1 = 1, r2 = 0, r3 = 1, r4 = 0 on non-multi-copy-atomic
> > machines. On TSO boxes this is not possible. That means the memory
> > fence that prevents such behaviour (DMB on ARM) needs to carry some
> > additional oomph to ensure multi-copy atomicity, or rather to prevent
> > you from observing its absence (which is the same thing).
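[To connect this to Alex's earlier question: the readers are where the
fix has to go, because it is the readers that must agree on an order for
two independent stores. A sketch of reader T3, extending the Unsafe-based
class above; this models the litmus test with plain fields plus an
explicit fence, which is not how HotSpot actually compiles volatiles:

    // Reader T3 from the litmus test, with a full fence between its loads.
    void t3() {
        r1 = bar;       // plain load
        U.fullFence();  // JEP 171 full bidirectional fence; on Power this
                        // corresponds to sync, whose cumulativity forces a
                        // consistent view of the two independent stores.
                        // An address dependency or loadFence() (lwsync
                        // style) here is not enough.
        r2 = foo;       // plain load
    }
    // T4 needs the same fence between its loads. Fencing the writers
    // cannot help: each writer performs only a single store, so there is
    // nothing on the store side to order.
]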
>
> I take it as given that any code for which you may have ordering
> constraints must first have basic atomicity properties for loads and
> stores. I would not expect any kind of fence to add multi-copy
> atomicity where there was none.
>
> David
>
> > Stephan