missing memory barrier in acmp with C2

Vitaly Davidovich vitalyd at gmail.com
Wed Oct 26 15:31:05 UTC 2016


On Wednesday, October 26, 2016, Andrew Haley <aph at redhat.com> wrote:

> On 26/10/16 15:02, Vitaly Davidovich wrote:
> > On Wednesday, October 26, 2016, Andrew Haley <aph at redhat.com> wrote:
> >
> >> On 26/10/16 12:27, Roman Kennke wrote:
> >>> On Wednesday, 26.10.2016, at 13:24 +0200, Roland Westrelin wrote:
> >>>> http://cr.openjdk.java.net/~roland/shenandoah/membar-acmp/webrev.00/
> >>>>
> >>>> The code generated for acmp is missing a memory barrier.
> >>>
> >>> Great!
> >>>
> >>>> Should it be a loadstore + loadload as in
> >>>> ShenandoahBarrierSet::asm_acmp_barrier() on aarch64 or simply a
> >>>> loadload?
> >>>
> >>> I can come up with a reason for loadload, but not for loadstore, I
> >>> think loadstore is not necessary there. I'd go for the less restrictive
> >>> fence unless we come up with a good reason not to.
> >>
> >> The general rule is that you can get away with loadload fences if you
> >> really know what you are doing, but it is exceedingly subtle.
> >>
> >> Imagine this.  We have two variables, a boolean x_init and an oop
> >> x.
> >>
> >> Thread 1:
> >> <Initialize x>
> >> x_init.store_release(true);
> >>
> >> Thread 2:
> >> if (x_init.load_acquire())
> >>     x.blah = y
> >>
> >> If you replace the load acquire with a loadload fence, the store of
> >> x.blah can become visible before the initialization of x.
> >
> > x.blah requires a load of x (which cannot reorder with loadload)
>
> x is just a local, and it's in a register.  Where would you even load
> it from?

I don't follow - x is an oop, and x.blah is at (addr of x) + (offset of the
blah field).  You need to load the address of x to figure out the destination
address of the store.  As written in your snippet, the load of x comes after
the loadload.  So what am I missing?
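
To make the reading above concrete, here is a minimal C++ sketch of the
snippet under the assumption that x is a shared pointer that thread 2 loads
after checking the flag.  All names (Obj, payload, publish, try_store_blah)
are hypothetical and only illustrate the shape of the code; this is not the
HotSpot code under review.  Note that C++ has no LoadLoad-only fence, so the
sketch keeps the load acquire from the original snippet.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical object type standing in for the oop "x".
struct Obj {
    int payload = 0;  // written while x is being initialized
    int blah = 0;     // written by the consumer (x.blah = y)
};

std::atomic<bool> x_init{false};   // the boolean flag from the snippet
std::atomic<Obj*> x{nullptr};      // assumption: x lives in shared memory

// Thread 1: <Initialize x>, then x_init.store_release(true).
void publish(Obj* o) {
    o->payload = 42;
    x.store(o, std::memory_order_relaxed);
    x_init.store(true, std::memory_order_release);
}

// Thread 2: if (x_init.load_acquire()) x.blah = y.
// Returns false if x has not been published yet.
bool try_store_blah(int y) {
    if (!x_init.load(std::memory_order_acquire)) {
        return false;
    }
    // The load of x sits after the flag check, and the address of the
    // store below is computed from its result.
    Obj* p = x.load(std::memory_order_relaxed);
    assert(p != nullptr);  // guaranteed by the release/acquire pair
    p->blah = y;
    return true;
}
```

In intended use, publish() and try_store_blah() run on different threads.
If thread 2 instead already held the pointer in a local before the flag
check, there would be no load of x after the check at all, which is the
other reading of the snippet.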

>
> > and it's data dependent; unless you take something like Alpha into
> > account, but that's unsupported anyway.
>
> Please explain.  And, while you're at it, please explain why Hans is
> wrong, or why my interpretation is wrong.

As mentioned above, to get the address of x.blah you need a load of x (or to
have the address available already) - that's a data-dependent load.  AFAIK,
Alpha is the commonly referenced arch that doesn't respect such data-dependent
loads - it can speculate on the address of x and proceed to compute x.blah
ahead of resolving x itself.
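
As a rough illustration of what "data dependent" means here, the sketch
below publishes the pointer itself and has the reader dereference whatever
it loaded.  The names (Obj, payload, publish, read_payload) are hypothetical.
Portable C++ still asks for an acquire load in this pattern; the comment
marks the access whose address depends on the loaded value, which is the
dependency being discussed.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical object type standing in for the oop "x".
struct Obj {
    int payload = 0;  // written before the pointer is published
};

std::atomic<Obj*> x{nullptr};  // assumption: x is published via a shared pointer

// Writer: initialize the object, then publish its address.
void publish(Obj* o) {
    assert(o != nullptr);
    o->payload = 42;
    x.store(o, std::memory_order_release);
}

// Reader: returns -1 if nothing has been published yet.
int read_payload() {
    // Acquire is what portable C++ requires here.  The point made above is
    // that the access below cannot even compute its address until this load
    // has produced a value, unless the hardware speculates on that value.
    Obj* p = x.load(std::memory_order_acquire);
    if (p == nullptr) {
        return -1;
    }
    return p->payload;  // address = p + offset of payload: data dependent on p
}
```

publish() and read_payload() are meant to run on different threads; called
in sequence on one thread they simply show the before and after states.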

I'm not saying anyone is wrong, just trying to identify why you think your
example is valid on archs other than Alpha and the like.

>
> Andrew.
>
>
>



More information about the shenandoah-dev mailing list