Release store in C2 putfield

Fri Sep 5 13:08:54 UTC 2014

I'm trying to disentangle the many interrelated issues here,
mainly wrt the JMM and possible revisions. A few notes/comments:

1. As far as I can see, the G1 post barrier enforces ordering,
but the plain one (GraphKit::write_barrier_post) does not.
On the other hand, some GC barrier-related mechanics seem to
be strewn elsewhere, so might have this effect. In particular,
the release inside Parse::do_put_xxx seems suspicious.
I'd expect CMS, but not the other GCs, to have the
same constraints as G1. I'd also expect the ordering constraints
to sometimes have a significant overall performance cost on ARM and
Power. (The G1 ordering enforcement changes were apparently
performance tested only on TSO machines; see
http://mail.openjdk.java.net/pipermail/hotspot-dev/2013-October/011077.html
and follow-ups).
(Aside: Yet more reasons to hate card-marking.)
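
To make the ordering in point 1 concrete, here is a toy Java model of
a card-marking post barrier. It is only a sketch, not HotSpot code:
the class CardMarkSketch, its fields, and the 512-slot card size are
all made up for illustration. It uses VarHandle.storeStoreFence()
(Java 9+) to show where the StoreStore constraint would sit.

```java
import java.lang.invoke.VarHandle;

// Toy model of a card-marking post barrier; names and sizes are hypothetical.
final class CardMarkSketch {
    static final int CARD_SHIFT = 9;           // assumed: 512 slots per card
    static final byte CLEAN = 0, DIRTY = 1;

    final Object[] heap;                        // stands in for reference fields
    final byte[] cards;                         // stands in for the card table

    CardMarkSketch(int slots) {
        heap = new Object[slots];
        cards = new byte[(slots >>> CARD_SHIFT) + 1];
    }

    // Mutator side: the reference store, then the card mark.
    void putRef(int slot, Object value) {
        heap[slot] = value;                     // the oop store
        VarHandle.storeStoreFence();            // keep the oop store ahead of the card mark
        cards[slot >>> CARD_SHIFT] = DIRTY;     // post barrier: dirty the card
    }

    boolean isDirty(int slot) {
        return cards[slot >>> CARD_SHIFT] == DIRTY;
    }
}
```

A concurrent scanner reading the card table would need matching
ordering on its side; the sketch shows only the mutator half.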

2. Hans Boehm has argued/demonstrated over the years (see, for
example, http://hboehm.info/c++mm/no_write_fences.html) that
StoreStore fences, as opposed to release==(LoadStore|StoreStore)
fences, are too delicate and anomaly-filled to expose as a
programming mode. But there are cases where they may come into
play, for example as the first fence of a volatile-store
(that also requires a trailing StoreLoad), that might be
profitable to separate if any other internal mechanics could
then be applied to further optimize. And even if not generally
useful, they seem to apply to the GC post_barrier case.
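
As a rough illustration of the fence decomposition in point 2, the
following Java sketch spells out a volatile-style store as a leading
fence, a plain store, and a trailing fence, using the static fence
methods on VarHandle (Java 9+). The class and field names are
hypothetical, and plain stores bracketed by fences are not claimed to
be formally equivalent to a volatile store under the JMM; the point is
only where each fence sits.

```java
import java.lang.invoke.VarHandle;

// Hypothetical sketch: a volatile-like store written out as fences.
final class VolatileStoreSketch {
    int data;   // ordinary payload field
    int flag;   // plain field standing in for a volatile one

    // Conventional mapping: release fence, store, then a fence that
    // includes StoreLoad.
    void publish(int d) {
        data = d;
        VarHandle.releaseFence();     // LoadStore|StoreStore
        flag = 1;                     // the store itself
        VarHandle.fullFence();        // trailing fence, covers StoreLoad
    }

    // Weaker variant: only StoreStore ahead of the store. Earlier loads
    // may still be reordered after the store to flag.
    void publishStoreStoreOnly(int d) {
        data = d;
        VarHandle.storeStoreFence();  // StoreStore only
        flag = 1;
        VarHandle.fullFence();
    }
}
```

The second method shows the kind of separation that might be
profitable to expose internally, with the delicacy noted above.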

3. We are indeed strongly considering simplifying the revised
Java Memory Model to require release fencing on construction,
not just in the presence of final fields. In the ideal implementation,
this would require multiple fences (i.e., more than the one
needed anyway to force object header sanity) only if "this"
escapes within a constructor. But because some of these fences are
currently hard-wired and so not amenable to elision/optimization,
carrying this out on non-TSO will probably take some effort.
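
For point 3, a small Java sketch of the case that matters: "this"
escaping inside a constructor. The class EscapeSketch and its fields
are made up for illustration, and the comments about fence placement
describe the idea under discussion, not what any current VM emits.

```java
// Hypothetical example of "this" escaping during construction.
final class EscapeSketch {
    static EscapeSketch shared;    // racy publication point, deliberately not volatile

    final int a;
    int b;

    EscapeSketch(boolean leak) {
        a = 1;
        if (leak) {
            // "this" escapes here. A reader in another thread that loads
            // shared may observe b == 0, since b is not yet written.
            // Under the idea in point 3, an extra release fence would be
            // wanted before this store.
            shared = this;
        }
        b = 2;
        // Without the escape, the single fence at the end of construction
        // would be the only one needed.
    }
}
```

Single-threaded use always sees a == 1 and b == 2; the anomalies only
show up for racing readers on weakly ordered hardware or after
compiler reordering.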

4. Reminder to Andrew: We cannot let the VM crash when people
write racy/wrong code including unsafe publication.

-Doug

On 09/04/2014 09:30 AM, Andrew Haley wrote:
> On 09/04/2014 01:19 PM, Bertrand Delsart wrote:
>> On 04/09/14 12:30, Andrew Haley wrote:
>>> On 09/04/2014 10:30 AM, Bertrand Delsart wrote:
>>>> I'm not a C2 expert but from what I have quickly checked, an issue may
>>>> be that we need StoreStore ordering on some platforms.
>>>>
>>>> This should for instance be true for cardmarking (new stored oop
>>>> must be visible before the card is marked).
>>>
>>> Okay.  I can live with that.  Is there a corresponding read barrier in
>>> the code which scans the card table?
>>
>> See for instance the storeload() in
>> G1SATBCardTableLoggingModRefBS::write_ref_field_work
>
> Okay, thanks, that is tremendously helpful.  I know what to look for
> now.
>
>> There are in fact a lot of other barriers in concurrent card scanning
>> and cleaning (some of them being implicit due to compare and swap
>> operations).
>>
>>>> This may also be true for the oop stores in general, as initially
>>>> discussed. [IMHO this is related to final fields, which have to be
>>>> visible when the object escapes. Barriers at the end of the
>>>> constructors may not be sufficient if objects can escape before
>>>> the end of their init() method.]
>>>
>>> I'm pretty sure we do this correctly.  Are you aware of any place
>>> (except unsafe publication, which is a programmer error) where this
>>> might happen?  We generate a barrier at the end of a constructor if
>>> there is a final field and at the end of object creation.
>>
>> I agree that this is not a good programming style but I'm not sure this
>> can always be considered a programmer error. Do you see anything in the
>> Java specification that forbids publication before the end of object
>> creation?
>
> No, but there's nothing in the Java spec which says that the language
> will protect a programmer from themself.
>
>> For instance, objects may have to be linked at creation time. In
>> general, the publication should be safe because it will hit barriers
>> (because what the object is exported to will often need to be
>> protected). However, I do not think this is mandatory according to the
>> specifications.
>
> That's right.
>
>> Now, the problem is to see what the JMM requires in that case. I'm not
>> 100% sure that a StoreStore is needed here. This is why I said "may not
>> be sufficient". The JSR-133 cookbook has several "(outside of
>> constructor)" statements that might mean it is not needed (if you have a
>> membar at the end of the constructor). However, while I'm familiar with
>> barriers because of my runtime, GC and embedded background, I do not
>> consider myself to be a JMM expert. I will let one chime in.
>>
>> Of course, from a support point of view, it may be easier to add
>> StoreStore semantics on oop stores (should not be too expensive, taking
>> into account the cost of GC barriers and the frequency of oop
>> stores) than to investigate the kind of troubles a strange ordering can
>> lead to and explain to the customer why his Java code must be changed
>> for platforms with weaker memory models.
>
> I think programmers are going to have to get used to it.  The issue of
> safe publication is very well known, especially because of the book
> _Java Concurrency in Practice_.
>
> AIUI the purpose of the JMM is to give a clear definition of the
> memory semantics of Java that can be efficiently executed on a wide
> variety of machines.  We need Java to scale well on machines with many
> cores, and the JMM is a good fit to that.
>
>> Did you measure the performance regression?
>
> Yes.  It is high; but I can't provide any numbers.
>
> Andrew.
>