[aarch64-port-dev ] RFR(s): 8171449: [aarch64] store_klass needs to use store release

Wed Dec 21 12:22:11 UTC 2016

Hi Andrew,

On Tue, 2016-12-20 at 16:42 +0000, Andrew Haley wrote:
> Hi,
> 
> On 20/12/16 14:40, White, Derek wrote:
> 
> > 
> > Some background. There are two synchronization issues around object
> > initialization:
> > 1) The allocating thread needs to ensure that the object is fully
> > initialized (to default zeros or specified values) before another
> > Java thread can see the object. This case is well handled with
> > memory barriers etc.
> > 
> > 2) A concurrent GC might find an object during heap scanning that
> > has been allocated but not yet initialized. At the least it needs
> > to know the size of the object if it's to reason about it. To
> > enable this, the contract between the runtime and the concurrent
> > collectors is that the length of an array (and 'forgotten case B'),
> > is written before the klass word is installed in the header. If CMS
> > finds an object with a null klass word, it either retries,
> > terminates what it's doing, or uses a back-up method for finding
> > the object size.
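
To make the contract in (2) concrete, here is a minimal sketch using plain C++ atomics rather than HotSpot code. FakeArrayHeader, publish_array and try_read_length are hypothetical names made up for illustration; the real code paths are store_klass on the allocating side and oopDesc::klass_or_null_acquire() on the GC side. The sketch assumes the header starts out zeroed, so a null klass word means "not yet published".

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical stand-in for an array object header; not HotSpot's oopDesc.
struct FakeArrayHeader {
    std::atomic<const void*> klass{nullptr};  // null means "not yet published"
    std::uint32_t length{0};
};

// Allocating thread: write the length first, then publish the klass word
// with release semantics so the length store cannot be reordered after it.
inline void publish_array(FakeArrayHeader& h, const void* k, std::uint32_t len) {
    h.length = len;
    h.klass.store(k, std::memory_order_release);
}

// Concurrent scanner: load the klass word with acquire semantics. Returns
// true and fills in len only when a non-null klass was observed; otherwise
// the caller has to retry, bail out, or use a fallback to size the object.
inline bool try_read_length(const FakeArrayHeader& h, std::uint32_t& len) {
    const void* k = h.klass.load(std::memory_order_acquire);
    if (k == nullptr) {
        return false;
    }
    len = h.length;
    return true;
}
```

With a plain store of the klass word instead of a release store, a weakly ordered CPU would be free to make the klass visible before the length, which is the failure mode under discussion.
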
>
> I've had a look at what was confusing me so much, and I think I have
> found something.  In
> G1CollectedHeap::humongous_obj_allocate_initialize_regions I see
> this:
> 
>   // First, we need to zero the header of the space that we will be
>   // allocating. When we update top further down, some refinement
>   // threads might try to scan the region. By zeroing the header we
>   // ensure that any thread that will try to scan the region will
>   // come across the zero klass word and bail out.
>   //
>   Copy::fill_to_words(new_obj, oopDesc::header_size(), 0);
> 
> ...
> 
>   OrderAccess::storestore();
> 
> > 
> > This was fixed, but discussion led to the point that a compiler or
> > weak memory-model CPU might also write the fields out-of-order, so
> > a series of fixes changed the concurrent GC code to use
> > load-acquires when necessary. This is JDK-8160369 and sub-tasks. See
> > in particular oopDesc::klass_or_null_acquire().
> Sure.
> 
> > 
> > As far as which GCs need to worry about this goes, CMS is clearly
> > in danger with this issue on weak memory model systems. I don't
> > have a definitive answer for G1. Thomas makes a good argument that
> > in G1, concurrent GC would only scan a newly allocated object if it
> > were humongous, and there are enough memory barriers around
> > allocating a humongous region that we should be safe. But there
> > were changes made to G1 to use oopDesc::klass_or_null_acquire(). See
> > http://hg.openjdk.java.net/jdk9/hs/hotspot/rev/1a33f585a889
> > Perhaps these are overly conservative?
> After an object is allocated and before it is zeroed, the object is
> garbage.  This includes the klass field, which probably is non-null.
> There is a window in time between the memory being allocated and the
> klass field being written.  So, I suppose until the klass field is
> written, some memory which is about to become an array must not be
> visible to CMS.  It must not be visible because it must not be
> possible for CMS to see a garbage klass field.

s/CMS/G1?

At that point "top" of the region must still be the same as "bottom" in
this case. To be allocated, a region must have been "Free" beforehand;
we set a region to "Free" and reset its "top" to "bottom" only during
an STW pause, so this state must be visible.

So the GC thread will filter out this card and not scan that area; the
check is at g1RemSet.cpp:668. Also see the comment at line 659.
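
To illustrate the kind of filtering I mean, here is a rough sketch in plain C++. It is not the code at g1RemSet.cpp:668; FakeRegion and card_needs_scan are hypothetical names, and the acquire load on "top" is an assumption of the sketch. The point is only that a card at or above the region's current "top" is never scanned, so memory that has been handed out but not yet covered by "top" is never looked at.

```cpp
#include <atomic>
#include <cstddef>

// Hypothetical model of a heap region; not HotSpot's HeapRegion.
struct FakeRegion {
    std::size_t bottom;             // start address of the region
    std::atomic<std::size_t> top;   // allocation high-water mark

    // A freshly freed region starts out with top == bottom.
    explicit FakeRegion(std::size_t b) : bottom(b), top(b) {}
};

// Returns true only when the card starting at card_start lies inside the
// allocated part of the region, i.e. in [bottom, top). Anything at or
// above top is filtered out and never scanned.
inline bool card_needs_scan(const FakeRegion& r, std::size_t card_start) {
    std::size_t top = r.top.load(std::memory_order_acquire);
    return card_start >= r.bottom && card_start < top;
}
```

While "top" equals "bottom", every card in the region is rejected, so a garbage klass word in that region cannot be observed by the scanning thread.
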

I hope this is the answer to your question.

Hth,
  Thomas