[RESUMED] RFR: 8158946 - btree009 fails with assert(s > 0) failed: Bad size calculated
Derek White
derek.white at oracle.com
Mon Jun 27 14:10:25 UTC 2016
I'd like to split out the memory fence issue from the race fixed by this
webrev. I think the fence issue may require more performance testing and
several attempts to get something satisfactory.
New bug created:
JDK-8160369 <https://bugs.openjdk.java.net/browse/JDK-8160369>
Memory fences needed around setting and reading object lengths.
How do reviewers feel about this patch to fix the initial race condition?
- Derek
On 6/22/16 4:38 PM, Thomas Schatzl wrote:
> Hi,
>
> On Wed, 2016-06-22 at 15:33 -0400, Derek White wrote:
>> Hi Thomas,
>>
>> On 6/22/16 3:01 PM, Thomas Schatzl wrote:
>>> Hi,
>>>
> [...]
>>> CollectedHeap::array_allocate() is not used for humongous objects,
>>> but
>>> G1CollectedHeap::humongous_obj_allocate().
>>
>> That can't be right - G1CollectedHeap::humongous_obj_allocate()
>> doesn't set the object header (it doesn't even know the Klass). It
>> clears the object header, and does the storestore before updating the
>> heap region bookkeeping that makes the new object scannable. At that
>> point the new object is a valid uninitialized object.
>>
>> G1CollectedHeap::humongous_obj_allocate() is called by
>> Universe::heap()->mem_allocate() is called by
>> CollectedHeap::common_mem_allocate_noinit() is called by various
>> CollectedHeap::XXX_allocate()
>>
>> But what Kim is concerned about is the ordering of setting the object
>> header (lock and klass fields) and setting either the array length or
>> the "oop_size" field of a java.lang.Class instance. We (GC) never
>> want to see an object with a non-zero klass in the header and an
>> unset array length or oop_size. These fields are set up in
>> CollectedHeap::post_allocation_install_obj_klass() (and neighbors),
>> but there is no ordering enforced between the stores.
> Bug? I mean, if the comment already says it needs this ordering, and
> there is no explicit barrier, it's a bug in any case. At least on non-
> TSO platforms the cpu will reorder the stores, and on others there is
> still the issue with potential compiler reordering.
>
> Guess we were kind of lucky? Or one of these spurious crash reports we
> have is caused by this?
>
>> I think we're primarily worried by concurrent GC threads (G1 or CMS)
>> seeing these new objects as they are being created. So we aren't
>> concerned about young gen objects. There's some evidence that CMS is
>> synchronizing access between allocators and concurrent scanners (see
>> below), but I don't know if there are similar issues with G1.
> There unfortunately seem to be quite a few potentially concurrent
> callers (refinement - scanning the card, marking - advancing the
> finger) of oopDesc::size_given_klass() and I can't convince myself we
> have an address dependency on klass in that method for array size
> calculation (which would imply a loadload barrier on the platforms
> needed) - other objects seem fine, as they read a variable based off
> the klass pointer.
>
> (And on TSO systems again it did not matter except for the potential
> compiler reordering, which seems almost impossible here)
>
> But please check that yourselves too.
>
> Thanks,
> Thomas
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/hotspot-gc-dev/attachments/20160627/3d0b6cee/attachment.htm>
More information about the hotspot-gc-dev
mailing list