[RESUMED] RFR: 8158946 - btree009 fails with assert(s > 0) failed: Bad size calculated

Wed Jun 22 19:33:24 UTC 2016

Hi Thomas,

On 6/22/16 3:01 PM, Thomas Schatzl wrote:
> Hi,
>
> On Wed, 2016-06-22 at 14:34 -0400, Derek White wrote:
>> Hi Thomas,
>>
>> Thanks for the comments! Questions below...
>>
>>
> [...]
>>>> Maybe this only causes problems when the object is allocated in
>>>> the old gen (perhaps because it is large).  Is there some other
>>>> path for large arrays, so we don't have a barrier for every array
>>>> allocation? I hope I'm missing something...
>>> I do not know about CMS, but G1's humongous objects do have a
>>> storestore barrier at the correct place (and there should be a
>>> corresponding barrier on the reader side). These are the only
>>> direct old gen allocations G1 ever does.
>> Where is this barrier used? I thought the header setting was done up
>> at CollectedHeap::array_allocate(), outside of G1 code?
> CollectedHeap::array_allocate() is not used for humongous objects, but
> G1CollectedHeap::humongous_obj_allocate().

That can't be right - G1CollectedHeap::humongous_obj_allocate() doesn't 
set the object header (it doesn't even know the Klass). It /clears/ the 
object header, and does the storestore before updating the heap region 
bookkeeping that makes the new object scannable. At that point the new 
object is a valid uninitialized object.

G1CollectedHeap::humongous_obj_allocate()      is called by
   Universe::heap()->mem_allocate()             is called by
    CollectedHeap::common_mem_allocate_noinit() is called by various
      CollectedHeap::XXX_allocate()

But what Kim is concerned about is the ordering of setting the object 
header (lock and klass fields) and setting either the array length or 
the "oop_size" field of a java.lang.Class instance. We (GC) never want 
to see an object with a non-zero klass in the header and an unset array 
length or oop_size. These fields are set up in 
CollectedHeap::post_allocation_install_obj_klass() (and neighbors), but 
there is no ordering enforced between the stores.
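
To make the ordering concern concrete, here is a minimal sketch in 
standard C++ (not HotSpot code). ToyArrayHeader, toy_install() and 
toy_try_read_length() are hypothetical names, and using a null klass to 
mean "not yet parsable" is an assumption of the sketch. The length is 
written first and the klass is published with a release store, so a 
reader that acquire-loads a non-null klass also sees the length:

```cpp
#include <atomic>
#include <cstddef>

// Hypothetical stand-in for an array header; this is NOT HotSpot's layout.
struct ToyArrayHeader {
    std::atomic<const void*> klass{nullptr};  // null: not yet parsable
    std::size_t length{0};
};

// Allocator side: set the length first, then publish the klass with
// release semantics so the length store cannot be ordered after it.
inline void toy_install(ToyArrayHeader& h, const void* k, std::size_t len) {
    h.length = len;
    h.klass.store(k, std::memory_order_release);
}

// Scanner side: acquire-load the klass and read the length only if the
// klass is non-null. Returns false if the object is not yet published.
inline bool toy_try_read_length(const ToyArrayHeader& h, std::size_t& out) {
    if (h.klass.load(std::memory_order_acquire) == nullptr) {
        return false;
    }
    out = h.length;
    return true;
}
```

Whether a release/acquire pair like this is the right fix for the real 
code, or something else is, is exactly the open question here.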

I think we're primarily worried about concurrent GC threads (G1 or CMS) 
seeing these new objects as they are being created. So we aren't 
concerned about young gen objects. There's some evidence that CMS is 
synchronizing access between allocators and concurrent scanners (see 
below), but I don't know if there are similar issues with G1.
>
>>> In any case, as soon as CMS uses this method for old gen
>>> allocation, it needs to have the necessary barriers (obviously).
>> I think for CMS, reading and writing are protected by the
>> cms_space->freelistLock(). For example,
>> the CMS sweeper holds the freelistLock. The Java thread trying to
>> allocate requests, then gets the freelistLock(), and the sweeper
>> re-acquires the freelistLock() before resuming the sweep (and
>> reading).
>>
>> So I'd think that there are plenty of fences for CMS?
> I would imagine that the exact ordering of the reads of these variables
> is important, not necessarily that before or after there are fences.
>
> Additional fences may only decrease the occurrences of this issue.
>
> Of course, if it is the case that both threads synchronize on the free
> list lock for allocation and reading respectively in the old gen, the
> code is fine.
I think this is the case. Certainly for concurrent sweeping.
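
As a rough illustration of why a shared lock would be enough, here is a 
small standard C++ sketch (again not HotSpot code). ToySpace, 
toy_allocate() and toy_sweep() are hypothetical names, and a std::mutex 
stands in for the freelistLock. Because both sides take the same lock, 
everything the allocator wrote before unlocking is visible to the 
sweeper once it has locked:

```cpp
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical model of an old gen space guarded by one lock.
struct ToySpace {
    std::mutex free_list_lock;         // stands in for the freelistLock
    std::vector<std::size_t> lengths;  // one entry per allocated "array"
};

// Allocator side: all writes happen while holding the lock.
inline void toy_allocate(ToySpace& s, std::size_t len) {
    std::lock_guard<std::mutex> guard(s.free_list_lock);
    s.lengths.push_back(len);
}

// Sweeper side: takes the same lock before reading, so it only ever
// sees fully recorded entries. Returns the sum of all lengths.
inline std::size_t toy_sweep(ToySpace& s) {
    std::lock_guard<std::mutex> guard(s.free_list_lock);
    std::size_t total = 0;
    for (std::size_t len : s.lengths) {
        total += len;
    }
    return total;
}
```

This only models the case where the header and length stores all happen 
under the lock; it says nothing about readers that do not take it.
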
>
> Thanks,
>    Thomas
>
