RFR: 8069367: assert(_nextMarkBitMap->isMarked((HeapWord*) obj)) failed

Bengt Rutisson bengt.rutisson at oracle.com
Mon Mar 9 13:19:57 UTC 2015


On 2015-03-09 13:24, Thomas Schatzl wrote:
> Hi Bengt,
>
> On Mon, 2015-03-09 at 12:49 +0100, Bengt Rutisson wrote:
>> Hi Kim,
>>
>> On 2015-03-06 19:10, Kim Barrett wrote:
>>> Please review this change to fix a problem in the interaction between
>>> G1 concurrent marking and eager reclaim of humongous objects.
>>>
>>> I will need a sponsor for this change.
>>>
>>> The scenario we are dealing with is
>>> [...]
>>>
>>> The additional test in concurrent marking imposes a small performance
>>> degradation on concurrent marking.  Measurements of a program which
>>> allocates a substantial number of objects and then does nothing but
>>> repeatedly GC shows a fraction of a percent increase in concurrent
>>> mark time, which is well within the variance for even this contrived
>>> test.  Aurora performance comparison showed no significant negative
>>> impact.  Alternatives that preclean the mark stack when humongous
>>> objects are reclaimed get complicated when attempting to do so without
>>> extending the reclaiming evacuation pause.
>> Thanks for providing such a detailed description of the problem and
>> solution!
>>
>> One question. I assume that this situation can only occur if the
>> humongous object was live before the marking started (otherwise it would
>> have already been filtered out since it would have TAMS == bottom) and
>> someone has removed the reference to the humongous object while we were
>> marking.
>>
>> Here's an attempt to show what I mean in a diagram:
>>
>> H = new Humongous();
>> A.h = H;
>> <G1 initial mark>
>> <Marking scans A and pushes H on the mark stack>
>> A.h = null;
>> <G1 young GC>
>> <H is reclaimed since no one references it>
>> <Marking continues and finds H on the mark stack>
>>
>> Is this what is happening? In that case, isn't this violating the SATB
>> invariant that anything that was live when marking started is considered
>> live when it ends?
> Yes. That has already been a concern with the original eager reclaim.
>
>> Your fix will make sure the marking doesn't crash,
>> but doesn't this behavior (even prior to your fix) cause other problems?
> None that I know of. The eager reclaim already made sure that there is no
> other reference from a live object to the reclaimed object on the heap,
> assuming the remembered sets were correct. So nobody else can
> dereference the object.
> There is the mentioned race where mark stacks had some references to
> these objects left.

Ok. So, what happens if I modify the example a little bit?

H = new Object[BIGNUMBER];
H[4711] = new B();
A.h = H;
<G1 initial mark>
<Marking scans A and pushes H on the mark stack>
A.h = A.h[4711];
<G1 young GC>
<H is reclaimed since no one references it>
<Marking continues and finds H on the mark stack but skips it>

Who will discover B and mark it live?
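
To make the question concrete, here is a toy replay of the steps above in Java. It is only a sketch and not HotSpot code: the class name, the `Obj` type, the `reclaimed` flag and `replay()` are all made up for illustration, and the skip in the marking loop is my assumption about what the additional test in the proposed change does. A is simply treated as an already scanned root and is never rescanned.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model only: none of these names correspond to real HotSpot/G1 code.
public class EagerReclaimSketch {

    // A heap object with a name and its outgoing references.
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        boolean reclaimed = false; // set by the (hypothetical) eager reclaim
        Obj(String name) { this.name = name; }
    }

    // Replays the scenario and returns the names of the objects that got marked.
    public static Set<String> replay() {
        Obj a = new Obj("A");
        Obj h = new Obj("H"); // stands in for the humongous Object[]
        Obj b = new Obj("B"); // stands in for H[4711]
        h.refs.add(b);
        a.refs.add(h);        // A.h = H

        // <G1 initial mark>: A is treated as a root in this toy.
        Set<String> marked = new HashSet<>();
        Deque<Obj> markStack = new ArrayDeque<>();
        marked.add(a.name);

        // <Marking scans A and pushes H on the mark stack>
        for (Obj ref : a.refs) {
            markStack.push(ref);
        }

        // A.h = A.h[4711]; A has already been scanned and is not revisited.
        a.refs.set(0, b);

        // <G1 young GC>: H is reclaimed since no one references it.
        h.reclaimed = true;

        // <Marking continues and finds H on the mark stack but skips it>
        while (!markStack.isEmpty()) {
            Obj o = markStack.pop();
            if (o.reclaimed) {
                continue; // assumed shape of the additional test
            }
            if (marked.add(o.name)) {
                for (Obj ref : o.refs) {
                    markStack.push(ref);
                }
            }
        }
        return marked;
    }

    public static void main(String[] args) {
        Set<String> marked = replay();
        System.out.println("B marked? " + marked.contains("B")); // prints "B marked? false"
    }
}
```

In this toy, B is reachable from A at the end but never gets marked, because the only mark stack entry that led to it was H. I left the SATB pre-write barrier out on purpose: as far as I understand it, the barrier would record the overwritten value (H), not B, so I don't see how it would help here.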

Thanks,
Bengt

>
> Ideally for this case, mark stacks would be organized on a per-region
> basis, so you could just drop them during eager reclaim (or whenever a
> region provably becomes empty and unreferenced). I think that is what
> Kim refers to as being "complicated" to do.
>
> The conservative fix would be to disable eager reclaim during marking.
> This has its own disadvantages: Applications where eager reclaim
> matters are often continuously marking. Even if not, marking is only
> done when space is already tight. So disabling eager reclaim during
> marking seems quite counterproductive.
>
> Heap verification should be okay too: while we walk through dead objects
> on the heap (that may still contain references to the reclaimed
> humongous objects), we do not check their references.
>
> Thanks,
>    Thomas
>
>




More information about the hotspot-gc-dev mailing list