RFR: 8069367: assert(_nextMarkBitMap->isMarked((HeapWord*) obj)) failed

Thu Mar 12 21:12:32 UTC 2015

On Mar 12, 2015, at 4:32 AM, Bengt Rutisson <bengt.rutisson at oracle.com> wrote:
> 
> 
> Hi Kim and Thomas,
> 
> On 2015-03-11 22:16, Kim Barrett wrote:
>> Here’s the new webrev:
>> 
>> Full:
>> http://cr.openjdk.java.net/~kbarrett/8069367/webrev.01/
>> 
>> Incremental:
>> http://cr.openjdk.java.net/~kbarrett/8069367/webrev.01.incr/
>> 
>> I haven’t done a full Aurora performance run with this new version, but the synthetic benchmark
>> Thomas gave me isn’t showing any degradation.  (In fact, once again today, the new version
>> is running a tiny bit faster on that benchmark, i.e. the real difference is pretty much in the noise.)
> 
> I've been thinking a bit about the approach to solve the mark issue and I have to admit that I don't really have a good feeling about how this is currently solved. The eager reclamation knowledge leaks into a lot of places and we don't have any explicit way of checking what has actually happened. All we can do is guess that it was eager reclamation that caused things to look unexpected. To me this is very fragile and might even hide other real problems that cause similar symptoms.
> 
> Can we find some other way of handling this more explicitly?

I have a somewhat similar reaction.  One thing to keep in mind is that
we probably want to backport a fix for this to 8u60.

> One thought I had was whether we can note the fact that we push a humongous object onto a mark stack. Then the young GC can just check if any mark stack contains a humongous object and in that case either skip reclaiming that object or go and fix the mark stack after it has been reclaimed.

That's easy to say, but I think not so easy to do in an acceptable
way. A key point is that we don't want to increase pause times to deal
with this. Simply looking through the mark stacks (global and
CMTask-local) as part of reclaiming a humongous object probably
doesn't cut it.

Thomas and I have spent some time talking about alternatives.

Thomas suggested splitting up the currently single global mark stack
to be per-region.  That would allow easily discarding the associated
(at most one) mark stack entry when eagerly reclaiming a humongous
object.  It might have other locality benefits too.
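
As a rough illustration of the per-region idea (a single-threaded
sketch, not HotSpot code): PerRegionMarkStacks, its methods, and the
base/shift address arithmetic are all hypothetical names and
assumptions made up for this example. It only shows why discarding
becomes cheap once entries are keyed by region.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical: one small mark stack per heap region instead of a
// single global stack. No locking or overflow handling is modelled.
class PerRegionMarkStacks {
  std::vector<std::vector<std::uintptr_t>> _stacks;
  std::uintptr_t _heap_base;
  std::size_t _region_shift;  // log2(region size in bytes), assumed

  std::size_t region_index(std::uintptr_t addr) const {
    assert(addr >= _heap_base);
    std::size_t idx = (addr - _heap_base) >> _region_shift;
    assert(idx < _stacks.size());
    return idx;
  }

public:
  PerRegionMarkStacks(std::uintptr_t heap_base, std::size_t num_regions,
                      std::size_t region_shift)
    : _stacks(num_regions), _heap_base(heap_base),
      _region_shift(region_shift) {}

  // Push an object address onto the stack of the region containing it.
  void push(std::uintptr_t obj) { _stacks[region_index(obj)].push_back(obj); }

  // Pop from the first non-empty stack. A linear scan is used only to
  // keep the sketch short.
  bool pop(std::uintptr_t& obj) {
    for (auto& s : _stacks) {
      if (!s.empty()) {
        obj = s.back();
        s.pop_back();
        return true;
      }
    }
    return false;
  }

  // Eager reclaim of a humongous object: drop whatever is recorded for
  // its start region. Returns the number of entries discarded.
  std::size_t discard_region(std::size_t region) {
    assert(region < _stacks.size());
    std::size_t n = _stacks[region].size();
    _stacks[region].clear();
    return n;
  }
};
```

With this layout the reclaim path touches only the stack for the
start region, rather than searching a shared stack.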

That would still leave the CMTask-local entries to deal with.  Those
might be addressed by purging them at CMTask abort time, transferring
their contents back to the global mark stack(s).  That introduces some
additional delay and possibly lock contention to that abort process,
and some startup cost to pull data back in when the task is continued.

Rather than purging at abort, we could filter the CMTask-local entries
at continuation.  That would require some care to deal with one task
attempting to steal entries from another that hasn't completed the
filtering process.  However, anything along this line still requires a
filtering predicate, e.g. is_stale_humongous_marked_entry.
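
A minimal sketch of filtering at continuation, under heavy
assumptions: TaskLocalQueue, filter_at_continuation and try_steal are
hypothetical names, the queue is a plain vector, and everything is
single-threaded. A real task queue would need atomic operations where
this uses a plain bool.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a CMTask-local queue.
struct TaskLocalQueue {
  std::vector<std::uintptr_t> entries;
  // False from the point where a pause may have reclaimed something
  // until the owning task has run its filtering pass.
  bool filtered = true;
};

// Run by the owning task when it continues: drop every entry the
// predicate reports as stale. Returns the number of entries dropped.
template <typename Pred>
std::size_t filter_at_continuation(TaskLocalQueue& q, Pred is_stale) {
  std::size_t before = q.entries.size();
  auto first_stale =
      std::remove_if(q.entries.begin(), q.entries.end(), is_stale);
  std::size_t dropped =
      static_cast<std::size_t>(q.entries.end() - first_stale);
  q.entries.erase(first_stale, q.entries.end());
  assert(q.entries.size() + dropped == before);
  q.filtered = true;
  return dropped;
}

// A thief refuses to take from a queue that has not been filtered yet,
// so a stale entry cannot escape into another task.
bool try_steal(TaskLocalQueue& victim, std::uintptr_t& out) {
  if (!victim.filtered || victim.entries.empty()) {
    return false;
  }
  out = victim.entries.front();
  victim.entries.erase(victim.entries.begin());
  return true;
}
```

Refusing to steal from an unfiltered queue is just one way to handle
the race; having the thief apply the predicate itself would be
another.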

This is the sort of thing I was talking about when I referred to
alternatives "being complicated".

Segregating (potentially reclaimable) humongous objects from others in
the mark stack (in order to limit the mark stack cleanup work) will
add overhead to the marking process too.

We could try to make is_stale_humongous_marked_entry less heuristic.
Perhaps a bit per region; all clear at the start of a concurrent
marking cycle. When eagerly reclaiming a humongous object, set the bit
for the start region. Filter by testing the bit. (Actual
implementation might be a byte, to avoid additional bit manipulation
overhead.) This could be done independently of any changes related to
the mark stacks.  The answer *should* be the same either way, but a
more direct mechanism might better survive future changes.
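
To make the per-region flag concrete, here is a small sketch. It is
not HotSpot code: EagerReclaimTable, its method names, and the
base/shift address arithmetic are assumptions for illustration, and it
ignores concurrency and what happens if a reclaimed region is reused
within the same marking cycle.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical side table: one byte per region (a byte rather than a
// bit, to avoid bit manipulation on the hot path).
class EagerReclaimTable {
  std::vector<std::uint8_t> _reclaimed;
  std::uintptr_t _heap_base;
  std::size_t _region_shift;  // log2(region size in bytes), assumed

  std::size_t region_index(std::uintptr_t addr) const {
    assert(addr >= _heap_base);
    std::size_t idx = (addr - _heap_base) >> _region_shift;
    assert(idx < _reclaimed.size());
    return idx;
  }

public:
  EagerReclaimTable(std::uintptr_t heap_base, std::size_t num_regions,
                    std::size_t region_shift)
    : _reclaimed(num_regions, 0), _heap_base(heap_base),
      _region_shift(region_shift) {}

  // All clear at the start of a concurrent marking cycle.
  void clear_all() { std::fill(_reclaimed.begin(), _reclaimed.end(), 0); }

  // Called when a humongous object is eagerly reclaimed; the flag is
  // set for the object's start region only.
  void note_eager_reclaim(std::uintptr_t humongous_start) {
    _reclaimed[region_index(humongous_start)] = 1;
  }

  // The filtering predicate: a mark stack entry is stale if it points
  // into a start region that was eagerly reclaimed during this cycle.
  bool is_stale_humongous_marked_entry(std::uintptr_t entry) const {
    return _reclaimed[region_index(entry)] != 0;
  }
};
```

The predicate then reduces to one index computation and one byte
load, and it records what actually happened instead of inferring it
from the state of the entry.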