RFR(S): G1: assert(_finger == _heap_end) failed, concurrentMark.cpp:809
John Cuthbertson
john.cuthbertson at oracle.com
Wed Mar 13 00:10:25 UTC 2013
Hi Everyone,
Can I have a couple of volunteers look over these changes? The webrev
can be found at: http://cr.openjdk.java.net/~johnc/8009940/webrev.0/
These changes were, at Bengt's request, split out from the original
webrev for 8009536. In the webrev they have been applied on top of
8009536 but they are independent. The issue has been in the G1 code for
a while (since I added the reference processing code a couple of years
ago) - we just never hit it.
Anyway the assertion is caused by calling set_phase() - which throws the
assertion failure - after resetting the marking state - which resets the
value of the global finger.
The solution was a three pronged approach:
1. Skip reference processing if the marking stack overflowed during the
actual remark task.
This is safe to do as reference objects are only discovered once. So
once a reference object is discovered it will remain on the discovered
list until it is processed. Since, as a result of the overflow, marking
is going to restart - the references will be processed after a
successful remark. We could use the routine abandon_partial_discovery()
and depopulate the discovered lists but the only benefit to doing this
is that we may discover less references during the subsequent restart.
2. Remove the set_phase() calls when we execute a new ref processing
proxy task
To do this we need to call set_phase() from within weakRefsWork() and
call the terminator's reset_for_reuse() routine when we execute a new
ref processing task. This will remove the path that checks the assert.
3. Worker 0 only resets the marking state during the current phase of
marking.
During the STW remark pause there is another call to reset the marking
stack (in the event of overflow) immediately after the discovered
references are processed:
double start = os::elapsedTime();
checkpointRootsFinalWork();
double mark_work_end = os::elapsedTime();
weakRefsWork(clear_all_soft_refs);
if (has_overflown()) {
// Oops. We overflowed. Restart concurrent marking.
_restart_for_overflow = true;
// Clear the marking state because we will be restarting
// marking due to overflowing the global mark stack.
reset_marking_state();
if (G1TraceMarkStackOverflow) {
gclog_or_tty->print_cr("\nRemark led to restart for overflow.");
}
I also fixed the British spelling of initialized in a few places.
Testing:
The lucene tests with parallel reference processing (with and without
verification).
specjvm98 with parallel reference processing (with and without
verification).
Thanks,
JohnC
More information about the hotspot-gc-dev
mailing list