RFR(S): G1: assert(_finger == _heap_end) failed, concurrentMark.cpp:809

Jon Masamitsu jon.masamitsu at oracle.com
Wed Mar 13 18:28:04 UTC 2013


John,

No comments affecting the correctness of the
change so looks good.


979   // If we're during the concurrent phase of marking, reset the marking

during -> doing ?

 From concurrentMark.hpp

   // It should be called to indicate which phase we're in (concurrent
   // mark or remark) and how many threads are currently active.
   void set_phase(uint active_tasks, bool concurrent);

Is there a better name that we can give to set_phase() so
that we know if should only be called once per phase (if that
is in fact true?)?  Maybe set_phase_on_entry() or
initialize_phase() or set_phase_start().

Also set_phase() seems to have much to do with the
concurrency of the phase so maybe  set_concurrency(),
set_concurrency_on_entr(), initialize_concurrency(),
set_concurrency_start().



2403   // Not strictly necessary but...
2404   //
2405   // We need to reset the number of workers in the parallel
2406   // task terminator, before each proxy task execution, so
2407   // that the termination protocol in CMTask::do_marking_step()
2408   // knows how many workers to wait for.
2409   _cm->terminator()->reset_for_reuse(_active_workers);

If you're reusing a ParallelTaskTerminator, I think it really is best
to not assume that it can be reused.


On 3/12/2013 5:10 PM, John Cuthbertson wrote:
> Hi Everyone,
>
> Can I have a couple of volunteers look over these changes? The webrev 
> can be found at: http://cr.openjdk.java.net/~johnc/8009940/webrev.0/
>
> These changes were, at Bengt's request, split out from the original 
> webrev for 8009536. In the webrev they have been applied on top of 
> 8009536 but they are independent. The issue has been in the G1 code 
> for a while (since I added the reference processing code a couple of 
> years ago) - we just never hit it.
>
> Anyway the assertion is caused by calling set_phase() - which throws 
> the assertion failure - after resetting the marking state - which 
> resets the value of the global finger.

Is set_phase() actually safe to call other than the assertion that fires
if the global finger is reset?

Jon


>
> The solution was a three pronged approach:
>
> 1. Skip reference processing if the marking stack overflowed during 
> the actual remark task.
>
> This is safe to do as reference objects are only discovered once. So 
> once a reference object is discovered it will remain on the discovered 
> list until it is processed. Since, as a result of the overflow, 
> marking is going to restart - the references will be processed after a 
> successful remark. We could use the routine 
> abandon_partial_discovery() and depopulate the discovered lists but 
> the only benefit to doing this is that we may discover less references 
> during the subsequent restart.
>
> 2. Remove the set_phase() calls when we execute a new ref processing 
> proxy task
>
> To do this we need to call set_phase() from within weakRefsWork() and 
> call the terminator's reset_for_reuse() routine when we execute a new 
> ref processing task. This will remove the path that checks the assert.
>
> 3. Worker 0 only resets the marking state during the current phase of 
> marking.
>
> During the STW remark pause there is another call to reset the marking 
> stack (in the event of overflow) immediately after the discovered 
> references are processed:
>
>   double start = os::elapsedTime();
>
>   checkpointRootsFinalWork();
>
>   double mark_work_end = os::elapsedTime();
>
>   weakRefsWork(clear_all_soft_refs);
>
>   if (has_overflown()) {
>     // Oops. We overflowed. Restart concurrent marking.
>     _restart_for_overflow = true;
>     // Clear the marking state because we will be restarting
>     // marking due to overflowing the global mark stack.
>     reset_marking_state();
>     if (G1TraceMarkStackOverflow) {
>       gclog_or_tty->print_cr("\nRemark led to restart for overflow.");
>     }
>
> I also fixed the British spelling of initialized in a few places.
>
> Testing:
> The lucene tests with parallel reference processing (with and without 
> verification).
> specjvm98 with parallel reference processing (with and without 
> verification).
>
> Thanks,
>
> JohnC




More information about the hotspot-gc-dev mailing list