RFR(S): G1: assert(_finger == _heap_end) failed, concurrentMark.cpp:809

Thu Mar 14 17:23:05 UTC 2013

Hi Everyone,

The new version of the webrev can be found at:

http://cr.openjdk.java.net/~johnc/8009940/webrev.1/

and Jon's comments in isolation can be found at:

http://cr.openjdk.java.net/~johnc/8009940/webrev.1.jons-review-comments/

Thanks,

JohnC

On 3/13/2013 1:21 PM, Jon Masamitsu wrote:
>
> On 3/13/2013 12:18 PM, John Cuthbertson wrote:
>> Hi Jon,
>>
>> Thank you for looking at the code changes.
>>
>> On 3/13/2013 11:28 AM, Jon Masamitsu wrote:
>>> John,
>>>
>>> No comments affecting the correctness of the
>>> change so looks good.
>>>
>>>
>>> 979   // If we're during the concurrent phase of marking, reset the 
>>> marking
>>>
>>> during -> doing ?
>>
>> OK. Whatever communicates that the overflow is seen while the 
>> concurrent phase is active.
>
> I would then use
>
> "If we're executing the concurrent phase ..."
>
>>
>>>
>>> From concurrentMark.hpp
>>>
>>>   // It should be called to indicate which phase we're in (concurrent
>>>   // mark or remark) and how many threads are currently active.
>>>   void set_phase(uint active_tasks, bool concurrent);
>>>
>>> Is there a better name that we can give to set_phase() so
>>> that we know if should only be called once per phase (if that
>>> is in fact true?)?  Maybe set_phase_on_entry() or
>>> initialize_phase() or set_phase_start().
>>>
>>> Also set_phase() seems to have much to do with the
>>> concurrency of the phase so maybe  set_concurrency(),
>>> set_concurrency_on_entr(), initialize_concurrency(),
>>> set_concurrency_start().
>>
>> Actually I like set_concurrency_and_phase(...).
>
> OK.
>
>>
>>> 2403   // Not strictly necessary but...
>>> 2404   //
>>> 2405   // We need to reset the number of workers in the parallel
>>> 2406   // task terminator, before each proxy task execution, so
>>> 2407   // that the termination protocol in CMTask::do_marking_step()
>>> 2408   // knows how many workers to wait for.
>>> 2409 _cm->terminator()->reset_for_reuse(_active_workers);
>>>
>>> If you're reusing a ParallelTaskTerminator, I think it really is best
>>> to not assume that it can be reused.
>>
>> OK. Kind of prompts the change I'll suggest below.
>>
>>>
>>> On 3/12/2013 5:10 PM, John Cuthbertson wrote:
>>>> Hi Everyone,
>>>>
>>>> Can I have a couple of volunteers look over these changes? The 
>>>> webrev can be found at: 
>>>> http://cr.openjdk.java.net/~johnc/8009940/webrev.0/
>>>>
>>>> These changes were, at Bengt's request, split out from the original 
>>>> webrev for 8009536. In the webrev they have been applied on top of 
>>>> 8009536 but they are independent. The issue has been in the G1 code 
>>>> for a while (since I added the reference processing code a couple 
>>>> of years ago) - we just never hit it.
>>>>
>>>> Anyway the assertion is caused by calling set_phase() - which 
>>>> throws the assertion failure - after resetting the marking state - 
>>>> which resets the value of the global finger.
>>>
>>> Is set_phase() actually safe to call other than the assertion that 
>>> fires
>>> if the global finger is reset?
>>
>> set_phase() was actually safe to call (other than the assert). Even 
>> resetting the global finger is, I believe, safe. Doing so means that 
>> we would mark object but would no longer push them - as they are no 
>> longer below the finger. Objects are pushed so that they can be 
>> traced. Since marking is going to be restarted, the marked objects 
>> would be traced in the normal manner when marking reaches them.
>>
>> Since it's not safe to reuse a parallel task terminator (I didn't 
>> know that) my suggestion is:
>
> Here's the more complete thought.  If you're going to reuse a parallel 
> task terminator and
> it has been used before, then reset_for_reuse() needs to be called on 
> it.  If it has not
> been used before, then it's safe to use without a call to 
> reset_for_reuse().  If in doubt
> or the use of the parallel task terminator could possibly change,
> call reset_for_reuse().
>
>>
>> * break out the concurrency-related code from set_phase() into it's 
>> own routine.
>> * call that routine within set_phase (now renamed 
>> set_concurrency_and_phase)
>> * call the set_concurrency routine within the parallel reference 
>> processing.
>>
>> That should also address Bengt's concern about the inconsistency 
>> between the parallel task terminator and the WorkGangBarrierSyncs.
>>
>> Sound reasonable?
>
> I like it.  Thanks;
>
> Jon
>
>>
>> Thanks!
>>
>> JohnC
>>
>