RFR(S): 8004816: G1: Kitchensink failures after marking stack changes
John Cuthbertson
john.cuthbertson at oracle.com
Tue Dec 18 02:06:29 UTC 2012
Hi Everyone,
Here's a new webrev based upon the comments from Jon:
http://cr.openjdk.java.net/~johnc/8004816/webrev.1/
Thanks,
JohnC
On 12/14/12 06:51, Jon Masamitsu wrote:
> John,
>
> Your choice of reset_marking_state() below is fine.
>
> Jon
>
> On 12/13/2012 3:27 PM, John Cuthbertson wrote:
>> Hi Jon,
>>
>> Thanks. I agree a name change would OK. Any particular preference?
>>
>> Currently clear_marking_state() is called on the following paths....
>>
>> The initial mark path...
>>
>> CheckpointRootsInitialPre -> reset() -> clear_marking_state().
>>
>> ConcurrentMark initialization path...set_non_marking_state clears
>> that marking is in progress.
>>
>> ConcurrentMark() -> set_non_marking_state() -> clear_marking_state().
>>
>> Marking is finished path (from remark)....
>>
>> CheckpointRootsFinal -> set_non_marking_state() -> clear_marking_state()
>>
>> Overflow in remark path (1)....
>>
>> CheckpointRootsFinal -> clear_marking_state()
>>
>> Overflow from CMTask::do_marking_step() (overflow during actual
>> marking or remark) by worker 0 only....
>>
>> do_marking_step() -> enter_first_barrier_sync() ->
>> clear_marking_state().
>>
>> I think that's all of them. Given these, my preference would be
>> reset_marking_state().
>>
>> Thanks again for the review.
>>
>> JohnC
>>
>> On 12/13/12 15:12, Jon Masamitsu wrote:
>>> John,
>>>
>>> Your fix looks good.
>>>
>>> A small suggestion. Instead of clear_marking_state() how
>>> about reinitialize_marking_state() or reset_marking_state()?
>>> When I went to look at what "clear_marking_state"
>>> did, I expected it to simply set a variable but it does
>>> more. I thought that "reinitialize" or "reset" would
>>> warn the reader that more was happening.
>>> If this is in keeping with G1 naming
>>> convention, it's fine to leave it.
>>>
>>> Jon
>>>
>>>
>>>
>>> On 12/13/12 09:45, John Cuthbertson wrote:
>>>> Hi Everyone,
>>>>
>>>> Can I have a couple of volunteers review the fix for this CR? The
>>>> webrev can be found at:
>>>> http://cr.openjdk.java.net/~johnc/8004816/webrev.0/
>>>>
>>>> Summary:
>>>> The reduced mark stack size that came in as part of the changes for
>>>> 8000244 exposed this issue. If the marking stack overflowed as a
>>>> result of pushes from the serial reference processing closures, the
>>>> marking stack overflow flag was not cleared. When marking
>>>> eventually completed, after the subsequent restarting, the still
>>>> set overflow flag was detected and resulted in a guarantee failure.
>>>> I also discovered that the actual marking state was not correctly
>>>> cleared for restarting for such an overflow and, as a result, some
>>>> referent objects might have been skipped by marking (those that are
>>>> reachable from the objects we failed to push as a result of the
>>>> overflow).
>>>>
>>>> Normally this is not a problem as the serial reference processing
>>>> code is executed infrequently - except on systems with one or two
>>>> cpus. The parallel reference processing closures reset the marking
>>>> state correctly in the event of an overflow in the global marking
>>>> stack.
>>>>
>>>> Testing:
>>>> Kitchensink on 1 and 2 cpu systems with a very low mark stack size
>>>> and marking verification.
>>>>
>>>> Thanks,
>>>>
>>>> JohnC
>>
More information about the hotspot-gc-dev
mailing list