RFR(S): 8004816: G1: Kitchensink failures after marking stack changes
John Cuthbertson
john.cuthbertson at oracle.com
Thu Dec 13 23:27:44 UTC 2012
Hi Jon,
Thanks. I agree a name change would OK. Any particular preference?
Currently clear_marking_state() is called on the following paths....
The initial mark path...
CheckpointRootsInitialPre -> reset() -> clear_marking_state().
ConcurrentMark initialization path...set_non_marking_state clears that
marking is in progress.
ConcurrentMark() -> set_non_marking_state() -> clear_marking_state().
Marking is finished path (from remark)....
CheckpointRootsFinal -> set_non_marking_state() -> clear_marking_state()
Overflow in remark path (1)....
CheckpointRootsFinal -> clear_marking_state()
Overflow from CMTask::do_marking_step() (overflow during actual marking
or remark) by worker 0 only....
do_marking_step() -> enter_first_barrier_sync() -> clear_marking_state().
I think that's all of them. Given these, my preference would be
reset_marking_state().
Thanks again for the review.
JohnC
On 12/13/12 15:12, Jon Masamitsu wrote:
> John,
>
> Your fix looks good.
>
> A small suggestion. Instead of clear_marking_state() how
> about reinitialize_marking_state() or reset_marking_state()?
> When I went to look at what "clear_marking_state"
> did, I expected it to simply set a variable but it does
> more. I thought that "reinitialize" or "reset" would
> warn the reader that more was happening.
> If this is in keeping with G1 naming
> convention, it's fine to leave it.
>
> Jon
>
>
>
> On 12/13/12 09:45, John Cuthbertson wrote:
>> Hi Everyone,
>>
>> Can I have a couple of volunteers review the fix for this CR? The
>> webrev can be found at:
>> http://cr.openjdk.java.net/~johnc/8004816/webrev.0/
>>
>> Summary:
>> The reduced mark stack size that came in as part of the changes for
>> 8000244 exposed this issue. If the marking stack overflowed as a
>> result of pushes from the serial reference processing closures, the
>> marking stack overflow flag was not cleared. When marking eventually
>> completed, after the subsequent restarting, the still set overflow
>> flag was detected and resulted in a guarantee failure. I also
>> discovered that the actual marking state was not correctly cleared
>> for restarting for such an overflow and, as a result, some referent
>> objects might have been skipped by marking (those that are reachable
>> from the objects we failed to push as a result of the overflow).
>>
>> Normally this is not a problem as the serial reference processing
>> code is executed infrequently - except on systems with one or two
>> cpus. The parallel reference processing closures reset the marking
>> state correctly in the event of an overflow in the global marking stack.
>>
>> Testing:
>> Kitchensink on 1 and 2 cpu systems with a very low mark stack size
>> and marking verification.
>>
>> Thanks,
>>
>> JohnC
More information about the hotspot-gc-dev
mailing list