RFR (9) 8185133: Reference pending list root might not get marked

Mikael Gerdin mikael.gerdin at oracle.com
Mon Jul 31 05:40:19 UTC 2017


Hi Kim,

On 2017-07-28 21:53, Kim Barrett wrote:
>> On Jul 28, 2017, at 1:20 PM, Erik Osterlund <erik.osterlund at oracle.com> wrote:
>>
>> Hi Roman,
>>
>>> On 28 Jul 2017, at 16:53, Roman Kennke <rkennke at redhat.com> wrote:
>>>
>>> Hi Mikael,
>>>
>>> I don't really understand what the problem is. The WR ends up on the
>>> RPL, with its referent cleared, i.e. no longer pointing to the SR? But
>>> we want to keep the SR alive?
>>
>> No. The WR gets promoted to old during the initial mark evacuation as it was strongly reachable from an SR in young. The referent of the WR died, and therefore it gets discovered. The assumption is then that since it was strongly reachable from the SR in young, the WR will be found during concurrent marking due to SATB. This is normally a safe assumption.
>>
>> However, just before the initial mark pause finishes and concurrent marking starts tracing through the heap, soft references may change strength and suddenly become weak. Therefore, the WR in old never gets marked during concurrent marking unless the GC is made aware of the existence of this new strong edge to the pending list head.
>>
>> This is a problem, because the pending list was in this scenario exposed to Java land through the pending list head, without the concurrent marking knowing about it, violating GC completeness.
> 
> I think SR also needs to be promoted by the initial-mark pause.  If SR
> is young and not promoted, then it will be a survivor of the
> initial-mark pause, and so will be scanned by scan_root_regions.
> scan_root_regions doesn't do reference processing, so the scan of the
> survivor SR will mark WR.
> 
> Here's my understanding of the problem scenario:
> 
> (1) initial state
> 
> SR => WR => O
> SR, WR, and O are young
> WR and O are unreachable except through the chain from SR
> SR has not expired
> 
> (2) initial_mark
> 
> SR and WR are both promoted to oldgen.
> SR is not discovered, because it has not expired.
> WR is discovered and enqueued, because O is unreachable.
> WR ends up at the head of the pending list.  This happens after the
> initial root scan has examined the head of the pending list.
> 
> (3) SR expires
> 
> We now have an oldgen WR in the pending list, and no certain path by
> which concurrent marking will reach it, even though it is accessible.
> (The Java reference processing thread might process and discard it
> before any damage is actually done, but that's far from certain.)
> 
> So it requires a fairly unlikely sequence of events.
> 
> Note: If WR ends up anywhere other than at the head of the pending
> list, it will eventually be visited, either by scan_root_region or
> normal concurrent marking, depending on its predecessor in the list.
> (Assuming its predecessor is not another similar case that *did* end
> up at the head of the list.)

Thanks for this detailed explanation.
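
The scenario above can be modelled with a toy sketch. This is not
HotSpot code: the class, the method names and the two flags are
hypothetical, and the object graph is reduced to SR and WR only (O is
already dead). It assumes marking follows a soft referent only while
the SR has not expired, and that the pending list head is either
treated as a marking root or not.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model of the scenario; all names here are illustrative only.
class PendingListRootSketch {
    static final class Obj {
        final String name;
        Obj softReferent; // traced only while the soft reference has not expired
        Obj(String name) { this.name = name; }
    }

    // Plain transitive marking from the given roots.
    static Set<String> mark(List<Obj> roots, boolean softIsStrong) {
        Set<String> marked = new HashSet<>();
        Deque<Obj> stack = new ArrayDeque<>(roots);
        while (!stack.isEmpty()) {
            Obj o = stack.pop();
            if (!marked.add(o.name)) {
                continue; // already visited
            }
            if (softIsStrong && o.softReferent != null) {
                stack.push(o.softReferent);
            }
        }
        return marked;
    }

    // State after initial mark: SR and WR are both in old, and WR was
    // enqueued at the pending list head after the root scan looked at it.
    static Set<String> markAfterInitialMark(boolean srExpired,
                                            boolean pendingHeadIsRoot) {
        Obj sr = new Obj("SR");
        Obj wr = new Obj("WR");
        sr.softReferent = wr;
        Obj pendingListHead = wr;

        List<Obj> roots = new ArrayList<>();
        roots.add(sr); // SR itself is strongly reachable
        if (pendingHeadIsRoot) {
            roots.add(pendingListHead);
        }
        return mark(roots, !srExpired);
    }

    public static void main(String[] args) {
        System.out.println("SR live, head not a root:    "
                + markAfterInitialMark(false, false));
        System.out.println("SR expired, head not a root: "
                + markAfterInitialMark(true, false));
        System.out.println("SR expired, head is a root:  "
                + markAfterInitialMark(true, true));
    }
}
```

In this model WR is only missed in the middle case, which matches step
(3): the SR has expired and nothing else tells marking about the
pending list head.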
/Mikael
