RFR: 8373203: Genshen: Non-strong reference leak in old gen [v8]

Y. Srinivas Ramakrishna ysr at openjdk.org
Sat Jan 10 01:00:22 UTC 2026


On Fri, 9 Jan 2026 17:52:35 GMT, William Kemper <wkemper at openjdk.org> wrote:

>> The generational mode for Shenandoah will collect _referents_ for the generation being collected. For example, if we have a young reference pointing to an old referent, that young reference will be processed after we finish marking the old generation. This presents a problem for discovery.
>>   
>> When the young mark _encounters_ a young reference with an old referent, it cannot _discover_ it because old marking hasn't finished. However, if it does not discover it, the old referent will be strongly marked. This, in turn, will prevent the old generation from clearing the referent (if it even reaches it again during old marking).
>>   
>> To solve this, we let young reference processing discover the old reference by having it use the old generation reference processor to do so. This means the old reference processor can have a discovered list that contains young weak references. If any of these young references reside in a region that is collected, old reference processing will crash when it processes such a reference. Therefore, we add a method `heal_discovered_lists` to traverse the discovered lists after young evacuation is complete. The method will replace any forwarded entries in the discovered list with the forwardee.
>> 
>> This PR also extends whitebox testing support for Shenandoah, giving us the ability to trigger young/old collections and interrogate some properties of heaps and regions.
>
> William Kemper has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 23 commits:
> 
>  - Merge remote-tracking branch 'jdk/master' into fix-old-reference-processing
>  - Merge remote-tracking branch 'jdk/master' into fix-old-reference-processing
>  - Heal discovered lists for any young collection coincides with old marking
>  - Configure thread local mark closure on delegated old reference processor
>  - Merge remote-tracking branch 'jdk/master' into fix-old-reference-processing
>  - Fix idiosyncratic white space in whitebox
>    
>    Co-authored-by: Stefan Karlsson <stefan.karlsson at oracle.com>
>  - Sort includes
>  - Heal old discovered lists in parallel
>  - Fix comment
>  - Factor duplicate code into shared method
>  - ... and 13 more: https://git.openjdk.org/jdk/compare/f5fa9e40...abccb8b6

> When the young mark encounters a young reference with an old referent, it cannot discover it because old marking hasn't finished. However, if it does not discover it, the old referent will be strongly marked. This, in turn, will prevent the old generation from clearing the referent (if it even reaches it again during old marking).

Naive question: The basic issue appears to be that the marking state of the referent may not be visible to the gc that is processing the reference when they are in different generations. If we wait for them to both be in the same generation, the same marking will both discover the reference and know the reachability of the referent whence it can be collected. Why can't we just wait for the reference and referent to both be tenured into the old generation before they are processed? I realize this delays processing until such time that we have either a global marking, a full collection, or the reference and referent both end up in the old generation. 

It is possible I misunderstood the original problem. 

Can you explain what is causing the leak here? That is, what prevents us from eventually discovering and processing the reference once both it and its referent are in the old generation? Why does `should_discover()` on the reference in the old generation return false when its referent is also in the same generation?

Many years ago the concept of this visibility across generations was handled by means of a "span" (of marking visibility, if you will) that the reference processor carried, so that it would leave alone and not discover references whose referent was in a different generation. The same issue existed in CMS, where it was dealt with by allowing the reference and referent to both migrate into the same (old) generation, at which point reference processing would deal with it because it had full reachability visibility from that point on. Here we seem to be considering a leak where we are left in a state where we never discover the reference even after it and its referent are both in the old generation. How does that happen?
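
To make the "span" idea concrete, here is a toy sketch of the policy as I describe it above. It is not the actual HotSpot code: `Span`, `Ref`, `Action` and `on_encounter` are hypothetical names invented for illustration, and addresses are modelled as plain integers. The only point is that a marking discovers a reference when both the reference and its referent fall inside the range it can see, and otherwise leaves it alone and treats the referent as strong.

```cpp
#include <cstdint>

// Toy model only: every name here is hypothetical, not HotSpot's API.

// The address range whose reachability the current marking can see.
struct Span {
  std::uintptr_t start;
  std::uintptr_t end;  // exclusive
  bool contains(std::uintptr_t addr) const {
    return addr >= start && addr < end;
  }
};

// A java.lang.ref.Reference-like object and the object it refers to.
struct Ref {
  std::uintptr_t self;      // address of the reference object
  std::uintptr_t referent;  // address of its referent
};

enum class Action { Discover, TreatAsStrong };

// Discover only when this marking sees both ends of the edge;
// otherwise the referent is simply kept alive for this cycle.
Action on_encounter(const Span& marking_span, const Ref& r) {
  const bool visible =
      marking_span.contains(r.self) && marking_span.contains(r.referent);
  return visible ? Action::Discover : Action::TreatAsStrong;
}
```

Under this model a young reference with an old referent is treated as strong by a young-only or old-only marking, and becomes discoverable once the reference is promoted into the old span or a global marking covers both. That is the eventual processing I would have expected, hence the question.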

-------------

PR Comment: https://git.openjdk.org/jdk/pull/28810#issuecomment-3731179370


More information about the hotspot-dev mailing list