Weak Ref Processing

Mon Apr 1 15:34:34 UTC 2019

>> I'm preparing for a conference session about Shenandoah (and ZGC), and
>> I would like to confirm how Shenandoah does Weak (et al.) Reference
>> Processing.
>>
>> As far as I understand, weak reference processing is still a STW
>> operation during the final mark phase, so GC parallel but not GC
>> concurrent. Is that correct?
> 
> Depends on what "processing" means. The performance model for weakrefs is a bit complicated.
> 
> Discovery (figuring out which references exist in the heap) is done concurrently during
> concurrent mark. Precleaning (purging the references whose referents are definitely alive) is done
> concurrently after mark. Processing (figuring out what to do with references that have dead
> referents, for example, clearing them or marking through the reachable referent) is done in
> the final-mark STW pause. Enqueuing (putting references on the associated ReferenceQueue, if any) is
> done in the final-mark STW pause as well.
> 
> So, the pause time depends on the weak reference _churn_. For references+referents that stay alive
> there is little to no overhead (because precleaning takes care of them). For references that die
> along with their referents, the cost is zero (because discovery never finds them). For references
> whose referents are always cleared (i.e. everything except finalizable), the processing cost does
> not involve marking through the suddenly reachable subgraph, so the cost is also low, but not as
> low as you might want, hence the desire to have concurrent reference processing.
> 
> Reference processing is performed by multiple parallel workers. In a pause, their number is guided
> by ParallelGCThreads. In concurrent mode, it is guided by ConcGCThreads, and the work mostly
> piggybacks on the actual GC threads that do the marking.
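
To make the four phases quoted above a bit more concrete, here is a toy
model of them in plain Java. It is only a sketch: the class, the Ref type
and all method names are hypothetical and have nothing to do with the
actual HotSpot/Shenandoah code, and reachability is simply given as
booleans instead of being computed by a real marker. It shows why only
the "churn" (live reference, dead referent) ends up as pause work.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: every type and method here is hypothetical, not HotSpot/Shenandoah code.
final class RefPhasesSketch {
    static final class Ref {
        final boolean refReachable;      // is the Reference object itself strongly reachable?
        final boolean referentReachable; // is its referent strongly reachable?
        final boolean hasQueue;          // was it registered with a ReferenceQueue?
        boolean cleared;                 // set once processing clears the referent

        Ref(boolean refReachable, boolean referentReachable, boolean hasQueue) {
            this.refReachable = refReachable;
            this.referentReachable = referentReachable;
            this.hasQueue = hasQueue;
        }
    }

    // Concurrent: marking only ever visits reachable Reference objects,
    // so references that died along with their referents are never discovered.
    static List<Ref> discover(List<Ref> heap) {
        List<Ref> discovered = new ArrayList<>();
        for (Ref r : heap) {
            if (r.refReachable) discovered.add(r);
        }
        return discovered;
    }

    // Concurrent: drop references whose referent turned out to be alive.
    static List<Ref> preclean(List<Ref> discovered) {
        List<Ref> remaining = new ArrayList<>();
        for (Ref r : discovered) {
            if (!r.referentReachable) remaining.add(r);
        }
        return remaining;
    }

    // In the pause: clear what is left and collect what has to be enqueued.
    static List<Ref> processAndEnqueue(List<Ref> remaining) {
        List<Ref> enqueued = new ArrayList<>();
        for (Ref r : remaining) {
            r.cleared = true;
            if (r.hasQueue) enqueued.add(r);
        }
        return enqueued;
    }

    public static void main(String[] args) {
        List<Ref> heap = Arrays.asList(
            new Ref(true, true, true),    // reference and referent both alive
            new Ref(false, false, true),  // reference died along with its referent
            new Ref(true, false, true),   // churn: live reference, dead referent, has queue
            new Ref(true, false, false)); // churn: live reference, dead referent, no queue
        List<Ref> discovered = discover(heap);
        List<Ref> remaining = preclean(discovered);
        List<Ref> enqueued = processAndEnqueue(remaining);
        System.out.println("discovered=" + discovered.size()
            + " pauseWork=" + remaining.size()
            + " enqueued=" + enqueued.size());
    }
}
```

In this model, of the four references only the two churned ones reach
the pause, and only one of those is enqueued.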

In summary, for most well-behaved applications, the extra cost of 
ref-processing is only a few ms.

To make it a little more precise: the only way we can hit 
until-then-undiscovered reachable subgraphs is via finalizers. If your 
application doesn't make funny use of them, you should be good. However, 
this new 'finalizable' subgraph is theoretically unbounded, and marking 
through those newly discovered subgraphs behind finalizers can, 
theoretically, involve marking through a significant part of the heap 
and cost time. In practice, we've never seen that happen (most 
applications nowadays seem to be well-behaved ;-) )
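
As a rough illustration of why that cost is unbounded, here is another
toy sketch in Java. Again, Node, markThrough and chain are made-up names
for this mail, not anything from the real collector. It just counts how
many not-yet-marked objects have to be marked when a finalizable object
suddenly keeps its subgraph alive; that count stands in for the extra
work done in the pause.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Toy model only: hypothetical types, not the actual Shenandoah marking code.
final class FinalizableMarkSketch {
    static final class Node {
        boolean marked;                               // already marked by concurrent mark?
        final List<Node> fields = new ArrayList<>();  // outgoing references
    }

    // Marks everything reachable from root that is not marked yet and
    // returns how many objects were newly marked (the extra pause work).
    static int markThrough(Node root) {
        int newlyMarked = 0;
        Deque<Node> stack = new ArrayDeque<>();       // explicit stack, no deep recursion
        stack.push(root);
        while (!stack.isEmpty()) {
            Node n = stack.pop();
            if (n.marked) continue;                   // already live, nothing more to do here
            n.marked = true;
            newlyMarked++;
            for (Node f : n.fields) stack.push(f);
        }
        return newlyMarked;
    }

    // Builds a singly linked chain of unmarked nodes and returns its head.
    static Node chain(int length) {
        Node head = new Node();
        Node cur = head;
        for (int i = 1; i < length; i++) {
            Node next = new Node();
            cur.fields.add(next);
            cur = next;
        }
        return head;
    }

    public static void main(String[] args) {
        // A small finalizable subgraph is cheap, a large one is not.
        System.out.println("small: " + markThrough(chain(3)));
        System.out.println("large: " + markThrough(chain(100_000)));
    }
}
```

The work grows with the size of whatever hangs off the finalizable
object, and it stops early wherever concurrent mark already got there.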

>> Is there any plan to make it concurrent, if not already?
> 
> Roman had a prototype, I would let him talk about it.

Yeah, we have a plan, and I have some code that already implements part 
of it, but no ETA when it will be ready. I keep getting side-tracked 
with $STUFF. ;-)

Roman