Long pause for processing weakrefs in DaCapo benchmark

Mon May 18 06:58:02 UTC 2020

Hi,

On 5/18/20 3:16 AM, raell at web.de wrote:
> Is there a parameter that can be set, or something else that can be done, to reduce the
> maximal pause time significantly?

Class unloading should not matter here. Weak reference processing is STW in Shenandoah for
implementation reasons (as in Parallel, G1, etc.). Indeed, if you run the workload with
-Xlog:ref*=debug, the log points to lots of finalizers:

GC(20) Reference Processing: 50.2ms
GC(20)   Reconsider SoftReferences: 0.0ms
GC(20)     SoftRef (ms):             skipped
GC(20)   Notify Soft/WeakReferences: 3.6ms
GC(20)     Balance queues: 2.1ms
GC(20)     SoftRef (ms):     Min:  0.0, Avg:  0.0, Max:  0.0, Diff:  0.0, Sum:  0.0, Workers: 3
GC(20)     WeakRef (ms):     Min:  0.0, Avg:  0.0, Max:  0.0, Diff:  0.0, Sum:  0.0, Workers: 3
GC(20)     FinalRef (ms):    Min:  0.9, Avg:  1.0, Max:  1.3, Diff:  0.4, Sum:  3.0, Workers: 3
GC(20)     Total (ms):       Min:  1.3, Avg:  1.3, Max:  1.3, Diff:  0.0, Sum:  3.9, Workers: 3
GC(20)   Notify and keep alive finalizable: 46.5ms
GC(20)     Balance queues: 0.0ms
GC(20)     FinalRef (ms):    Min: 46.4, Avg: 46.4, Max: 46.4, Diff:  0.0, Sum: 139.2, Workers: 3
GC(20)   Notify PhantomReferences: 0.1ms
GC(20)     PhantomRef (ms):  Min:  0.0, Avg:  0.0, Max:  0.0, Diff:  0.0, Sum:  0.0, Workers: 1
GC(20)   SoftReference:
GC(20)     Discovered: 0
GC(20)     Cleared: 0
GC(20)   WeakReference:
GC(20)     Discovered: 0
GC(20)     Cleared: 0
GC(20)   FinalReference:
GC(20)     Discovered: 20807
GC(20)     Cleared: 0
GC(20)   PhantomReference:
GC(20)     Discovered: 32
GC(20)     Cleared: 6

Finalizers are a very unfortunate class of weak references: they appear dead during marking,
and then reference processing has to make them, and all their descendants, alive again -- basically
marking through the new parts of the object graph. That takes a while, as we see here: almost all
of the 50.2 ms pause is the 46.5 ms "Notify and keep alive finalizable" phase. Finalizers are,
thankfully, quite rare in current applications, but sometimes they show up in special workloads
like this one. Our go-to answer to customers in these cases is, "Try not to use finalizers to
begin with".
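
If you control the code that declares the finalizers, the usual replacement is java.lang.ref.Cleaner
(available since JDK 9). It is based on phantom references, so the collector should not have to
make the object alive again before the cleanup runs. A minimal sketch follows; the Resource class
and the RELEASED counter are hypothetical stand-ins, not something taken from the benchmark:

```java
import java.lang.ref.Cleaner;
import java.util.concurrent.atomic.AtomicInteger;

public class CleanerSketch {
    // One shared Cleaner; it runs pending cleanup actions on its own daemon thread.
    private static final Cleaner CLEANER = Cleaner.create();

    // Hypothetical counter standing in for a real resource release.
    static final AtomicInteger RELEASED = new AtomicInteger();

    // Hypothetical resource holder that would otherwise override finalize().
    static final class Resource implements AutoCloseable {
        // The cleanup action must not reference the Resource itself,
        // otherwise the Resource could never become unreachable.
        private static final class State implements Runnable {
            @Override
            public void run() {
                RELEASED.incrementAndGet();
            }
        }

        private final Cleaner.Cleanable cleanable;

        Resource() {
            this.cleanable = CLEANER.register(this, new State());
        }

        @Override
        public void close() {
            // Runs the action at most once, even if close() is called repeatedly.
            cleanable.clean();
        }
    }

    public static void main(String[] args) {
        try (Resource r = new Resource()) {
            // Use the resource here.
        }
        System.out.println("released = " + RELEASED.get());
    }
}
```

Explicit close() remains the primary path here; the Cleaner only acts as a safety net for objects
that were dropped without being closed.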

Anyhow, there are two ways to mitigate the pauses:

 1. Notice the log above says "Workers: 3". That is an ergonomic decision to run the whole thing
with 3 GC threads. You can override it with -XX:ParallelGCThreads=6, for example. This would
balance the work across more threads, hopefully improving the wall clock time.

 2. Running reference processing more often would make it more incremental, leaving less work
for each pause. -XX:ShenandoahRefProcFrequency=1 should help here.
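
To check which GC thread count the JVM actually ended up with, and whether it was chosen
ergonomically or set on the command line, you can query the flag from inside the process. This is
only a sketch, assuming a HotSpot-based JDK where com.sun.management.HotSpotDiagnosticMXBean is
available; the GcThreadsCheck class and its lookup helper are made up for illustration:

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import com.sun.management.VMOption;
import java.lang.management.ManagementFactory;

public class GcThreadsCheck {
    // Looks up a HotSpot flag on the running JVM by name.
    static VMOption lookup(String name) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        return bean.getVMOption(name);
    }

    public static void main(String[] args) {
        VMOption opt = lookup("ParallelGCThreads");
        // The origin tells whether the value came from ergonomics,
        // the command line, or is simply the default.
        System.out.println(opt.getName() + " = " + opt.getValue()
                + " (origin: " + opt.getOrigin() + ")");
    }
}
```

Running it once without and once with -XX:ParallelGCThreads=6 should show the value and the
reported origin change.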

> The benchmark has been running on Debian using AdoptOpenJDK jdk-14+36.

For evaluation purposes targeting 14u, I'd run with 14u nightlies, which include all improvements
that would be released with 14.0.2 in July: https://builds.shipilev.net/openjdk-jdk14/

There are many interesting bugs/potholes fixed there. Notably, you should not see "Final Evac"
pauses anymore.

-- 
Thanks,
-Aleksey