Frequently dereferenced weak references never get cleared
Ing. Michal Frajt (Luxonit s.r.o.)
michal.frajt at luxonit.com
Thu Mar 14 21:26:57 UTC 2019

We are evaluating ZGC for financial applications and so far have seen
promising results. However, there seems to be an issue with the handling of
weak references that is affecting our custom data distribution framework.

Since 2005 we have mainly been using the CMS collector for financial
applications based on a custom data distribution framework. The framework is
based on weak references used in a similar way to smart pointers in C++.
Application code holds strong references, whereas the framework provides all
data via weak references only. The clearing of a weak reference acts as an
indication for the framework that the data is no longer required by the
application code. Data provided by the framework is always coupled with a
network resource (receiving market updates) or a CPU resource (computing
aggregations/risk/etc). The framework has a strong interest in getting the
weak reference cleared by the GC, as it can then deregister the network
subscription (less data to read, less data to parse and handle) or stop CPU
intensive computations. As long as the weak reference is not cleared
(reported via the reference queue) the framework must keep updating the
provided data, which simply requires dereferencing the weak references
regularly. Such dereferencing happens on average once per second for each
weakly referenced object.
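
To make the pattern concrete, here is a minimal sketch of it in Java. It is
not our framework code: the class and method names (WeakSubscriptionRegistry,
MarketData, subscribe, publish, drainCleared) are hypothetical and exist only
for illustration. The application keeps the strong reference returned by
subscribe(), the registry keeps only a weak one, and an entry appearing on
the reference queue is the signal to release the coupled resource.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class WeakSubscriptionRegistry {

    /** Hypothetical data object; only application code holds it strongly. */
    public static final class MarketData {
        private final String topic;
        private volatile double lastPrice;

        MarketData(String topic) { this.topic = topic; }

        public String topic() { return topic; }
        public double lastPrice() { return lastPrice; }
    }

    /** Weak reference that remembers which topic it belonged to. */
    private static final class Entry extends WeakReference<MarketData> {
        final String topic;

        Entry(MarketData data, ReferenceQueue<MarketData> queue) {
            super(data, queue);
            this.topic = data.topic();
        }
    }

    private final ReferenceQueue<MarketData> queue = new ReferenceQueue<>();
    private final Map<String, Entry> entries = new ConcurrentHashMap<>();

    /** Returns the only strong reference; the registry keeps a weak one. */
    public MarketData subscribe(String topic) {
        MarketData data = new MarketData(topic);
        entries.put(topic, new Entry(data, queue));
        return data;
    }

    /** Regular update path: this is the frequent weak dereference. */
    public void publish(String topic, double price) {
        Entry entry = entries.get(topic);
        if (entry == null) {
            return;
        }
        MarketData data = entry.get(); // strong reference lives only in this frame
        if (data != null) {
            data.lastPrice = price;
        }
    }

    /** Drops entries the GC has cleared and enqueued; returns how many. */
    public int drainCleared() {
        int released = 0;
        Reference<? extends MarketData> ref;
        while ((ref = queue.poll()) != null) {
            Entry entry = (Entry) ref;
            entries.remove(entry.topic, entry);
            // A real framework would deregister the network subscription
            // or stop the computation for entry.topic here.
            released++;
        }
        return released;
    }

    public int activeSubscriptions() {
        return entries.size();
    }
}
```

In this sketch publish() would be called about once per second per topic,
and drainCleared() periodically from a housekeeping thread.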

In order to get the weak references cleared regularly we used the
incremental CMS (iCMS), which was always running in the background and
scanning the complete heap. When iCMS was announced as deprecated we started
working on a HotSpot extension and introduced the new CMSTriggerInterval
parameter in 8u40
(https://bugs.java.com/bugdatabase/view_bug.do?bug_id=8038265) to allow
specifying a maximum time between CMS collections. This way all our weak
references (those which are not strongly referenced) are cleared within one
or two CMS cycles triggered by the CMSTriggerInterval or for another reason
(occupancy or an explicit concurrent GC invocation, for example). Usually we
configure the CMSTriggerInterval between 5 and 10 minutes.

Some years back (2012) we evaluated the Azul C4 collector and recognized an
issue with its weak reference processing. After some discussion we were
finally told by Gil Tene that C4 will never clear a weak reference if it is
dereferenced within the C4 major cycle (at that time roughly every 5
minutes). We were offered a C4 extension where a weak reference may be
dereferenced but the resulting strong reference cannot be stored into any
object, only kept at the thread/call stack level. If it is stored, even for a
short moment, it again takes a full C4 major cycle without any dereferencing
to get the weak reference cleared.
Because of this and other reasons our client finally did not purchase the
Azul C4 collector. We don't know how the C4 is addressing this issue in the

Today we were very pleased with the initial ZGC testing results. The test
scenario with CMS has a ParNew STW pause of 150ms every 10 seconds. With ZGC
this changed to only 1ms every 5 minutes (if we understand the reporting
correctly). We immediately tested the weak reference processing and
unfortunately observed the same behaviour as with the C4 collector. It seems
that weak references which are frequently dereferenced are never cleared.
Invoking an explicit concurrent GC via jcmd GC.run does not help either.
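
The scenario can be approximated with a small probe like the one below. This
is only a sketch under assumptions, not the test we ran: the class name
WeakDerefProbe and its methods are hypothetical, System.gc() stands in for
jcmd GC.run, and the result depends on the collector and its settings (for
example, -XX:+DisableExplicitGC turns System.gc() into a no-op). The probe
requests a GC, dereferences the weak reference, sleeps, and repeats until the
referent is gone or the round limit is reached.

```java
import java.lang.ref.WeakReference;

public class WeakDerefProbe {

    /** Dereferences once; the strong reference lives only in this frame. */
    static boolean touch(WeakReference<Object> ref) {
        Object strong = ref.get();
        return strong != null;
    }

    /**
     * Requests a GC and dereferences once per round. Returns the round in
     * which the reference was first seen cleared, or -1 if it was never
     * seen cleared within maxRounds (or the thread was interrupted).
     */
    public static int roundsUntilCleared(WeakReference<Object> ref,
                                         int maxRounds, long pauseMillis) {
        for (int round = 1; round <= maxRounds; round++) {
            System.gc(); // a request only; the JVM may ignore it
            if (!touch(ref)) {
                return round;
            }
            try {
                Thread.sleep(pauseMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return -1;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // No strong reference to the referent is kept anywhere.
        WeakReference<Object> ref = new WeakReference<>(new Object());
        int rounds = roundsUntilCleared(ref, 30, 1000);
        System.out.println(rounds < 0
                ? "not cleared within 30 rounds"
                : "cleared after " + rounds + " GC request(s)");
    }
}
```

Running main() once per collector under test gives a quick comparison; no
particular output is implied here.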

Could you please explain to us how ZGC handles weak references in the
described scenario? Is there a way out for us?

Your input would be much appreciated.