Frequently dereferenced weak references never get cleared

Ing. Michal Frajt (Luxonit s.r.o.) michal.frajt at luxonit.com
Fri Mar 15 09:52:23 UTC 2019


Hi Peter,

> Regardless, my knee-jerk reaction when I read your message was “why don’t
> they change their framework to remove its use of weak references?”

It was February 2002 when I found weak references to be the only close replacement for C++ reference counting. Since then we have written 2 million lines of complex application code on that framework, and none of it has to care about returning data. All framework-provided data is shared across application modules under RW locking: framework data providers mutate the data and application modules read it. With explicit returns, a simple bug in an application module where an object is returned twice or not returned at all (for example wrong or missing exception handling) would break data for other modules or lead to it never being released. Introducing, for example, any concept of tokens for leasing data would only be an unwanted complication. The current deployment handles 100m+ distributed data objects with more than 1 million updates per second, all running across 200 machines. Code changes are possible, but nothing is simple.
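
For illustration only, a minimal sketch of this pattern, not the framework's actual code. All class and method names (MarketData, DataRef, DataProvider, acquire, releaseCleared) are hypothetical; the only real APIs used are java.lang.ref.WeakReference and java.lang.ref.ReferenceQueue. The provider keeps only weak references, the application module's strong reference is the implicit "lease", and a reference showing up on the queue is the signal to release the coupled resource.

```java
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical data object handed out to application modules. */
final class MarketData {
    final String key;
    volatile double lastPrice;
    MarketData(String key) { this.key = key; }
}

/** Weak reference that remembers which subscription it belongs to. */
final class DataRef extends WeakReference<MarketData> {
    final String key;
    DataRef(MarketData data, ReferenceQueue<MarketData> queue) {
        super(data, queue);
        this.key = data.key;
    }
}

/** Hypothetical provider: holds data weakly, releases a subscription once its reference is enqueued. */
final class DataProvider {
    private final ReferenceQueue<MarketData> cleared = new ReferenceQueue<>();
    private final Map<String, DataRef> subscriptions = new ConcurrentHashMap<>();

    /** Returns the shared object; the caller's strong reference is the only "lease". */
    synchronized MarketData acquire(String key) {
        DataRef ref = subscriptions.get(key);
        MarketData data = (ref == null) ? null : ref.get();
        if (data == null) {
            data = new MarketData(key);
            subscriptions.put(key, new DataRef(data, cleared));
        }
        return data;
    }

    /** Periodic update: this is the regular dereference of the weak reference. */
    void update(String key, double price) {
        DataRef ref = subscriptions.get(key);
        MarketData data = (ref == null) ? null : ref.get();
        if (data != null) data.lastPrice = price;
    }

    /** Drops subscriptions whose references were enqueued; returns how many were released. */
    int releaseCleared() {
        int released = 0;
        DataRef ref;
        while ((ref = (DataRef) cleared.poll()) != null) {
            // remove(key, value) leaves a newer subscription for the same key untouched
            if (subscriptions.remove(ref.key, ref)) released++;
        }
        return released;
    }

    int activeSubscriptions() { return subscriptions.size(); }
}

final class WeakLeaseDemo {
    public static void main(String[] args) {
        DataProvider provider = new DataProvider();
        MarketData data = provider.acquire("EURUSD");
        provider.update("EURUSD", 1.1325);
        System.out.println("price seen by module: " + data.lastPrice);
        data = null;  // the module simply drops its reference; nothing is "returned"
        System.gc();  // only a hint; when (and whether) the reference is cleared is up to the collector
        System.out.println("released after GC hint: " + provider.releaseCleared());
    }
}
```

Note that application code has no release call at all, so the double-return and missing-return bugs cannot occur; the cost is that release timing depends entirely on when the collector clears the reference.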

> About a dozen years ago I’d concluded that weak
> references were much less useful than they sounded, because they seemed
> to require an additional GC cycle to be collected compared to unreferenced
> objects. Perhaps this wasn’t an obstacle in your scenario?

The additional GC cycle actually fits our needs very well. Sometimes we even delay the weak reference clearing by holding a strong reference in a timer, so that our data is not released immediately, as it might be required again very soon. For example, when a complex risk structure is provided for a single user accessing it from the UI, we want the provided data to survive a possible UI restart or the time of a machine crash. Sure, there are still some cases where we wish the weak reference were cleared immediately, but in the end we decided not to start playing with reference counting hell (in Java).
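
The timer trick can be sketched roughly as follows. This is an assumption about how such a delay might look, not the framework's code: GracePeriodPinner, pin and pinnedCount are hypothetical names, and the timer is a standard java.util.concurrent.ScheduledExecutorService. While an object sits in the pinned map it is strongly reachable, so no collector will clear weak references to it until the timer fires.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical helper: keeps objects strongly reachable for a grace period. */
final class GracePeriodPinner {
    private final Map<Long, Object> pinned = new ConcurrentHashMap<>();
    private final AtomicLong tickets = new AtomicLong();
    private final ScheduledExecutorService timer =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "grace-period-timer");
                t.setDaemon(true); // do not keep the JVM alive just for the timer
                return t;
            });

    /** Holds a strong reference to target until the delay has elapsed. */
    ScheduledFuture<?> pin(Object target, long delay, TimeUnit unit) {
        long ticket = tickets.incrementAndGet();
        pinned.put(ticket, target);
        // Block-bodied lambda so the Runnable overload of schedule() is chosen
        return timer.schedule(() -> { pinned.remove(ticket); }, delay, unit);
    }

    int pinnedCount() { return pinned.size(); }

    void shutdown() { timer.shutdownNow(); }
}

final class GracePeriodDemo {
    public static void main(String[] args) throws Exception {
        GracePeriodPinner pinner = new GracePeriodPinner();
        Object riskStructure = new Object();
        WeakReference<Object> weak = new WeakReference<>(riskStructure);
        ScheduledFuture<?> release = pinner.pin(riskStructure, 500, TimeUnit.MILLISECONDS);
        riskStructure = null; // the UI is gone, only the timer keeps the data alive
        System.gc();
        System.out.println("reachable during grace period: " + (weak.get() != null));
        release.get();        // wait until the timer has dropped its strong reference
        System.gc();          // a hint only; clearing is still up to the collector
        System.out.println("reachable after grace period: " + (weak.get() != null));
        pinner.shutdown();
    }
}
```

After the grace period the object is only weakly reachable again, and clearing follows the collector's normal weak reference processing.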

Michal

> -----Original Message-----
> From: Peter Booth <peter_booth at me.com>
> Sent: Friday, March 15, 2019 1:22
> To: Ing. Michal Frajt (Luxonit s.r.o.) <michal.frajt at luxonit.com>
> Cc: zgc-dev at openjdk.java.net
> Subject: Re: Frequently dereferenced weak references never get cleared
> 
> Michal,
> 
> I read your email with interest and my immediate reaction was “you’ve done
> well”. If I understand you correctly, you have built, and successfully used for
> over a decade, a framework that depends upon the prompt collection of
> weakly referenced objects. My strong reaction was because I hadn’t realized
> such a thing was practical. About a dozen years ago I’d concluded that weak
> references were much less useful than they sounded, because they seemed
> to require an additional GC cycle to be collected compared to unreferenced
> objects. Perhaps this wasn’t an obstacle in your scenario?
> 
> Regardless, my knee-jerk reaction when I read your message was “why don’t
> they change their framework to remove its use of weak references?”
> I’ve used C4 for about seven years now, both on Vega hardware and the Zing
> product, and I’ve enjoyed the benefits. I’m currently working on apps that
> use CMS and G1 and it feels like a step backwards. ZGC is an interesting
> alternative, in theory.
> 
> Peter
> 
> > On Mar 14, 2019, at 5:26 PM, Ing. Michal Frajt (Luxonit s.r.o.) <michal.frajt at luxonit.com> wrote:
> >
> > Hi all,
> >
> > We are evaluating ZGC for financial applications and so far have seen
> > promising results. However, there seems to be an issue with the handling
> > of weak references that is affecting our custom data distribution
> > framework.
> >
> > Since 2005 we have mainly been using the CMS collector for financial
> > applications based on a custom data distribution framework. The
> > framework is based on weak references, used in a similar way to smart
> > pointers in C++. Application code holds strong references, whereas the
> > framework provides all data via weak references only. The clearing of a
> > weak reference acts as an indication for the framework that the data is
> > no longer required by the application code. The data provided by the
> > framework is always coupled with a network resource (receiving market
> > updates) or a CPU resource (computing aggregations/risk/etc). The
> > framework has a huge interest in getting the weak reference cleared by
> > the GC, as it can then deregister the network subscription (less data to
> > read, less data to parse and handle) or stop CPU-intensive computations.
> > As long as the weak reference is not cleared (reported via the reference
> > queue), the framework must keep updating the provided data, which simply
> > requires dereferencing the weak references regularly. Such dereferencing
> > happens on average once per second for each weakly referenced object.
> >
> > In order to get the weak references cleared regularly, we were using
> > the incremental CMS (iCMS), which was always running in the background
> > and scanning the complete heap. When iCMS was announced as deprecated,
> > we started working on a HotSpot extension and introduced the new
> > CMSTriggerInterval parameter in 8u40
> > (https://bugs.java.com/bugdatabase/view_bug.do?bug_id=8038265) to
> > allow specifying a maximum time between CMS collections. This way all
> > our weak references (whose referents are not strongly referenced) are
> > cleared within one or two CMS iterations invoked by the
> > CMSTriggerInterval or for another reason (occupancy or an explicitly
> > invoked concurrent GC, for example). Usually we configure the
> > CMSTriggerInterval between 5 and 10 minutes.
> >
> > Some years back (2012) we evaluated the Azul C4 collector and recognized
> > an issue with its weak reference processing. After some discussion we
> > were finally told by Gil Tene that C4 will never clear a weak reference
> > if it is dereferenced within the C4 major cycle (at that time around
> > every 5 minutes). We were offered a C4 extension where a weak reference
> > may be dereferenced but the strong reference cannot be stored in any
> > object, only kept at the thread/call stack level. If stored, even for a
> > short moment, the reference again requires a full C4 major cycle without
> > dereferencing before it is cleared. Because of this and other reasons,
> > our client finally did not purchase the Azul C4 collector. We don't know
> > how C4 addresses this issue in the current implementation.
> >
> > Today we were very pleased with the initial ZGC testing results. The
> > test scenario with CMS has a ParNew STW pause of 150ms every 10 seconds.
> > With ZGC this changed to only 1ms every 5 minutes (if we understand the
> > reporting correctly). We immediately tested the weak reference
> > processing and unfortunately observed the same behaviour as with the
> > C4 collector. It seems that weak references which are frequently
> > dereferenced are never cleared. Invoking an explicit concurrent GC via
> > jcmd GC.run does not help either.
> >
> > Could you please explain to us how ZGC handles weak references in the
> > described scenario? Is there a way out for us?
> >
> > Your input would be much appreciated.
> >
> > Best regards,
> > Michal Frajt
> > Luxonit



