Benefits of activeReferenceQueue (was: ReferenceQueue.remove to allow GC of the queue itself)

Tue Jul 29 08:16:24 UTC 2014

Hi Peter,

Thanks for the detailed explanation. It is an interesting problem. But 
at this stage my only conclusion is that we are nowhere near the point 
of deciding to add a new method to a public API.

I think the fundamental flaw of activeReferenceQueue is in trying to 
hide the thread management from the user. By automating the threading 
there is no way to control the lifecycle of the thread and hence we have 
the problem at hand. When the application code manages the thread 
itself, then it can also manage its lifecycle and avoid the problem.
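
As a rough illustration of application-managed threading, here is a
minimal sketch. The class name ManagedCleaner and its methods are
hypothetical (not part of the JDK or NetBeans); it assumes references
needing cleanup implement Runnable, as in the activeReferenceQueue
contract, and that the application calls close() at a well-defined
point such as WAR undeployment.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;

// Hypothetical helper: the application owns both the queue and the thread,
// so it decides when the thread stops.
final class ManagedCleaner implements AutoCloseable {
    private final ReferenceQueue<Object> queue = new ReferenceQueue<>();
    private final Thread worker;

    ManagedCleaner(String name) {
        worker = new Thread(this::drain, name);
        worker.setDaemon(true);
        worker.start();
    }

    // Pass this queue to the constructors of references that need cleanup.
    ReferenceQueue<Object> queue() {
        return queue;
    }

    private void drain() {
        try {
            while (true) {
                Reference<?> ref = queue.remove(); // blocks until something is enqueued
                if (ref instanceof Runnable) {
                    try {
                        ((Runnable) ref).run();
                    } catch (RuntimeException e) {
                        // One failing cleanup should not kill the thread.
                    }
                }
            }
        } catch (InterruptedException e) {
            // close() was called: fall through and let the thread end.
        }
    }

    // Explicit end of lifecycle: interrupt the worker and wait for it to exit.
    @Override
    public void close() {
        worker.interrupt();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    boolean isRunning() {
        return worker.isAlive();
    }
}
```

Because the thread exits on close(), nothing keeps the queue (or the
class loader that loaded it) strongly reachable afterwards.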

This might be a case where some kind of user-level "reference counting" 
would be a better solution. But of course that would require changes to 
all existing users of the class.
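
To make the reference-counting idea concrete, a minimal sketch follows.
Everything in it is an assumption for illustration: CountedCleaner,
acquire(), release() and isActive() are hypothetical names, and the
polling thread is simply started by the first user and stopped by the
last one.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;

// Hypothetical sketch: users call acquire() before using the queue and
// release() when done; the polling thread lives only while the count is > 0.
final class CountedCleaner {
    private final ReferenceQueue<Object> queue = new ReferenceQueue<>();
    private int users;      // guarded by this
    private Thread worker;  // guarded by this

    synchronized ReferenceQueue<Object> acquire() {
        if (users++ == 0) {
            // First user: start the polling thread.
            worker = new Thread(this::drain, "counted-cleaner");
            worker.setDaemon(true);
            worker.start();
        }
        return queue;
    }

    synchronized void release() {
        if (users == 0) {
            throw new IllegalStateException("release() without acquire()");
        }
        if (--users == 0) {
            // Last user: ask the polling thread to stop.
            worker.interrupt();
            worker = null;
        }
    }

    synchronized boolean isActive() {
        return worker != null;
    }

    private void drain() {
        try {
            while (true) {
                Reference<?> ref = queue.remove(); // blocks until something is enqueued
                if (ref instanceof Runnable) {
                    ((Runnable) ref).run();
                }
            }
        } catch (InterruptedException e) {
            // Last user released the queue: let the thread end.
        }
    }
}
```

A user that forgets to call release() brings the original leak back,
which is the cost of moving this responsibility into user code.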

David
-----

On 29/07/2014 6:03 PM, Peter Levart wrote:
>
> On 07/29/2014 04:16 AM, David Holmes wrote:
>> Hi Jaroslav,
>>
>> So ... activeReferenceQueue is a reference queue that embodies a
>> thread that does the polling and implements a pseudo-finalization
>> mechanism. This works fine in the normal case where the lifetime of
>> the queue is the lifetime of the "application". In the WAR case (and I
>> don't know the details of WAR deployment) each time it is deployed in
>> the same VM we get a new activeReferenceQueue and a new thread.
>>
>> The basic issue is that the thread has a strong reference to the queue
>> and has no idea when it should exit and so the thread and queue remain
>> forever even if there is no user code with a reference to the queue -
>> does that sum it up?
>>
>> Can the thread not hold a WeakReference to the queue and poll using
>> remove(timeout) and then terminate when the queue reference is gone?
>>
>> Thanks,
>> David
>
> The main problem, I think, is that when calling queue.remove(timeout) the
> thread *does* hold a strong reference to the queue for the entire time
> it waits for something to be enqueued. So for the majority of the time in
> the loop, the thread holds a strong reference to the queue, preventing it
> from being considered for WeakReference processing, or at least prolonging
> its life into an indefinite future (the window of opportunity for the
> queue to be found weakly reachable is very small).
>
> That's why Jaroslav is hacking with reflection to get hold of the 'lock'
> so that he can re-implement the ReferenceQueue.remove(timeout) method
> without holding a strong reference to the queue. If we wanted to make
> Jaroslav's life easier, a method like the following in ReferenceQueue
> could help him:
>
>
> public class ReferenceQueue<T> {
>      ...
>
>      public static class CollectedException extends Exception {}
>
>      public static <T> Reference<? extends T> remove(
>          Reference<? extends ReferenceQueue<T>> queueRef,
>          long timeout
>      ) throws IllegalArgumentException, InterruptedException,
>               CollectedException {
>
>          if (timeout < 0) {
>              throw new IllegalArgumentException("Negative timeout value");
>          }
>          // obtain lock
>          Object lock = apply(queueRef, queue -> queue.lock);
>
>          synchronized (lock) {
>              Reference<? extends T> r = apply(queueRef,
>                  ReferenceQueue<T>::reallyPoll);
>              if (r != null) return r;
>              long start = (timeout == 0) ? 0 : System.nanoTime();
>              for (; ; ) {
>                  lock.wait(timeout);
>                  r = apply(queueRef, ReferenceQueue<T>::reallyPoll);
>                  if (r != null) return r;
>                  if (timeout != 0) {
>                      long end = System.nanoTime();
>                      timeout -= (end - start) / 1000_000;
>                      if (timeout <= 0) return null;
>                      start = end;
>                  }
>              }
>          }
>
>      }
>
>      private static <R, T> R apply(
>          Reference<? extends ReferenceQueue<T>> queueRef,
>          Function<ReferenceQueue<T>, R> func
>      ) throws CollectedException {
>          ReferenceQueue<T> queue = queueRef.get();
>          if (queue == null) throw new CollectedException();
>          return func.apply(queue);
>      }
>
>      ...
>
>
> This is basically a re-implementation of the ReferenceQueue.remove(long
> timeout) instance method as a static method, which only briefly
> "touches" the queue instance wrapped in a Reference; for the majority of
> the time it waits for notification while holding only a Reference to the
> queue.
>
> But that's only half of the story. The other half is how a thread finds
> out that the queue has been collected so it can exit. One way is
> polling, which is what Jaroslav would like to avoid. Another way is
> wrapping the queue with a WeakReference and waiting for it to be enqueued
> in some other queue which can be monitored by another thread.
>
> I tried an experiment with re-using the queue whose collection we would
> like to be notified about as the target queue for a WeakReference
> wrapping this same queue. This doesn't work, of course, since it assumes
> that a WeakReference will be enqueued into a queue that's already
> gone. The WeakReference does get cleared, but no enqueueing happens and
> hence no notification gets delivered.
>
> Regards, Peter
>
>>
>> On 28/07/2014 11:06 PM, Jaroslav Tulach wrote:
>>> Hello David,
>>>
>>> thanks for being patient with me. I'll do my best to describe the
>>> original context.
>>>
>>> On Mon, 28 July 2014 21:07:45, David Holmes wrote:
>>>
>>>  > I read the issue and still did not understand the nature of the
>>>  > problem. The netbeans bugs also did not shed any light on things
>>>  > for me. What is the functionality of the activeReferenceQueue
>>>
>>> The functionality of the active reference queue is described in NetBeans
>>> APIs[1]. I think the best way to describe it in context of existing JDK
>>> APIs, is to call it "lightweight finalizer without classical finalizer
>>> problems". To quote the Javadoc:
>>>
>>> ---
>>>
>>> If you have a reference that needs cleanup, make it implement Runnable
>>> and register it with the queue:
>>>
>>> class MyReference extends WeakReference implements Runnable {
>>>     private final OtherInfo dataToCleanUp;
>>>
>>>     public MyReference(Thing ref, OtherInfo data) {
>>>         super(ref, Utilities.activeReferenceQueue());
>>>         dataToCleanUp = data;
>>>     }
>>>
>>>     public void run() {
>>>         dataToCleanUp.releaseOrWhateverYouNeed();
>>>     }
>>> }
>>>
>>> When the ref object is garbage collected, your run method will be
>>> invoked by calling ((Runnable) reference).run()
>>>
>>> --
>>>
>>> The benefit taken from "finalizer" is that one does not need to start
>>> one's own thread. The difference from "finalizer" is that the object is
>>> already gone, i.e. there is no chance to re-activate it.
>>>
>>> We introduced the activeReferenceQueue API when we realized that many
>>> modules across the code base started their own thread and tried to do
>>> the classical poll() cleanup. Once upon a time we used to have more than
>>> twenty threads like this, and as the overhead of a thread is not low, we
>>> improved the NetBeans memory consumption quite a lot by introducing the
>>> activeReferenceQueue.
>>>
>>>  > and what it is that there are problems with?
>>>
>>> None in case of NetBeans. Once the activeReferenceQueue initializes
>>> itself and its thread, it runs up until the termination of the system
>>> and works great.
>>>
>>> However NetBeans APIs can be used outside of NetBeans runtime container
>>> and, when used in a WAR file, people started to get problems during
>>> re-deploys.
>>>
>>>> Once we got a bug report[2] that it behaves poorly when used inside
>>>> of a WAR file. Whenever the WAR was redeployed, the number of cleanup
>>>> threads increased by one, which also caused major memory leaks.
>>>
>>> Those problems could be fixed by using active polling, as I wrote in
>>> this morning's email:
>>>
>>>  > class Impl extends ReferenceQueue {}
>>>  > Reference<Impl> ref = new WeakReference<Impl>(new Impl());
>>>  >
>>>  > while (true) {
>>>  >     Impl impl = ref.get();
>>>  >     if (impl == null) {
>>>  >         // no other Reference objects using the Impl queue.
>>>  >         // exit this cleaner thread
>>>  >         return;
>>>  >     }
>>>  >     Reference<?> r = impl.remove(15000);
>>>  >     if (r == null) {
>>>  >         impl = null; // don't hold strong ref to Impl queue
>>>  >         System.gc(); // XXX: is anyone else holding reference to Impl queue?
>>>  >         continue;
>>>  >     }
>>>  >     // do something with r
>>>  > }
>>>
>>>  >
>>>
>>>  > This could work, although the problem is the XXX part.
>>>  >
>>>  > I need to release my own pointer to the Impl queue and tell the
>>>  > system to try to garbage collect it. If it has not been removed, grab
>>>  > a new strong pointer to the Impl queue and wait again. I am not aware
>>>  > of any other way to ask for GC than System.gc, and having System.gc
>>>  > called every 15s will likely decrease VM performance a bit.
>>>  >
>>>  > The proper solution (no reflection, no repeated polling) would in
>>>  > fact be simple: being able to call:
>>>  >
>>>  > impl.remove();
>>>  >
>>>  > without anyone having a strong reference to impl, i.e. without impl
>>>  > being on the stack during the remove call.
>>>
>>> I don't know what else to add, so I'll wait for further questions.
>>>
>>> -jt
>>>
>>> [1]
>>> http://bits.netbeans.org/dev/javadoc/org-openide-util/org/openide/util/Utilities.html#activeReferenceQueue()
>>>
>>>
>>> [2] https://netbeans.org/bugzilla/show_bug.cgi?id=206621
>>>
>