Code Review for WeakReference leak in the Logging API (6942989)
Daniel D. Daugherty
daniel.daugherty at oracle.com
Mon Jun 14 09:26:20 PDT 2010
Eamonn,
Thanks for the review! I didn't know about ReferenceQueues so that's
a very interesting idea. I guess I need to get out of the VM codebase
a little more often... :-)
Jumping ahead in the e-mail thread, Jeremy Manson from Google has
offered the use of their fix for the problem. It looks like it uses
ReferenceQueues so I'll be taking a closer look at that today also.
More below...
On 6/11/2010 10:36 AM, Eamonn McManus wrote:
> I think an alternative approach to the one here would be to use a
> global ReferenceQueue and a subclass of WeakReference that has a
> pointer back to the Logger or LogNode that contains this
> WeakReference. Then, in the cases where you currently check for stale
> entries, you could simply poll the ReferenceQueue and visit any Logger
> or LogNode that it is telling you has a cleared WeakReference.
ReferenceQueue is a very cool class.
> There are a couple of subtle points, though. First, you'll need to be
> careful about synchronization before clearing WeakReferences, since
> the Logger you're being told about isn't necessarily related to the
> current one. Logger.kids is protected by a global lock treeLock but
> LogManager.loggers is protected by the LogManager's lock, so when
> logManager1.addLogger detects that a logger belonging to logManager2
> has a stale WeakReference, it needs to drop the logManager1 lock and
> acquire the logManager2 one before scanning the reference list. (This
> means that addLogger has to change from being a synchronized method to
> have a big synchronized(this) block with a non-synchronized tail-end.)
So you're proposing using ReferenceQueues for all of the stuff
that uses WeakReferences in the Logging API and not just for the
two leaks that I've found. Consistency is a good idea so I'll
keep that in mind also.
> Second, since apparently the user can introduce loops in the Logger
> hierarchy, the reference from the WeakReference subclass back to the
> parent Logger will itself need to be a WeakReference.
This subtle point is a little too subtle for my brain right now, but
I'll try to keep it in mind also. Since I've got an VM/NSK test that
blows up when the loops aren't accounted for, I'll have a reminder.
> Nevertheless, I think this approach is likely to be less complicated
> than the proposed one, notably because there's no need to protect
> against infinite recursion; and the performance concerns should be
> less because you only look for cleared WeakReferences when you know
> they are there, and you know where to look.
I don't think I'll be able to convince anyone that a fix using
ReferenceQueues is less complicated. The fix I wrote is a pretty
simple, big hammer approach. ReferenceQueues, by your own proposal,
introduce subtleties that are less than obvious.
However, your proposal to use ReferenceQueues is a good idea and I'll
investigate it today. It helps that Jeremy Manson's fix also uses
ReferenceQueues.
Dan
> Éamonn
>
> Alan Bateman wrote:
>> Daniel D. Daugherty wrote:
>>> :
>>>
>>> Here is the URL for the webrev:
>>>
>>> http://cr.openjdk.java.net/~dcubed/6942989-webrev/0/
>>>
>> I went through the changes too.
>>
>> I agree with the performance concern with expunging stale entries
>> each time a LogManager is created. Would it make sense to do this
>> once every N times rather than every time? Or maybe keep a timestamp
>> so that you limit the number of times your attempt to expunge in some
>> time period? (say a maximum of once a minute).
>>
>> Does LogManager.deleteStaleWeakRefs need to be synchronized with any
>> other operations?
>>
>> I wonder if Logger.kidsCleanupMarker is the best name for this [ kids
>> and markers usually involve clean-up but that's not what is meant
>> here :-) ]. Maybe rename to something that includes the word
>> "sentinel"?
>>
>> Is is necessary to use jmap in the regression test? Would it be
>> simpler to have a pure java test that runs with a small heap size? If
>> it leaks then it fails with OOME.
>>
>> -Alan.
>>
>>
>>
>>
More information about the serviceability-dev
mailing list