Code Review for WeakReference leak in the Logging API (6942989)

Daniel D. Daugherty daniel.daugherty at oracle.com
Mon Jun 14 09:26:20 PDT 2010


Eamonn,

Thanks for the review! I didn't know about ReferenceQueues so that's
a very interesting idea. I guess I need to get out of the VM codebase
a little more often... :-)

Jumping ahead in the e-mail thread, Jeremy Manson from Google has
offered the use of their fix for the problem. It looks like it uses
ReferenceQueues so I'll be taking a closer look at that today also.

More below...

On 6/11/2010 10:36 AM, Eamonn McManus wrote:
> I think an alternative approach to the one here would be to use a 
> global ReferenceQueue and a subclass of WeakReference that has a 
> pointer back to the Logger or LogNode that contains this 
> WeakReference. Then, in the cases where you currently check for stale 
> entries, you could simply poll the ReferenceQueue and visit any Logger 
> or LogNode that it is telling you has a cleared WeakReference.

ReferenceQueue is a very cool class.


> There are a couple of subtle points, though. First, you'll need to be 
> careful about synchronization before clearing WeakReferences, since 
> the Logger you're being told about isn't necessarily related to the 
> current one. Logger.kids is protected by a global lock treeLock but 
> LogManager.loggers is protected by the LogManager's lock, so when 
> logManager1.addLogger detects that a logger belonging to logManager2 
> has a stale WeakReference, it needs to drop the logManager1 lock and 
> acquire the logManager2 one before scanning the reference list. (This 
> means that addLogger has to change from being a synchronized method to 
> have a big synchronized(this) block with a non-synchronized tail-end.)

So you're proposing using ReferenceQueues for all of the stuff
that uses WeakReferences in the Logging API and not just for the
two leaks that I've found. Consistency is a good idea so I'll
keep that in mind also.


> Second, since apparently the user can introduce loops in the Logger 
> hierarchy, the reference from the WeakReference subclass back to the 
> parent Logger will itself need to be a WeakReference.

This subtle point is a little too subtle for my brain right now, but
I'll try to keep it in mind also. Since I've got an VM/NSK test that
blows up when the loops aren't accounted for, I'll have a reminder.


> Nevertheless, I think this approach is likely to be less complicated 
> than the proposed one, notably because there's no need to protect 
> against infinite recursion; and the performance concerns should be 
> less because you only look for cleared WeakReferences when you know 
> they are there, and you know where to look.

I don't think I'll be able to convince anyone that a fix using
ReferenceQueues is less complicated. The fix I wrote is a pretty
simple, big hammer approach. ReferenceQueues, by your own proposal,
introduce subtleties that are less than obvious.

However, your proposal to use ReferenceQueues is a good idea and I'll
investigate it today. It helps that Jeremy Manson's fix also uses
ReferenceQueues.

Dan



> Éamonn
>
> Alan Bateman wrote:
>> Daniel D. Daugherty wrote:
>>> :
>>>
>>> Here is the URL for the webrev:
>>>
>>>    http://cr.openjdk.java.net/~dcubed/6942989-webrev/0/
>>>
>> I went through the changes too.
>>
>> I agree with the performance concern with expunging stale entries 
>> each time a LogManager is created. Would it make sense to do this 
>> once every N times rather than every time? Or maybe keep a timestamp 
>> so that you limit the number of times your attempt to expunge in some 
>> time period? (say a maximum of once a minute).
>>
>> Does LogManager.deleteStaleWeakRefs need to be synchronized with any 
>> other operations?
>>
>> I wonder if Logger.kidsCleanupMarker is the best name for this [ kids 
>> and markers usually involve clean-up but that's not what is meant 
>> here :-)  ]. Maybe rename to something that includes the word 
>> "sentinel"?
>>
>> Is is necessary to use jmap in the regression test? Would it be 
>> simpler to have a pure java test that runs with a small heap size? If 
>> it leaks then it fails with OOME.
>>
>> -Alan.
>>
>>
>>
>>


More information about the hotspot-runtime-dev mailing list