Second Code Review for WeakReference leak in the Logging API (6942989)

Daniel D. Daugherty daniel.daugherty at oracle.com
Tue Jun 29 09:10:50 PDT 2010


Jeremy,

Closing the loop on this part of the thread. I don't think there are
any more leaks left after the fix is applied. Here are the entries
I added the "public comments" section of the bug. For some reason
that bug is _still_ not showing up on the OpenJDK site.

> === *Public Comments* ========================================================
> I ran the AnonLoggerWeakRefLeak test with the following env variables:
>
> TESTJAVA=c:/work/shared/mirrors/jdks-win32/jdk/1.7.0/latest/binaries/windows-i586
> TESTCLASSES=C:/cygwin/home/dcubed/Projects/6942989/XXX
> TESTVMOPTS=-showversion -client -Xmx32m
> TESTSRC=C:/cygwin/home/dcubed/Projects/6942989/XXX
>
> along with a 30 minute duration (1800 seconds) using JDK7-B99 bits
> (which does NOT have the fix).
>
> Here's the last 15 lines of the log files:
>
> $ tail -15 AnonLogger.jdk7-B99_30min.log
> INFO: instance_cnt = 1032505
> INFO: instance_cnt = 1038305
> INFO: instance_cnt = 1044005
> INFO: instance_cnt = 1049705
> INFO: jmap exited with exit code = 1
> INFO: The likely reason is that AnonLoggerWeakRefLeak has finished running.
> INFO: increasing_cnt = 181
> INFO: decreasing_cnt = 0
> INFO: The instance count of java.lang.ref.WeakReference objects
> INFO: is always increasing.
> FAIL: This indicates that there is a memory leak.
>
> real    18m10.296s
> user    0m41.570s
> sys     0m46.442s
>
> $ tail -15 AnonLoggerWeakRefLeak.jdk7-B99_30min.log
> INFO: call count = 1044000
> INFO: call count = 1045000
> INFO: call count = 1046000
> INFO: call count = 1047000
> INFO: call count = 1048000
> INFO: call count = 1049000
> Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
>         at java.util.Arrays.copyOf(Arrays.java:2239)
>         at java.util.Arrays.copyOf(Arrays.java:2213)
>         at java.util.ArrayList.grow(ArrayList.java:208)
>         at java.util.ArrayList.ensureCapacity(ArrayList.java:182)
>         at java.util.ArrayList.add(ArrayList.java:406)
>         at java.util.logging.Logger.doSetParent(Logger.java:1401)
>         at java.util.logging.Logger.getAnonymousLogger(Logger.java:374)
>         at AnonLoggerWeakRefLeak.main(AnonLoggerWeakRefLeak.java:60)
>
>
> So it took a little over 18 minutes for a Client VM with a 32MB
> max heap size to throw an OOME with AnonLoggerWeakRefLeak.
>
>
> I ran the LoggerWeakRefLeak test with the same environment variables
> along with a 30 minute duration (1800 seconds) using JDK7-B99 bits
> (which does NOT have the fix).
>
> Here's the last 15 lines of the log files:
>
> $ tail -15 Logger.jdk7-B99_30min.log
> INFO: instance_cnt = 935705
> INFO: instance_cnt = 938605
> INFO: instance_cnt = 941205
> INFO: instance_cnt = 943205
> INFO: jmap exited with exit code = 1
> INFO: The likely reason is that LoggerWeakRefLeak has finished running.
> INFO: increasing_cnt = 160
> INFO: decreasing_cnt = 0
> INFO: The instance count of java.lang.ref.WeakReference objects
> INFO: is always increasing.
> FAIL: This indicates that there is a memory leak.
>
> real    16m47.690s
> user    0m37.910s
> sys     0m40.002s
>
> $ tail -15 LoggerWeakRefLeak.jdk7-B99_30min.log
> INFO: call count = 881000
> INFO: call count = 882000
> INFO: call count = 883000
> INFO: call count = 884000
> INFO: call count = 885000
> INFO: call count = 886000
> INFO: call count = 887000
> INFO: call count = 888000
> INFO: call count = 889000
> INFO: call count = 890000
> INFO: call count = 891000
> INFO: call count = 892000
> INFO: call count = 893000
> INFO: call count = 894000
> Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
>
>
> So it took a little less than 17 minutes for a Client VM with a 32MB
> max heap size to throw an OOME with LoggerWeakRefLeak.
>
> *** (#1 of 3): 2010-06-28 22:45:03 GMT+00:00 daniel.daugherty at oracle.com
> *** Last Edit: 2010-06-28 22:57:39 GMT+00:00 daniel.daugherty at oracle.com
>
> I ran the AnonLoggerWeakRefLeak test with the same environment
> variables for a 60 minute duration using the fixed bits from
> JPRT. Here are the last 15 lines of each log (much less
> interesting info):
>
> $ tail -15 AnonLogger.jdk7-B100+_60min.log
> INFO: instance_cnt = 5801
> INFO: instance_cnt = 5901
> INFO: instance_cnt = 5701
> INFO: instance_cnt = 5601
> INFO: jmap exited with exit code = 1
> INFO: The likely reason is that AnonLoggerWeakRefLeak has finished running.
> INFO: increasing_cnt = 227
> INFO: decreasing_cnt = 383
> INFO: The instance count of java.lang.ref.WeakReference objects
> INFO: is both increasing and decreasing.
> PASS: This indicates that there is not a memory leak.
>
> real    60m5.099s
> user    2m18.882s
> sys     2m31.127s
>
> $ tail -15 AnonLoggerWeakRefLeak.jdk7-B100+_60min.log
> INFO: call count = 3484000
> INFO: call count = 3485000
> INFO: call count = 3486000
> INFO: call count = 3487000
> INFO: call count = 3488000
> INFO: call count = 3489000
> INFO: call count = 3490000
> INFO: call count = 3491000
> INFO: call count = 3492000
> INFO: call count = 3493000
> INFO: call count = 3494000
> INFO: call count = 3495000
> INFO: call count = 3496000
> INFO: call count = 3497000
> INFO: final loop count = 3497200
>
>
> I ran the LoggerWeakRefLeak test with the same environment
> variables for a 60 minute duration using the fixed bits from
> JPRT. Here are the last 15 lines of each log (much less
> interesting info):
>
> $ tail -15 Logger.jdk7-B100+_60min.log
> INFO: instance_cnt = 1813
> INFO: instance_cnt = 1638
> INFO: instance_cnt = 1514
> INFO: instance_cnt = 1511
> INFO: jmap exited with exit code = 1
> INFO: The likely reason is that LoggerWeakRefLeak has finished running.
> INFO: increasing_cnt = 293
> INFO: decreasing_cnt = 318
> INFO: The instance count of java.lang.ref.WeakReference objects
> INFO: is both increasing and decreasing.
> PASS: This indicates that there is not a memory leak.
>
> real    60m6.783s
> user    2m20.592s
> sys     2m31.997s
>
> $ tail -15 LoggerWeakRefLeak.jdk7-B100+_60min.log
> INFO: call count = 3502000
> INFO: call count = 3503000
> INFO: call count = 3504000
> INFO: call count = 3505000
> INFO: call count = 3506000
> INFO: call count = 3507000
> INFO: call count = 3508000
> INFO: call count = 3509000
> INFO: call count = 3510000
> INFO: call count = 3511000
> INFO: call count = 3512000
> INFO: call count = 3513000
> INFO: call count = 3514000
> INFO: call count = 3515000
> INFO: final loop count = 3515800
>
> *** (#2 of 3): 2010-06-29 01:46:49 GMT+00:00 daniel.daugherty at oracle.com
>
> I ran the AnonLoggerWeakRefLeak test with the same environment
> variables for a 12 hour duration using the fixed bits from JPRT.
> For this run I saved the "Total" line from the jmap output from
> every 10th sample:
>
> $ diff ../AnonLoggerWeakRefLeak.sh AnonLoggerWeakRefLeak.sh
> 161a162
>   
>>> > > "$TEST_NAME.totals"
>>>       
> 225a227,233
>   
>> > 
>> > set +e
>> > mod=`expr "$loop_cnt" % 10`
>> > set -e
>> > if [ "$mod" = 0 ]; then
>> > tail -1 "$TEST_NAME.jmap" >> "$TEST_NAME.totals"
>> > fi
>>     
>
> Here is an analysis of the .totals data:
>
> $ sh analyze_totals < AnonLoggerWeakRefLeak.totals
>           #objs      #bytes
> first:    30537     2243648
> lo:       30537     2243648
> hi:       57072     3197904
> last:     35676     2882488
> avg:      36853     2929982
> # samples: 647
>
> The first sample is also the lowest set of values which isn't
> a surprise given that the first sample is taken shortly after
> the target program has started running. The hi value occurred
> in sample #269 of 647 and the last sample was below the average.
> This data indicates that the values are both rising and falling
> over time which does not indicate any more memory leaks.
>
>
>
> I also ran the LoggerWeakRefLeak test with the same environment
> variables for a 12 hour duration using the fixed bits from JPRT.
> For this run I saved the "Total" line from the jmap output from
> every 10th sample:
>
> $ diff ../LoggerWeakRefLeak.sh LoggerWeakRefLeak.sh
> 161a162
>   
>>> > > "$TEST_NAME.totals"
>>>       
> 225a227,233
>   
>> > 
>> > set +e
>> > mod=`expr "$loop_cnt" % 10`
>> > set -e
>> > if [ "$mod" = 0 ]; then
>> > tail -1 "$TEST_NAME.jmap" >> "$TEST_NAME.totals"
>> > fi
>>     
>
> Here is an analysis of the .totals data:
>
> $ sh analyze_totals < LoggerWeakRefLeak.totals
>           #objs      #bytes
> first:    48957     2832648
> lo:       48957     2832648
> hi:      148299     6299016
> last:    133173     5451032
> avg:     137578     5608654
> # samples: 647
>
> The first sample is also the lowest set of values which isn't
> a surprise given that the first sample is taken shortly after
> the target program has started running. The hi value occurred
> in sample #76 of 647 and the last sample was below the average.
> This data indicates that the values are both rising and falling
> over time which does not indicate any more memory leaks.
>
> *** (#3 of 3): 2010-06-29 15:59:39 GMT+00:00 daniel.daugherty at oracle.com


Please let me know if the fix is not working for you in your
environment at Google.

Dan


On 6/23/2010 6:38 PM, Daniel D. Daugherty wrote:
> On 6/23/2010 12:38 PM, Jeremy Manson wrote:
>> Hi Daniel,
>>
>> I'm sorry I missed this (I heavily filter these lists, and check 
>> rarely).
>>   
>
> This time I specifically left you on the "To" list rather than
> editing down to just the list aliases.
>
>
>> My main feeling is that you are missing a good bet by not
>> reconstructing the Hashtable in LogManager and the ArrayList in Logger
>> every so often when you remove the loggers.  In a test case where
>> there are a LOT of short-lived loggers, the backing array for the
>> Hashtable can get very big.  It is permanent, and doesn't go anywhere.
>>  You can end up with a lot of extra memory lying around that way.
>>   
>
> It's possible that is a good bet. However, that wasn't mentioned
> as one of the issues in either bug report (I think) so I didn't
> write a test for that.
>
>
>> Specifically, when I didn't reconstruct those data structures, the
>> test case listed in the bug (where it just creates lots and lots of
>> anonymous loggers) killed the Java instance with an OOM, even if I
>> *did* clean up the weakreferences to the loggers.
>>   
>
> I'll create a variant of the anon logger test that I put in the
> changeset and checkout if I can kill the Java instance with an OOM.
> I'll keep you posted.
>
>
>> I'm assuming you have a customer waiting for this - if that is similar
>> to their usage pattern, this fix may not fix their problem.
>>   
>
> Thanks for the heads up. The JDK6 version of the fix hasn't been tested
> by the customer yet so you might be right.
>
>
>> You obviously don't want to rebuild those structures every time,
>> though.  What I did in my change was to reconstruct the backing data
>> structures every time ~as many loggers were collected as were present
>> in the data structure.
>>   
>
> Yup. I caught that part of the rebuild algorithm. It's just that the
> reason for doing the rebuild didn't jump out at me.
>
> Dan
>
>
>> Jeremy
>>
>> On Fri, Jun 18, 2010 at 12:25 PM, Daniel D. Daugherty
>> <daniel.daugherty at oracle.com> wrote:
>>  
>>> Greetings,
>>>
>>> I have a new version of my fix for the WeakReference leak in the
>>> Logging API done. This version uses ReferenceQueues; thanks to Eamonn
>>> McManus, Jeremy Manson and Tony Printezis for their insights on using
>>> ReferenceQueues. Here's a pointer to Tony's paper for background info:
>>>
>>>    http://java.sun.com/developer/technicalArticles/javase/finalization/
>>>
>>> This version also has limits on the number of dead Loggers that are
>>> cleaned up per call; thanks to Alan Bateman for politely pushing me in
>>> that direction.
>>>
>>> The webrev is again relative to OpenJDK7, but the bug is escalated so
>>> the fix will be backported to the JDK6-Update train. So again, I'll
>>> need a minimum of two code reviewers.
>>>
>>> Here is the URL for the webrev:
>>>
>>>    http://cr.openjdk.java.net/~dcubed/6942989-webrev/1/
>>>
>>> Thanks, in advance, for any reviews.
>>>
>>> Dan
>>>
>>>
>>>     
>


More information about the hotspot-runtime-dev mailing list