Full GC run freeing no memory despite many unreachable objects

Jon Masamitsu jon.masamitsu at oracle.com
Thu Nov 11 14:48:09 UTC 2010


Darragh,

The GC thinks that all the data in the heap is live
so that is why no space is being recovered
at a full GC.  There is no time limit on the collections
(for this collector).  The full collection does complete.
I don't know why the tools you've used say the objects
are unreachable.  My tendency is, of course, to believe
the GC before the tools :-).

Jon

On 11/11/10 03:50, Darragh Curran wrote:
> Thanks Ramki,
>
> I know what I really need to do is find how to consistently reproduce
> this, then look at ways to either fix a problem with our code, tune gc
> settings or (least likely) discover a bug in hotspot.
>
> Before I do that, I was wondering if anyone had insight into what
> would cause the expensive full gc runs consuming 100% CPU and lasting
> for ~10 seconds without freeing any memory, despite there being many
> unreachable objects (according to the heap dump).
>
> Is there a cap on the length of a full GC run? Perhaps it's taking all
> it's time to traverse the heap and gets cancelled before it does any
> collecting?
>
> Best regards,
> Darragh
>
>
> On Wed, Nov 10, 2010 at 6:03 PM, Y. S. Ramakrishna
> <y.s.ramakrishna at oracle.com> wrote:
>   
>> Have you tried say doubling the size of your heap (old gen and
>> perm gen) to see what happens to the problem? May be you are using a
>> heap that is much too small for your application's natural footprint on
>> the platform on which you are running it? If you believe the process
>> heap needs should not be so large, check what the heap contains --
>> does a jmap -histo:live show many objects as live that you believe
>> should have been collected? Have you tried to determine, using a
>> tool such as jhat on these jmap heap dumps, to see if your application
>> has a real leak after all?
>>
>> If you have determined that you do not have a leak, can you share
>> a test case? (Also please first try the latest 6u23 JDK before
>> you try to design a test case.)
>>
>> best regards.
>> -- ramki
>>
>>
>> On 11/10/10 09:22, Darragh Curran wrote:
>>     
>>> Ooops, Here's the jstat output with headings:
>>>
>>> S0C    S1C    S0U    S1U      EC       EU        OC         OU
>>> PC     PU    YGC     YGCT    FGC    FGCT     GCT
>>> 14400.0 13824.0  0.0    0.0   517888.0 517888.0 1092288.0  1060013.9
>>> 37376.0 37182.0  38654 1288.741 1105  5636.779 6925.520
>>> 14400.0 13824.0  0.0    0.0   517888.0 517888.0 1092288.0  1060013.9
>>> 37376.0 37182.0  38654 1288.741 1105  5636.779 6925.520
>>> 14400.0 13824.0  0.0    0.0   517888.0 517888.0 1092288.0  1060013.9
>>> 37376.0 37182.0  38654 1288.741 1105  5636.779 6925.520
>>> 14400.0 13824.0  0.0    0.0   517888.0 517888.0 1092288.0  1060013.9
>>> 37376.0 37182.0  38654 1288.741 1105  5636.779 6925.520
>>> 14400.0 13824.0  0.0    0.0   517888.0 517888.0 1092288.0  1060013.9
>>> 37376.0 37182.0  38654 1288.741 1105  5636.779 6925.520
>>> 14400.0 13824.0  0.0    0.0   517888.0 517888.0 1092288.0  1060013.9
>>> 37376.0 37182.0  38654 1288.741 1105  5636.779 6925.520
>>> 14400.0 13824.0  0.0    0.0   517888.0 517888.0 1092288.0  1060013.9
>>> 37376.0 37182.0  38654 1288.741 1105  5636.779 6925.520
>>> 14400.0 13824.0  0.0    0.0   517888.0 517888.0 1092288.0  1060013.9
>>> 37376.0 37182.0  38654 1288.741 1105  5636.779 6925.520
>>> 14400.0 13824.0  0.0    0.0   517888.0 517888.0 1092288.0  1060014.0
>>> 37376.0 37182.0  38654 1288.741 1106  5646.857 6935.598
>>> 14400.0 13824.0  0.0    0.0   517888.0 517888.0 1092288.0  1060014.0
>>> 37376.0 37182.0  38654 1288.741 1106  5646.857 6935.598
>>> 14400.0 13824.0  0.0    0.0   517888.0 517888.0 1092288.0  1060014.0
>>> 37376.0 37182.0  38654 1288.741 1106  5646.857 6935.598
>>> 14400.0 13824.0  0.0    0.0   517888.0 517888.0 1092288.0  1060014.0
>>> 37376.0 37182.0  38654 1288.741 1106  5646.857 6935.598
>>> 14400.0 13824.0  0.0    0.0   517888.0 517888.0 1092288.0  1060014.0
>>> 37376.0 37182.0  38654 1288.741 1106  5646.857 6935.598
>>> 14400.0 13824.0  0.0    0.0   517888.0 517888.0 1092288.0  1060014.0
>>> 37376.0 37182.0  38654 1288.741 1106  5646.857 6935.598
>>> 14400.0 13824.0  0.0    0.0   517888.0 517888.0 1092288.0  1060014.0
>>> 37376.0 37182.0  38654 1288.741 1106  5646.857 6935.598
>>> 14400.0 13824.0  0.0    0.0   517888.0 517888.0 1092288.0  1060014.0
>>> 37376.0 37182.0  38654 1288.741 1106  5646.857 6935.598
>>> 14400.0 13824.0  0.0    0.0   517888.0 517888.0 1092288.0  1060014.0
>>> 37376.0 37182.0  38654 1288.741 1106  5646.857 6935.598
>>> 14400.0 13824.0  0.0    0.0   517888.0 517888.0 1092288.0  1060014.0
>>> 37376.0 37182.0  38654 1288.741 1106  5646.857 6935.598
>>> ...
>>> ...
>>> ...
>>> 14400.0 13824.0  0.0    0.0   517888.0 517888.0 1092288.0  1060014.1
>>> 37376.0 37182.0  38654 1288.741 1107  5656.903 6945.644
>>> 14400.0 13824.0  0.0    0.0   517888.0 517888.0 1092288.0  1060014.1
>>> 37376.0 37182.0  38654 1288.741 1107  5656.903 6945.644
>>> 14400.0 13824.0  0.0    0.0   517888.0 517888.0 1092288.0  1060014.1
>>> 37376.0 37182.0  38654 1288.741 1108  5666.547 6955.287
>>> 14400.0 13824.0  0.0    0.0   517888.0 517888.0 1092288.0  1060014.1
>>> 37376.0 37182.0  38654 1288.741 1108  5666.547 6955.287
>>> 14400.0 13824.0  0.0    0.0   517888.0 517888.0 1092288.0  1060014.1
>>> 37376.0 37182.0  38654 1288.741 1108  5666.547 6955.287
>>> 14400.0 13824.0  0.0    0.0   517888.0 517888.0 1092288.0  1060014.1
>>> 37376.0 37182.0  38654 1288.741 1108  5666.547 6955.287
>>> 14400.0 13824.0  0.0    0.0   517888.0 517888.0 1092288.0  1060014.1
>>> 37376.0 37182.0  38654 1288.741 1108  5666.547 6955.287
>>> 14400.0 13824.0  0.0    0.0   517888.0 517888.0 1092288.0  1060014.1
>>> 37376.0 37182.0  38654 1288.741 1108  5666.547 6955.287
>>> 14400.0 13824.0  0.0    0.0   517888.0 517888.0 1092288.0  1060014.1
>>> 37376.0 37182.0  38654 1288.741 1108  5666.547 6955.287
>>> 14400.0 13824.0  0.0    0.0   517888.0 517888.0 1092288.0  1060014.1
>>> 37376.0 37182.0  38654 1288.741 1108  5666.547 6955.287
>>> 14400.0 13824.0  0.0    0.0   517888.0 517888.0 1092288.0  1060014.1
>>> 37376.0 37182.0  38654 1288.741 1108  5666.547 6955.287
>>> 14400.0 13824.0  0.0    0.0   517888.0 517888.0 1092288.0  1060014.1
>>> 37376.0 37182.0  38654 1288.741 1109  5675.945 6964.686
>>> 14400.0 13824.0  0.0    0.0   517888.0 517888.0 1092288.0  1060014.1
>>> 37376.0 37182.0  38654 1288.741 1109  5675.945 6964.686
>>> 14400.0 13824.0  0.0    0.0   517888.0 517888.0 1092288.0  1060014.1
>>> 37376.0 37182.0  38654 1288.741 1109  5675.945 6964.686
>>> 14400.0 13824.0  0.0    0.0   517888.0 517888.0 1092288.0  1060014.1
>>> 37376.0 37182.0  38654 1288.741 1109  5675.945 6964.686
>>> 14400.0 13824.0  0.0    0.0   517888.0 517888.0 1092288.0  1060014.1
>>> 37376.0 37182.0  38654 1288.741 1109  5675.945 6964.686
>>> 14400.0 13824.0  0.0    0.0   517888.0 517888.0 1092288.0  1060014.1
>>> 37376.0 37182.0  38654 1288.741 1109  5675.945 6964.686
>>> 14400.0 13824.0  0.0    0.0   517888.0 517888.0 1092288.0  1060014.1
>>> 37376.0 37182.0  38654 1288.741 1109  5675.945 6964.686
>>> 14400.0 13824.0  0.0    0.0   517888.0 517888.0 1092288.0  1060014.1
>>> 37376.0 37182.0  38654 1288.741 1109  5675.945 6964.686
>>>
>>> It looks like the perm gen is full (~99%)
>>>
>>> The process doesn't recover on its own. Here's some other gc output
>>> from our logs:
>>>
>>>
>>> Heap
>>>  PSYoungGen      total 531712K, used 517887K [0x92d80000, 0xb42d0000,
>>> 0xb42d0000)
>>>  eden space 517888K, 99% used [0x92d80000,0xb273fff8,0xb2740000)
>>>  from space 13824K, 0% used [0xb2740000,0xb2740000,0xb34c0000)
>>>  to   space 14400K, 0% used [0xb34c0000,0xb34c0000,0xb42d0000)
>>>  PSOldGen        total 1092288K, used 1060042K [0x502d0000,
>>> 0x92d80000, 0x92d80000)
>>>  object space 1092288K, 97% used [0x502d0000,0x90e02a20,0x92d80000)
>>>  PSPermGen       total 37376K, used 37186K [0x4c2d0000, 0x4e750000,
>>> 0x502d0000)
>>>  object space 37376K, 99% used [0x4c2d0000,0x4e720b60,0x4e750000)
>>>
>>> Nov 9, 2010 9:29:48 AM sun.rmi.transport.tcp.TCPTransport$AcceptLoop
>>> executeAcceptLoop
>>> WARNING: RMI TCP Accept-0: accept loop for
>>> ServerSocket[addr=0.0.0.0/0.0.0.0,port=0,localport=36857] throws
>>> java.lang.OutOfMemoryError: GC overhead limit exceeded
>>>        at sun.rmi.runtime.NewThreadAction.run(NewThreadAction.java:116)
>>>        at sun.rmi.runtime.NewThreadAction.run(NewThreadAction.java:34)
>>>        at java.security.AccessController.doPrivileged(Native Method)
>>>        at
>>> sun.rmi.transport.tcp.TCPTransport$1.newThread(TCPTransport.java:94)
>>>        at
>>> java.util.concurrent.ThreadPoolExecutor.addThread(ThreadPoolExecutor.java:672)
>>>        at
>>> java.util.concurrent.ThreadPoolExecutor.addIfUnderMaximumPoolSize(ThreadPoolExecutor.java:721)
>>>        at
>>> java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:657)
>>>        at
>>> sun.rmi.transport.tcp.TCPTransport$AcceptLoop.executeAcceptLoop(TCPTransport.java:384)
>>>        at
>>> sun.rmi.transport.tcp.TCPTransport$AcceptLoop.run(TCPTransport.java:341)
>>>        at java.lang.Thread.run(Thread.java:619)
>>>
>>>       
>>>> From jmap I can see that most of the unreachable objects are
>>>>         
>>> LinkedHashMap.Entry objects that I know are created as part of many
>>> requests.
>>>
>>> Darragh
>>>
>>> On Wed, Nov 10, 2010 at 5:03 PM, Jon Masamitsu <jon.masamitsu at oracle.com>
>>> wrote:
>>>       
>>>>  I assume that the Java process does not recover on it's
>>>> own and has to be killed?
>>>>
>>>> What are the column heading for your jstat output?
>>>>
>>>> Turn on GC logging (if you don't already have it on) and check to
>>>> see if your perm gen is full.
>>>>
>>>> If your perm gen is not full but the Java heap appears to be
>>>> full, then the garbage collector just thinks that all that data
>>>> is live.  You used jmap.  Do you know what's filling up
>>>> the heap?
>>>>
>>>>
>>>>
>>>> On 11/10/2010 7:47 AM, Darragh Curran wrote:
>>>>         
>>>>> Hi,
>>>>>
>>>>> I hope someone can help me understand this better.
>>>>>
>>>>> I'm running java version "Java(TM) SE Runtime Environment (build
>>>>> 1.6.0_19-b04) Java HotSpot(TM) Server VM (build 16.2-b04, mixed mode)"
>>>>>
>>>>> It runs tomcat, with jvm heap options '-Xmx1600m -Xms1600m'
>>>>>
>>>>> Every few weeks a host becomes unavailable for requests. When I look
>>>>> into it, the java process is using 100% CPU, seems to have no running
>>>>> threads when I do multiple stack dumps, and based on jstat output
>>>>> appears to spend all it's time doing full gc runs:
>>>>>
>>>>> 14400.0 13824.0  0.0    0.0   517888.0 517888.0 1092288.0  1060014.1
>>>>> 37376.0 37182.0  38654 1288.741 1107  5656.903 6945.644
>>>>> 14400.0 13824.0  0.0    0.0   517888.0 517888.0 1092288.0  1060014.1
>>>>> 37376.0 37182.0  38654 1288.741 1108  5666.547 6955.287
>>>>> 14400.0 13824.0  0.0    0.0   517888.0 517888.0 1092288.0  1060014.1
>>>>> 37376.0 37182.0  38654 1288.741 1108  5666.547 6955.287
>>>>> 14400.0 13824.0  0.0    0.0   517888.0 517888.0 1092288.0  1060014.1
>>>>> 37376.0 37182.0  38654 1288.741 1108  5666.547 6955.287
>>>>> 14400.0 13824.0  0.0    0.0   517888.0 517888.0 1092288.0  1060014.1
>>>>> 37376.0 37182.0  38654 1288.741 1108  5666.547 6955.287
>>>>> 14400.0 13824.0  0.0    0.0   517888.0 517888.0 1092288.0  1060014.1
>>>>> 37376.0 37182.0  38654 1288.741 1108  5666.547 6955.287
>>>>> 14400.0 13824.0  0.0    0.0   517888.0 517888.0 1092288.0  1060014.1
>>>>> 37376.0 37182.0  38654 1288.741 1108  5666.547 6955.287
>>>>> 14400.0 13824.0  0.0    0.0   517888.0 517888.0 1092288.0  1060014.1
>>>>> 37376.0 37182.0  38654 1288.741 1108  5666.547 6955.287
>>>>> 14400.0 13824.0  0.0    0.0   517888.0 517888.0 1092288.0  1060014.1
>>>>> 37376.0 37182.0  38654 1288.741 1108  5666.547 6955.287
>>>>> 14400.0 13824.0  0.0    0.0   517888.0 517888.0 1092288.0  1060014.1
>>>>> 37376.0 37182.0  38654 1288.741 1108  5666.547 6955.287
>>>>> 14400.0 13824.0  0.0    0.0   517888.0 517888.0 1092288.0  1060014.1
>>>>> 37376.0 37182.0  38654 1288.741 1109  5675.945 6964.686
>>>>> 14400.0 13824.0  0.0    0.0   517888.0 517888.0 1092288.0  1060014.1
>>>>> 37376.0 37182.0  38654 1288.741 1109  5675.945 6964.686
>>>>>
>>>>> I used jmap to dump the heap for analysis. When I analyse it I see
>>>>> that 94% of objects are unreachable, but not yet collected.
>>>>>
>>>>>           
>>>>>> From jstat it appears that gc is running roughly every 10 seconds and
>>>>>>             
>>>>> lasting approx 10 seconds, but fails to free any memory.
>>>>>
>>>>> I'd really appreciate some advise on how to better understand my
>>>>> problem and what to do to try and fix it.
>>>>>
>>>>> Thanks,
>>>>> Darragh
>>>>> _______________________________________________
>>>>> hotspot-gc-use mailing list
>>>>> hotspot-gc-use at openjdk.java.net
>>>>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>>>>>           
>>>> _______________________________________________
>>>> hotspot-gc-use mailing list
>>>> hotspot-gc-use at openjdk.java.net
>>>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>>>>
>>>>         
>>> _______________________________________________
>>> hotspot-gc-use mailing list
>>> hotspot-gc-use at openjdk.java.net
>>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>>>       
_______________________________________________
hotspot-gc-use mailing list
hotspot-gc-use at openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use



More information about the hotspot-gc-dev mailing list