Hung JVM consuming 100% CPU

Tue Mar 1 13:12:40 PST 2011

On 03/01/11 09:26, Charles K Pepperdine wrote:
> Tenured is 524288K, of which 478087K is occupied: greater than 90%. I've
> recently been poking about in the source, trying to sort out how the
> logs are printed. I've seen that partial fragments of the log are
> printed, and that the " [CMS" fragment is printed only after a CMS
> collection has been triggered. So the problem is bounded by the log
> messages (sans a stray pointer bug).
>
> My question is, shouldn't a 90% occupancy of tenured trigger a full GC?

The CMS concurrent collections are fast from what I've seen (on the order of
the time between ParNew collections).  The rate at which objects are getting
promoted is also low (maybe 3m per ParNew collection).  CMS thinks it
can wait to start a concurrent collection.  The fact that a "promotion failure"
happened makes it look like fragmentation.
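
As a quick check of the arithmetic behind the occupancy figures quoted above, here is a minimal sketch. The class and method names are made up for illustration, and the 92% threshold is only an assumed example value; the real decision in HotSpot also depends on collection statistics, so a fixed percentage is a simplification.

```java
public class TenuredOccupancy {

    /** Returns old-generation occupancy as a percentage of its capacity. */
    static double occupancyPercent(long usedKb, long capacityKb) {
        return 100.0 * usedKb / capacityKb;
    }

    /** True if occupancy has reached the given threshold (in percent). */
    static boolean wouldInitiate(long usedKb, long capacityKb, double thresholdPercent) {
        return occupancyPercent(usedKb, capacityKb) >= thresholdPercent;
    }

    public static void main(String[] args) {
        long usedKb = 478087;     // occupied tenured space, from the quoted message
        long capacityKb = 524288; // tenured capacity, from the quoted message

        System.out.printf("occupancy = %.1f%%%n", occupancyPercent(usedKb, capacityKb));

        // 92.0 is a hypothetical threshold chosen for illustration only.
        System.out.println("reached threshold: " + wouldInitiate(usedKb, capacityKb, 92.0));
    }
}
```

With these numbers the occupancy comes out a little over 91%, so it is above 90% but would still sit below a threshold set slightly higher than that.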

>
> Regards,
> Kirk
>
>
> On Mar 1, 2011, at 5:29 PM, Jon Masamitsu wrote:
>
>> For 6692906 to be the problem there needs to be a
>> CMS concurrent phase in progress (marking, precleaning or
>> sweeping) and a minor collection running (with
>> UseParNewGC in use).  From the fragment of the gc log
>> I could not tell for sure (maybe it was in the ... removed),
>> but I don't think a concurrent phase was in progress,
>> so I would say it is not 6692906.  Did you try
>> -XX:-UseParNewGC as was suggested?  Your minor
>> pauses are not particularly long so maybe you
>> could afford to try it.  6692906 will not happen
>> without UseParNewGC.   Note you need to turn off
>> UseParNewGC as it is the default for CMS.
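
One way to confirm whether the flag took effect is to ask the running JVM which collectors it registered. The sketch below uses the standard `java.lang.management` API; the class name is made up, and the collector names it checks ("ParNew" for the parallel young collector, with something else such as "Copy" when it is disabled) are my assumption about how HotSpot labels them, so verify against your own JVM's output.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

public class ActiveCollectors {

    /** Returns the names of the garbage collectors registered in this JVM. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    /** True if the list contains the name assumed to denote ParNew. */
    static boolean usesParNew(List<String> names) {
        return names.contains("ParNew");
    }

    public static void main(String[] args) {
        List<String> names = collectorNames();
        System.out.println("Collectors: " + names);
        System.out.println("ParNew in use: " + usesParNew(names));
    }
}
```

Run it with the same GC options as the application (for example with and without -XX:-UseParNewGC) and compare the two outputs.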
>>
>> Look back through the log for any other
>> ParNew (promotion failed) and see what happens
>> in those cases (if you find one).  2+ hours is too
>> long.
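
Searching the log for earlier failures can be scripted. This is a minimal sketch, assuming the log is a plain-text file passed as the first argument and that each failure line contains the literal text "promotion failed"; the class and method names are made up for illustration, and it needs Java 8 or later for `Files.readAllLines(Path)`.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class PromotionFailureScan {

    /** Returns the 1-based numbers of the lines that mention a promotion failure. */
    static List<Integer> findPromotionFailures(List<String> logLines) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < logLines.size(); i++) {
            if (logLines.get(i).contains("promotion failed")) {
                hits.add(i + 1);
            }
        }
        return hits;
    }

    public static void main(String[] args) throws Exception {
        // args[0] is the path to the GC log file (an assumption of this sketch).
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        for (int lineNo : findPromotionFailures(lines)) {
            // Print the matching line plus the next two lines for context.
            int end = Math.min(lines.size(), lineNo + 2);
            for (int i = lineNo; i <= end; i++) {
                System.out.println(i + ": " + lines.get(i - 1));
            }
            System.out.println("----");
        }
    }
}
```

The lines printed after each match show how long the collection following that failure took, which can then be compared against the 2+ hour case.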
>>
>> The gentleman who would know best about this code
>> is out of the office until the end of the week.  I'll talk
>> to him about this to see if he remembers a recent
>> fix that I don't.
>>
>>
>> On 03/01/11 01:38, Bogdan Dimitriu wrote:
>>> Hi guys,
>>>
>>> We're having a problem with garbage collection as described here:
>>> http://forums.oracle.com/forums/message.jspa?messageID=9345173 (I
>>> apologise if posting links is not the right policy, but I prefer not to
>>> duplicate data).
>>>
>>> We are going to try an upgrade to JRE 6u24 soon, but reading the release
>>> notes for each of the versions since 6u20, I don't have much hope of
>>> this upgrade fixing the problem.
>>>
>>> I have searched a bit on the Java bugs database and I've come across
>>> something that looks similar to the problem I am experiencing:
>>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6692906, but I'm not
>>> convinced this is exactly the same issue. This bug, it seems, will be
>>> fixed in 6u25 (which I've read will be released in late March or early
>>> April).
>>>
>>> The reason I'm leaning towards thinking this is a JVM bug is the fact
>>> that the JVM can stay in the hung state (as described on the forum) for
>>> 2+ hours until we kill the process.
>>>
>>> I was hoping to get an idea about this from the source :), so any hints
>>> will be greatly appreciated.
>>>
>>> Thanks,
>>> Bogdan
>>> _______________________________________________
>>> hotspot-gc-use mailing list
>>> hotspot-gc-use at openjdk.java.net <mailto:hotspot-gc-use at openjdk.java.net>
>>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>