Strange long ParNew collections

Jon Masamitsu Jon.Masamitsu at Sun.COM
Thu May 7 09:10:22 PDT 2009


Kaj,

iCMS may have worked better for you because it might
kick off a collection earlier.  But iCMS was intended for
use on 1 or 2 processor machines, where doing the work
concurrently would hog one (maybe the only) processor,
so the collector would not feel very concurrent to the
application.  I think your use of the lower initiating
occupancy would have gotten you the result you needed.
iCMS generally runs longer and leaves more floating
garbage (dead objects that are not recognized as
garbage until the next collection).  I'd suggest that you
do some testing without the incremental mode, with
larger survivor spaces and a larger tenuring threshold,
and see how that does.  Concurrent mode failures
can often be reduced when the survivor spaces are
used.
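A sketch of what that experiment might look like in your wrapper configuration (the property indices and the SurvivorRatio/MaxTenuringThreshold values below are illustrative guesses to be validated against your own logs, not tuned recommendations):

```properties
# Sketch only: disable incremental mode and give the young generation
# real survivor spaces.  Replace the existing incremental-mode entries:
wrapper.java.additional.19="-XX:-CMSIncrementalMode"
wrapper.java.additional.20="-XX:-CMSIncrementalPacing"

# Flip the existing PrintTenuringDistribution entry (currently disabled)
# so the log shows how many bytes survive at each object age:
wrapper.java.additional.14="-XX:+PrintTenuringDistribution"

# New entries: carve survivor spaces out of the 1000m young generation
# and let objects age before promotion.  These values are guesses.
wrapper.java.additional.22="-XX:SurvivorRatio=6"
wrapper.java.additional.23="-XX:MaxTenuringThreshold=15"
```

With PrintTenuringDistribution enabled, the log shows how much data survives each age, which tells you whether the threshold and survivor sizes are actually letting short-lived objects die in the young generation instead of being promoted.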

Jon

Kaj Nygren wrote:

>Hi Jon,
>
>Thanks for your input!
>
>Well, it was around a year ago that we turned on incremental mode, but 
>if memory serves me right we had problems with getting "Concurrent Mode 
>Failure" rather often (probably CMS kicking in too late and not keeping up).
>
>It is an 8 processor machine but the application is keeping it rather 
>busy. We searched around for possible solutions and found several 
>sources that recommended running iCMS, and tuning the threshold to start 
>CMS GC earlier. We turned on iCMS and that helped. The application has 
>changed a bit since then, so maybe that's not needed anymore. (This 
>might have been a bogus change on our part; GC internals are still a 
>bit of voodoo for me, although I've tried to read up as much as I can 
>about it :)
>
>We will start by specifying a SurvivorRatio and a TenuringThreshold, and 
>see if that helps. After that, if it's still needed, I'll try increasing 
>the spaces as well.
>
>/Kaj
>
>
>Jon Masamitsu skrev:
>  
>
>>Kaj,
>>
>>I don't see anything in the logs that points to
>>a specific problem.
>>
>>Why are you using CMSIncrementalMode?
>>
>>By default the CMS collector promotes all
>>objects that survive 1 minor collection
>>(into the tenured generation).  You might
>>find it useful to use larger survivor
>>spaces and a larger tenuring threshold.
>>See
>>
>>http://blogs.sun.com/jonthecollector/entry/the_fault_with_defaults
>>
>>for some suggestions of values to try.  You might also benefit
>>from a larger young generation.  Have you tried larger values?
>>2g? Or maybe even 4g?
>>
>>Jon
>>
>>On 05/06/09 05:06, Kaj Nygren wrote:
>>  
>>    
>>
>>>Hi,
>>>
>>>We are having problems with occasional long ParNew pauses in our 
>>>application running on a Jboss server. The server runs well for most of 
>>>the time but sometimes we get ParNew collections that can take up to 120 
>>>seconds. Most of the time we have several consecutive slow collections, 
>>>which are more or less killing the server under high load.
>>>
>>>We are using JDK 1.5.0_16 on Windows Server 2003, and are using ParNew + 
>>>Concurrent CMS.
>>>
>>>Our GC configuration is:
>>>
>>>wrapper.java.additional.1=-Dprogram.name=run.bat
>>>wrapper.java.additional.2=-server
>>>wrapper.java.additional.3=-Xms19000m
>>>wrapper.java.additional.4=-Xmx19000m
>>>wrapper.java.additional.5="-XX:NewSize=1000m"
>>>wrapper.java.additional.6="-XX:MaxNewSize=1000m"
>>>wrapper.java.additional.7="-XX:MaxPermSize=256m"
>>>wrapper.java.additional.8="-XX:PermSize=256m"
>>>wrapper.java.additional.11="-Xloggc:c:\mygclog.gc"
>>>wrapper.java.additional.12="-XX:+UseConcMarkSweepGC"
>>>wrapper.java.additional.13="-XX:+PrintGCDetails"
>>>wrapper.java.additional.14="-XX:-PrintTenuringDistribution"
>>>wrapper.java.additional.15="-XX:+PrintHeapAtGC"
>>>wrapper.java.additional.16="-XX:+DisableExplicitGC"
>>>wrapper.java.additional.17="-XX:+PrintGCTimeStamps"
>>>wrapper.java.additional.18="-XX:CMSInitiatingOccupancyFraction=50"
>>>wrapper.java.additional.19="-XX:+CMSIncrementalMode"
>>>wrapper.java.additional.20="-XX:+CMSIncrementalPacing"
>>>wrapper.java.additional.21="-verbose:gc"
>>>
>>>Usually ParNew is quick as can be seen here:
>>>
>>>614276,653: [ParNew: 935058K->0K(1023040K), 0,0624699 secs] 
>>>12383872K->11459948K(19455040K) icms_dc=22 Heap after gc invocations=57822:
>>>par new generation   total 1023040K, used 0K [0x000000007fff0000, 
>>>0x00000000be7f0000, 0x00000000be7f0000)
>>> eden space 1022080K,   0% used [0x000000007fff0000, 0x000000007fff0000, 
>>>0x00000000be610000)
>>> from space 960K,   0% used [0x00000000be700000, 0x00000000be700000, 
>>>0x00000000be7f0000)
>>> to   space 960K,   0% used [0x00000000be610000, 0x00000000be610000, 
>>>0x00000000be700000)
>>>concurrent mark-sweep generation total 18432000K, used 11459948K 
>>>[0x00000000be7f0000, 0x00000005237f0000, 0x00000005237f0000)
>>>concurrent-mark-sweep perm gen total 262144K, used 112069K 
>>>[0x00000005237f0000, 0x00000005337f0000, 0x00000005337f0000)
>>>}
>>>, 0,0628860 secs]
>>>
>>>But suddenly it becomes really slow:
>>>
>>>620659,717: [ParNew: 1022080K->0K(1023040K), 1,3313817 secs] 
>>>12688913K->11692925K(19455040K) icms_dc=10 Heap after gc invocations=58587:
>>>par new generation   total 1023040K, used 0K [0x000000007fff0000, 
>>>0x00000000be7f0000, 0x00000000be7f0000)
>>> eden space 1022080K,   0% used [0x000000007fff0000, 0x000000007fff0000, 
>>>0x00000000be610000)
>>> from space 960K,   0% used [0x00000000be610000, 0x00000000be610000, 
>>>0x00000000be700000)
>>> to   space 960K,   0% used [0x00000000be700000, 0x00000000be700000, 
>>>0x00000000be7f0000)
>>>concurrent mark-sweep generation total 18432000K, used 11692925K 
>>>[0x00000000be7f0000, 0x00000005237f0000, 0x00000005237f0000)
>>>concurrent-mark-sweep perm gen total 262144K, used 112098K 
>>>[0x00000005237f0000, 0x00000005337f0000, 0x00000005337f0000)
>>>}
>>>, 1,3319554 secs]
>>>620663,221: [GC {Heap before gc invocations=58587:
>>>par new generation   total 1023040K, used 1022080K [0x000000007fff0000, 
>>>0x00000000be7f0000, 0x00000000be7f0000)
>>> eden space 1022080K, 100% used [0x000000007fff0000, 0x00000000be610000, 
>>>0x00000000be610000)
>>> from space 960K,   0% used [0x00000000be610000, 0x00000000be610000, 
>>>0x00000000be700000)
>>> to   space 960K,   0% used [0x00000000be700000, 0x00000000be700000, 
>>>0x00000000be7f0000)
>>>concurrent mark-sweep generation total 18432000K, used 11692925K 
>>>[0x00000000be7f0000, 0x00000005237f0000, 0x00000005237f0000)
>>>concurrent-mark-sweep perm gen total 262144K, used 112106K 
>>>[0x00000005237f0000, 0x00000005337f0000, 0x00000005337f0000)
>>>620663,222: [ParNew: 1022080K->0K(1023040K), 10,2909893 secs] 
>>>12715005K->11936864K(19455040K) icms_dc=10 Heap after gc invocations=58588:
>>>par new generation   total 1023040K, used 0K [0x000000007fff0000, 
>>>0x00000000be7f0000, 0x00000000be7f0000)
>>> eden space 1022080K,   0% used [0x000000007fff0000, 0x000000007fff0000, 
>>>0x00000000be610000)
>>> from space 960K,   0% used [0x00000000be700000, 0x00000000be700000, 
>>>0x00000000be7f0000)
>>> to   space 960K,   0% used [0x00000000be610000, 0x00000000be610000, 
>>>0x00000000be700000)
>>>concurrent mark-sweep generation total 18432000K, used 11936864K 
>>>[0x00000000be7f0000, 0x00000005237f0000, 0x00000005237f0000)
>>>concurrent-mark-sweep perm gen total 262144K, used 112106K 
>>>[0x00000005237f0000, 0x00000005337f0000, 0x00000005337f0000)
>>>}
>>>, 10,2915548 secs]
>>>620676,392: [GC {Heap before gc invocations=58588:
>>>par new generation   total 1023040K, used 1022080K [0x000000007fff0000, 
>>>0x00000000be7f0000, 0x00000000be7f0000)
>>> eden space 1022080K, 100% used [0x000000007fff0000, 0x00000000be610000, 
>>>0x00000000be610000)
>>> from space 960K,   0% used [0x00000000be700000, 0x00000000be700000, 
>>>0x00000000be7f0000)
>>> to   space 960K,   0% used [0x00000000be610000, 0x00000000be610000, 
>>>0x00000000be700000)
>>>concurrent mark-sweep generation total 18432000K, used 11936864K 
>>>[0x00000000be7f0000, 0x00000005237f0000, 0x00000005237f0000)
>>>concurrent-mark-sweep perm gen total 262144K, used 112106K 
>>>[0x00000005237f0000, 0x00000005337f0000, 0x00000005337f0000)
>>>620676,392: [ParNew: 1022080K->0K(1023040K), 3,8905950 secs] 
>>>12958944K->12020225K(19455040K) icms_dc=10 Heap after gc invocations=58589:
>>>par new generation   total 1023040K, used 0K [0x000000007fff0000, 
>>>0x00000000be7f0000, 0x00000000be7f0000)
>>> eden space 1022080K,   0% used [0x000000007fff0000, 0x000000007fff0000, 
>>>0x00000000be610000)
>>> from space 960K,   0% used [0x00000000be610000, 0x00000000be610000, 
>>>0x00000000be700000)
>>> to   space 960K,   0% used [0x00000000be700000, 0x00000000be700000, 
>>>0x00000000be7f0000)
>>>concurrent mark-sweep generation total 18432000K, used 12020225K 
>>>[0x00000000be7f0000, 0x00000005237f0000, 0x00000005237f0000)
>>>concurrent-mark-sweep perm gen total 262144K, used 112106K 
>>>[0x00000005237f0000, 0x00000005337f0000, 0x00000005337f0000)
>>>}
>>>, 3,8911113 secs]
>>>620682,667: [GC {Heap before gc invocations=58589:
>>>par new generation   total 1023040K, used 1022080K [0x000000007fff0000, 
>>>0x00000000be7f0000, 0x00000000be7f0000)
>>> eden space 1022080K, 100% used [0x000000007fff0000, 0x00000000be610000, 
>>>0x00000000be610000)
>>> from space 960K,   0% used [0x00000000be610000, 0x00000000be610000, 
>>>0x00000000be700000)
>>> to   space 960K,   0% used [0x00000000be700000, 0x00000000be700000, 
>>>0x00000000be7f0000)
>>>concurrent mark-sweep generation total 18432000K, used 12020225K 
>>>[0x00000000be7f0000, 0x00000005237f0000, 0x00000005237f0000)
>>>concurrent-mark-sweep perm gen total 262144K, used 112112K 
>>>[0x00000005237f0000, 0x00000005337f0000, 0x00000005337f0000)
>>>620682,668: [ParNew: 1022080K->0K(1023040K), 9,4186088 secs] 
>>>13042305K->12214571K(19455040K) icms_dc=10 Heap after gc invocations=58590:
>>>par new generation   total 1023040K, used 0K [0x000000007fff0000, 
>>>0x00000000be7f0000, 0x00000000be7f0000)
>>> eden space 1022080K,   0% used [0x000000007fff0000, 0x000000007fff0000, 
>>>0x00000000be610000)
>>> from space 960K,   0% used [0x00000000be700000, 0x00000000be700000, 
>>>0x00000000be7f0000)
>>> to   space 960K,   0% used [0x00000000be610000, 0x00000000be610000, 
>>>0x00000000be700000)
>>>concurrent mark-sweep generation total 18432000K, used 12214571K 
>>>[0x00000000be7f0000, 0x00000005237f0000, 0x00000005237f0000)
>>>concurrent-mark-sweep perm gen total 262144K, used 112112K 
>>>[0x00000005237f0000, 0x00000005337f0000, 0x00000005337f0000)
>>>}
>>>, 9,4191551 secs]
>>>
>>>We also have problems with occasional "promotion failed" events, which 
>>>I assume are due to fragmentation.
>>>I have attached a longer log, which shows this behaviour and ends with 
>>>a promotion failure.
>>>
>>>Does anybody have a clue as to what is going wrong, and how we can 
>>>change our GC configuration to reduce the problems?
>>>Any help is greatly appreciated!
>>>
>>>Regards
>>>
>>>Kaj Nygren
>>>kaj at mag.se
>>>
>>>
>>>------------------------------------------------------------------------
>>>
>>>_______________________________________________
>>>hotspot-gc-use mailing list
>>>hotspot-gc-use at openjdk.java.net
>>>http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>


